SMOOTH: Hardware-Assisted Fine-Grained On-Chip Memory Management for Efficient On-Device LLM Inference

Seulki Kim, Bokyeong Kim, Kyeonghyeon Ryu, Yeji Jung, Hwanjun Lee, Sungju Kim, Yunhyeong Jeon, and Daehoon Kim*. IEEE/ACM International Symposium on Computer Architecture (ISCA), 2026

Abstract

Keywords

Accelerator, NPU, LLM inference.

Related Research Topics

Security Vulnerabilities in Computer Systems