Research


We study computer systems and architecture, with a focus on improving AI performance, optimizing cloud applications, and developing next-generation memory systems. We also work on efficient power management and proactive architectural security.

Research Topics:

  1. Artificial Intelligence Systems & Architecture
  2. Cloud Computing & Application Optimization
  3. Next-Generation Memory Systems & Architecture
  4. Power/Resource Management for Energy Efficiency of Data-center Servers
  5. Security Vulnerabilities in Computer Systems

Artificial Intelligence Systems & Architecture


Our primary goal is to maximize the performance, scalability, and efficiency of large language models (e.g., GPT, LLaMA) and recommendation systems (e.g., DLRM). We are exploring the Near Memory Processing paradigm to reduce data movement bottlenecks and increase system throughput. We are also optimizing the execution of Machine Learning (ML) inference across GPU and CPU resources to improve speed and responsiveness. Our ultimate objective is to design AI systems that combine robust computational power with high efficiency and performance across a wide range of applications and scenarios.

Cloud Computing & Application Optimization


Our research focuses on improving the efficiency of cloud-based systems and applications. A key area of interest is the architecture and optimization of microservice-based applications, where we pursue architectural and software-level approaches to improve performance, scalability, and resource utilization. Another significant effort is an integrated management framework designed to ensure Service Level Objective (SLO) guarantees, so that service quality and performance are maintained across diverse cloud services. We also optimize the performance of machine learning inference servers, using careful analysis and design to enable rapid and accurate inference across various applications. Our research extends to server infrastructures for AI model training: by tuning server configurations, resource allocation, and distributed processing techniques, we aim to accelerate the training of complex AI models. Ultimately, our goal is to develop solutions that streamline application deployment, management, and performance while addressing the unique challenges of AI and machine learning workloads.

Publications

  1. VIP: Virtual Performance-State for Efficient Power Management of Virtual Machines
  2. Janus: supporting heterogeneous power management in virtualized environments
  3. Virtual Snooping Coherence for Multi-Core Virtualized Systems
  4. vCache: Architectural support for transparent and isolated virtual LLCs in virtualized environments
  5. vCache: Providing a Transparent View of the LLC in Virtualized Environments
  6. Virtual Snooping: Filtering Snoops in Virtualized Multi-cores

Next-Generation Memory Systems & Architecture


We are committed to advancing memory technologies and their associated architectural solutions. A key focus of our work is holistic, disaggregated memory management optimized for memory-centric computing paradigms. By addressing the challenges of distributed memory resources, we aim to ensure efficient memory utilization and seamless data access across interconnected memory units. We are also developing Compute Express Link (CXL)-based multi-tiered memory systems for memory-intensive applications, combining high-speed and high-capacity memory units to meet the demands of data-intensive workloads. Our research further extends the CXL hardware architecture to accommodate hardware extensions: by designing flexible and extensible hardware frameworks, we seek to integrate novel memory technologies in a way that supports future scalability. Finally, we are exploring heterogeneous memory management using techniques such as memory deduplication and compression, which reduce redundancy and increase the effective capacity of memory resources. Our objective is to provide solutions that support memory-intensive computing and efficient data processing on future memory systems and architectures.
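As a rough illustration of the memory deduplication idea mentioned above, the following Python sketch merges pages with identical contents into a single shared frame. It is a minimal sketch under stated assumptions, not code from the publications listed on this page: the function name `deduplicate`, the use of SHA-256 as the content hash, and the list-based "frames" are all hypothetical choices made for the example.

```python
import hashlib


def deduplicate(pages):
    """Map each logical page to a shared frame when the contents are identical.

    Hypothetical helper for illustration. Returns (frames, mapping), where
    frames holds the unique page contents and mapping[i] is the index of the
    frame that backs logical page i.
    """
    frames = []            # unique page contents, standing in for physical frames
    index_by_digest = {}   # content digest -> indices of frames with that digest
    mapping = []           # logical page number -> frame index

    for page in pages:
        digest = hashlib.sha256(page).digest()
        frame_id = None
        # A matching digest is only a hint; confirm with a full byte comparison.
        for candidate in index_by_digest.get(digest, []):
            if frames[candidate] == page:
                frame_id = candidate
                break
        if frame_id is None:
            frame_id = len(frames)
            frames.append(page)
            index_by_digest.setdefault(digest, []).append(frame_id)
        mapping.append(frame_id)

    return frames, mapping


if __name__ == "__main__":
    pages = [bytes(4096), b"\x01" * 4096, bytes(4096)]
    frames, mapping = deduplicate(pages)
    print(len(pages), "logical pages backed by", len(frames), "frames:", mapping)
```

A real system would additionally need copy-on-write handling when a shared page is modified, which this sketch omits.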

Publications

  1. Exploiting OS-Level Memory Offlining for DRAM Power Management
  2. Application-Transparent Near-Memory Processing Architecture with Memory Channel Network
  3. Virtual Snooping Coherence for Multi-Core Virtualized Systems
  4. vCache: Architectural support for transparent and isolated virtual LLCs in virtualized environments
  5. vCache: Providing a Transparent View of the LLC in Virtualized Environments
  6. Subspace Snooping: Exploiting Temporal Sharing Stability for Snoop Reduction
  7. Virtual Snooping: Filtering Snoops in Virtualized Multi-cores
  8. Subspace Snooping: Filtering Snoops with Operating System Support

Power/Resource Management for Energy Efficiency of Data-center Servers


Our research in power and energy management focuses on strategies to optimize power consumption and energy efficiency in computing environments. A key aspect of our work involves Dynamic Voltage and Frequency Scaling (DVFS). Through careful analysis and experimentation, we aim to identify voltage and frequency configurations that balance performance and energy consumption, enabling systems to adapt resource usage to workload variations. We are also developing robust power and energy metering approaches for consolidated virtual machines. By accurately measuring power usage at the virtual machine level, we enable precise monitoring and management of energy consumption in multi-tenant virtualized environments. Our research also explores hardware/software co-design techniques that ensure Quality-of-Service (QoS) guarantees while keeping power and energy consumption low. This includes algorithms, protocols, and mechanisms that dynamically allocate resources to meet performance targets while minimizing energy use. Overall, our goal is to contribute to energy-efficient computing systems that deliver high performance, in line with the growing demand for environmentally conscious and economically viable computing.
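The performance/energy trade-off behind DVFS can be sketched with a simple model. The Python example below assumes the textbook approximation that dynamic power scales as C·V²·f and adds a constant static power term; it then picks the operating point with the lowest energy that still meets a deadline. The names `OperatingPoint`, `energy_joules`, and `pick_operating_point`, as well as the default constants, are hypothetical and chosen only for illustration. This is not the method of any publication listed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One voltage/frequency pair (hypothetical type for this sketch)."""
    freq_hz: float
    voltage: float


def energy_joules(op, cycles, c_eff, p_static):
    """Energy to run `cycles` cycles at `op` under a simple power model."""
    runtime_s = cycles / op.freq_hz
    p_dynamic = c_eff * op.voltage ** 2 * op.freq_hz  # assumed P = C * V^2 * f
    return (p_dynamic + p_static) * runtime_s


def pick_operating_point(points, cycles, deadline_s, c_eff=1e-9, p_static=0.5):
    """Return the lowest-energy point that finishes within the deadline.

    If no point meets the deadline, fall back to the highest frequency.
    The default c_eff and p_static values are illustrative assumptions.
    """
    feasible = [op for op in points if cycles / op.freq_hz <= deadline_s]
    if not feasible:
        return max(points, key=lambda op: op.freq_hz)
    return min(feasible, key=lambda op: energy_joules(op, cycles, c_eff, p_static))


if __name__ == "__main__":
    points = [
        OperatingPoint(1e9, 0.8),
        OperatingPoint(2e9, 1.0),
        OperatingPoint(3e9, 1.2),
    ]
    print(pick_operating_point(points, cycles=2e9, deadline_s=1.5))
```

Because static power is paid for the whole runtime, the lowest frequency is not always the most energy-efficient choice in this model; the selection depends on the relative size of the static and dynamic terms.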

Publications

  1. Co-Adjusting Voltage/Frequency State and Interrupt Rate for Improving Energy-Efficiency of Latency-Critical Applications
  2. Exploiting OS-Level Memory Offlining for DRAM Power Management
  3. Network Packet Processing Mode-Aware Power Management for Data Center Servers
  4. Application-Transparent Near-Memory Processing Architecture with Memory Channel Network
  5. VIP: Virtual Performance-State for Efficient Power Management of Virtual Machines
  6. Janus: supporting heterogeneous power management in virtualized environments
  7. NCAP: Network-Driven, Packet Context-Aware Power Management for Client-Server Architecture

Security Vulnerabilities in Computer Systems


We aim to advance security measures and strategies at the architectural level to safeguard computing environments against various threats. A significant part of our work focuses on studying architectural side-channel attacks and developing detection and mitigation techniques. We explore vulnerabilities such as Spectre and Meltdown, aiming to understand their underlying mechanisms and create effective countermeasures against leakage of sensitive information. We are also working to strengthen memory integrity. One key area of our research is the Row Hammer phenomenon, in which repeatedly activating a DRAM row can flip bits in physically adjacent rows and corrupt data. By investigating this vulnerability, we seek to design memory protection mechanisms that prevent such attacks and preserve data integrity. Ultimately, our goal is to strengthen the foundations of computer systems and architectures so that they resist a diverse range of existing and emerging security threats.
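One commonly discussed class of Row Hammer defenses counts row activations and refreshes the neighbors of a row that is activated too often. The Python sketch below illustrates that general idea only; it is not a mechanism from the publication listed in this section. The class name `ActivationTracker`, the threshold value, and the assumption that rows `row - 1` and `row + 1` are the physical neighbors are all hypothetical simplifications.

```python
from collections import Counter


class ActivationTracker:
    """Counter-based Row Hammer mitigation sketch (hypothetical, simplified)."""

    def __init__(self, threshold, num_rows):
        self.threshold = threshold  # activations allowed before neighbors are refreshed
        self.num_rows = num_rows
        self.counts = Counter()     # per-row activation counts in the current window

    def activate(self, row):
        """Record one activation and return the neighbor rows to refresh, if any."""
        self.counts[row] += 1
        if self.counts[row] < self.threshold:
            return []
        # Threshold reached: reset the counter and request a neighbor refresh.
        self.counts[row] = 0
        # Assumes logical adjacency matches physical adjacency, which real
        # DRAM address mappings do not guarantee.
        return [r for r in (row - 1, row + 1) if 0 <= r < self.num_rows]

    def end_refresh_window(self):
        """Clear all counters once every row has been refreshed normally."""
        self.counts.clear()


if __name__ == "__main__":
    tracker = ActivationTracker(threshold=3, num_rows=8)
    for _ in range(3):
        victims = tracker.activate(5)
    print("rows to refresh:", victims)
```

Keeping one counter per row is expensive at DRAM scale, so practical designs usually approximate these counts; the sketch ignores that cost.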

Publications

  1. Defending Against Flush+Reload Attack With DRAM Cache by Bypassing Shared SRAM Cache