Keynotes

Day 1, June 22 (Mon.), 2026, 9:00 - 10:30, Room: Grand Ballroom

Title: On Securing Networked Embedded/Cyber-Physical Systems

Abstract: There has been exponential growth in cyber-physical applications that rely on diverse types of embedded end-systems and devices, such as smart phones/watches/glasses, home appliances, consumer and industrial electronics, and smart sensors and actuators. These applications require diverse types of Quality-of-Service (QoS), including timeliness, dependability, security and privacy, from end-systems/devices that are usually networked together via heterogeneous networking technologies and protocols.

We now know how to guarantee timeliness and, to a lesser extent, how to provide fault-tolerance, on both end-systems and their interconnection networks. However, how to secure them is far less well understood, despite the growing importance of protecting information stored in the end-systems/devices and exchanged over their interconnection networks. Moreover, timeliness, fault-tolerance, security and privacy (which I call simply QoS) must be supported simultaneously, often under a tight budget for resources such as memory, computation and communication bandwidth, and battery power. Also, different applications require different combinations of QoS components, and hence one-size-fits-all solutions are not acceptable. This talk will cover issues in, and approaches to, securing networked embedded systems.

If time allows, I will discuss our work-in-progress on context-aware autonomous vehicles.

Bio: Kang G. Shin (Life Fellow, IEEE) is currently the Kevin & Nancy O’Connor Professor Emeritus of Computer Science with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor. His current research focuses on safe and secure embedded real-time and cyber-physical systems and QoS-sensitive computing and networking. He has supervised the completion of 93 Ph.D.s and authored/co-authored about 1000 technical articles, a textbook, and more than 60 patents or invention disclosures. He has received numerous awards, including Best Paper Awards from the 2000 and 2010 USENIX Annual Technical Conferences, 2023 VehicleSec, the 2011 ACM International Conference on Mobile Computing and Networking, and the 2011 IEEE International Conference on Autonomic Computing; the 2003 IEEE Communications Society William R. Bennett Prize Paper Award; the 1987 Outstanding IEEE Transactions on Automatic Control Paper Award; the 2019 Caspar Bowden Award for Outstanding Research in Privacy Enhancing Technologies; the 2023 IEEE TCCPS Technical Achievement Award; the 2023 SIGMOBILE Test-of-Time Award; and the 2026 IEEE TC on Distributed Processing (TCDP) Award for Outstanding Technical Achievement. He has also received several institutional awards, including the Research Excellence Award in 1989, the Outstanding Achievement Award in 1999, the Distinguished Faculty Achievement Award in 2001, and the Stephen Attwood Award in 2004 from the University of Michigan (the highest honor bestowed on Michigan Engineering faculty); a Distinguished Alumni Award of the College of Engineering, Seoul National University, in 2002; the 2003 IEEE RTC Technical Achievement Award; and the 2006 Ho-Am Prize in Engineering (the highest honor bestowed on Korean-origin engineers). He chaired the Michigan Computer Science and Engineering Division for four years starting in 1991 and has chaired several major conferences, including 2009 ACM MobiCom and 2005 ACM/USENIX MobiSys. He was a co-founder of a couple of startups, licensed some of his technologies to industry, and served as an Executive Advisor for Samsung Research.

Day 2, June 23 (Tue.), 2026, 9:00 - 10:30, Room: Grand Ballroom

Title: Quo Vadis, Parallel Computing Systems?

Abstract: Parallel computing is entering a new era, driven by the rapid rise of AI, and undergoing a significant transformation. Previously associated primarily with supercomputers and scientific workloads, it now serves as the foundation of contemporary AI infrastructure. Large-scale training, real-time inference, agentic systems, simulation, robotics, and physical AI require extensive parallelism across accelerators, memory, networks, storage, software stacks, and energy infrastructure.

This talk explores the future direction of parallel computing systems in the context of AI. We will argue that the next generation of systems will be defined by comprehensive full-stack co-design, rather than by incremental improvements in processor speed or cluster size. Industry initiatives such as NVIDIA’s AI factories and Google’s AI Hypercomputer exemplify this transition, demonstrating that chips, interconnects, runtimes, orchestration layers, models, applications, and power infrastructure must be designed and optimized together.

The talk will address major challenges, including heterogeneous accelerators, communication bottlenecks, scheduling, reliability, observability, data locality, energy efficiency, sustainability, and openness. It will also highlight opportunities in AI-native orchestration, compiler-runtime-system co-design, distributed inference, intelligent infrastructure management, and sustainable large-scale computing. Ultimately, in the AI era, parallel computing systems are evolving from specialized backend infrastructure into the foundational operating platform for future intelligent systems.

Bio: Jaejin Lee is a professor in the Department of Data Science/Graduate School of Data Science (Dean) and the Department of Computer Science and Engineering/College of Engineering at Seoul National University (SNU). He is also the director of the Center for Optimizing Hyperscale AI Models and Platforms (CHAMP) and the leader of the Thunder research group. He received his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign (UIUC) in 1999, an M.S. in Computer Science from Stanford University in 1995, and a B.S. in Physics from SNU in 1991. He is an IEEE Fellow, and his current research interests include programming systems for heterogeneous machines (GPUs and FPGAs), building Large Language Models (LLMs) and datasets, parallelization and optimization of LLM frameworks, and programming environments for quantum computers.

Title: Towards Memory-Efficient LLM Inference for On-device AI

Abstract: The deployment of Large Language Models (LLMs) on devices faces significant challenges due to their extensive memory requirements. This talk will introduce a series of works on memory-efficient LLM inference, from the perspective of model size as well as that of the KV cache. For example, Double Compression combines model compression (quantization and pruning) with lossless data compression, achieving a 2.2x compression ratio while keeping the drop in model accuracy within 1%. It optimizes weight distribution and employs adaptive decompression to balance memory usage and inference speed. As another example, FlexInfer leverages several advanced system techniques, such as prefetching and memory locking, to maximize memory efficiency and minimize I/O overhead. It achieves up to 12.5x faster inference under memory constraints compared to traditional methods. Together, these solutions enable memory-efficient LLM deployment on edge devices, bridging the gap between model size and hardware limitations. We will also present our recent project, ClawMobile, which runs efficient agents on mobile devices.

Bio: Prof. Chun Jason Xue is currently a professor of computer science at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi. His research focuses on memory and storage systems. He is currently an associate editor for ACM Transactions on Embedded Computing Systems, ACM Transactions on Cyber-Physical Systems, and ACM Transactions on Storage. He is a Distinguished Member of ACM and a Fellow of IEEE.