2025-04-28
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| 3D MPSoC with On-Chip Cache Support -- Design and Exploitation | Rodrigo Cataldo, Cesar Marcon, Debora Matos | 2025-04-28 | Download | The increasing density of transistors in Integrated Circuits (ICs) has enabled the development of highly integrated Systems-on-Chip (SoCs) and, more recently, Multiprocessor Systems-on-Chip (MPSoCs). |
| From Concept to Practice: an Automated LLM-aided UVM Machine for RTL Verification | Junhao Ye, Yuchen Hu, Ke Xu, Dingrong Pan, Qichun Chen, Jie Zhou, Shuai Zhao, Xinwei Fang, Xi Wang, Nan Guan, Zhe Jiang | 2025-04-28 | Download | Verification presents a major bottleneck in Integrated Circuit (IC) development, consuming nearly 70% of the total development effort. While the Universal Verification Methodology (UVM) is widely used... |
| FoldedHexaTorus: An Inter-Chiplet Interconnect Topology for Chiplet-based Systems using Organic and Glass Substrates | Patrick Iff, Maciej Besta, Torsten Hoefler | 2025-04-28 | Download | Chiplet-based systems are rapidly gaining traction in the market. Two packaging options for such systems are the established organic substrates and the emerging glass substrates. |
| Dynamic Tsetlin Machine Accelerators for On-Chip Training at the Edge using FPGAs | Gang Mao, Tousif Rahman, Sidharth Maheshwari, Bob Pattison, Zhuang Shao, Rishad Shafik, Alex Yakovlev | 2025-04-28 | Download | The increased demand for data privacy and security in machine learning (ML) applications has put impetus on effective edge training on Internet-of-Things (IoT) nodes. |
| FineQ: Software-Hardware Co-Design for Low-Bit Fine-Grained Mixed-Precision Quantization of LLMs | Xilong Xie, Liang Wang, Limin Xiao, Meng Han, Lin Sun, Shuai Zheng, Xiangrong Xu | 2025-04-28 | Download | Large language models (LLMs) have significantly advanced the natural language processing paradigm but impose substantial demands on memory and computational resources. |
| Hardware/Software Co-Design of RISC-V Extensions for Accelerating Sparse DNNs on FPGAs | Muhammad Sabih, Abrarul Karim, Jakob Wittmann, Frank Hannig, Jürgen Teich | 2025-04-28 | Download | The customizability of RISC-V makes it an attractive choice for accelerating deep neural networks (DNNs). It can be achieved through instruction set extensions and corresponding custom functional unit... |
| Intelligent4DSE: Optimizing High-Level Synthesis Design Space Exploration with Graph Neural Networks and Large Language Models | Lei Xu, Shanshan Wang, Emmanuel Casseau, Chenglong Xiao | 2025-04-28 | Download | High-Level Synthesis (HLS) Design Space Exploration (DSE) is essential for generating hardware designs that balance performance, power, and area (PPA). |
| ChipletQuake: On-die Digital Impedance Sensing for Chiplet and Interposer Verification | Saleh Khalaj Monfared, Maryam Saadat Safa, Shahin Tajik | 2025-04-28 | Download | The increasing complexity and cost of manufacturing monolithic chips have driven the semiconductor industry toward chiplet-based designs, where smaller and modular chiplets are integrated onto a singl... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| SoK: A Survey of Mixing Techniques and Mixers for Cryptocurrencies | Juraj Mariani, Ivan Homoliak | 2025-04-28 | Download | Blockchain technologies have overturned the digital finance industry by introducing a decentralized pseudonymous means of monetary transfer. The pseudonymous nature introduced privacy concerns, enabli... |
| Leveraging Neural Graph Compilers in Machine Learning Research for Edge-Cloud Systems | Alireza Furutanpey, Carmen Walser, Philipp Raith, Pantelis A. Frangoudis, Schahram Dustdar | 2025-04-28 | Download | This work presents a comprehensive evaluation of neural network graph compilers across heterogeneous hardware platforms, addressing the critical gap between theoretical optimization techniques and pra... |
| Cosmos: A Cost Model for Serverless Workflows in the 3D Compute Continuum | Cynthia Marcelino, Sebastian Gollhofer-Berger, Thomas Pusztai, Stefan Nastic | 2025-04-28 | Download | Due to the high scalability, infrastructure management, and pay-per-use pricing model, serverless computing has been adopted in a wide range of applications such as real-time data processing, IoT, and... |
| Network-Aware Scheduling for Remote Gate Execution in Quantum Data Centers | Shahrooz Pouryousef, Reza Nejabati, Don Towsley, Ramana Kompella, Eneet Kaur | 2025-04-28 | Download | Modular quantum computing provides a scalable approach to overcome the limitations of monolithic quantum architectures by interconnecting multiple Quantum Processing Units (QPUs) through a quantum net... |
| SYMI: Efficient Mixture-of-Experts Training via Model and Optimizer State Decoupling | Athinagoras Skiadopoulos, Mark Zhao, Swapnil Gandhi, Thomas Norrie, Shrijeet Mukherjee, Christos Kozyrakis | 2025-04-28 | Download | Mixture-of-Experts (MoE) models have become a widely-adopted solution to continue scaling model sizes without a corresponding linear increase in compute. |
| semi-PD: Towards Efficient LLM Serving via Phase-Wise Disaggregated Computation and Unified Storage | Ke Hong, Lufang Chen, Zhong Wang, Xiuhong Li, Qiuli Mao, Jianping Ma, Chao Xiong, Guanyu Wu, Buhe Han, Guohao Dai, Yun Liang, Yu Wang | 2025-04-28 | Download | Existing large language model (LLM) serving systems fall into two categories: 1) a unified system where prefill phase and decode phase are co-located on the same GPU, sharing the unified computational... |
| Taming the Titans: A Survey of Efficient LLM Inference Serving | Ranran Zhen, Juntao Li, Yixin Ji, Zhenlin Yang, Tong Liu, Qingrong Xia, Xinyu Duan, Zhefeng Wang, Baoxing Huai, Min Zhang | 2025-04-28 | Download | Large Language Models (LLMs) for Generative AI have achieved remarkable progress, evolving into sophisticated and versatile tools widely adopted across various domains and applications. |
| Efficient and Adaptable Overlapping for Computation and Communication via Signaling and Reordering | Ke Hong, Xiuhong Li, Minxu Liu, Qiuli Mao, Tianqi Wu, Zixiao Huang, Lufang Chen, Zhong Wang, Yichong Zhang, Zhenhua Zhu, Guohao Dai, Yu Wang | 2025-04-28 | Download | Generative models have achieved remarkable success across various applications, driving the demand for multi-GPU computing. Inter-GPU communication becomes a bottleneck in multi-GPU computing systems,... |
| Boosting LLM Serving through Spatial-Temporal GPU Resource Sharing | Zejia Lin, Hongxin Xu, Guanyi Chen, Zhiguang Chen, Yutong Lu, Xianwei Zhang | 2025-04-28 | Download | Modern LLM serving systems confront inefficient GPU utilization due to the fundamental mismatch between compute-intensive prefill and memory-bound decode phases. |
| Adjusted Objects: An Efficient and Principled Approach to Scalable Programming (Extended Version) | Boubacar Kane, Pierre Sutra | 2025-04-28 | Download | Parallel programs require software support to coordinate access to shared data. For this purpose, modern programming languages provide strongly-consistent shared objects. |
| Triton-distributed: Programming Overlapping Kernels on Distributed AI Systems with the Triton Compiler | Size Zheng, Wenlei Bao, Qi Hou, Xuegui Zheng, Jin Fang, Chenhui Huang, Tianqi Li, Haojie Duanmu, Renze Chen, Ruifan Xu, Yifan Guo, Ningxin Zheng, Ziheng Jiang, Xinyi Di, Dongyang Wang, Jianxi Ye, Haibin Lin, Li-Wen Chang, Liqiang Lu, Yun Liang, Jidong Zhai, Xin Liu | 2025-04-28 | Download | In this report, we propose Triton-distributed, an extension of existing Triton compiler, to overcome the programming challenges in distributed AI systems. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A Virtual Cybersecurity Department for Securing Digital Twins in Water Distribution Systems | Mohammadhossein Homaei, Agustin Di Bartolo, Oscar Mogollon-Gutierrez, Fernando Broncano Morgado, Pablo Garcia Rodriguez | 2025-04-28 | Download | Digital twins (DTs) help improve real-time monitoring and decision-making in water distribution systems. However, their connectivity makes them easy targets for cyberattacks such as scanning, denial-o... |
| Tree embedding based mapping system for low-latency mobile applications in multi-access networks | Yu Mi, Randeep Bhatia, Fang Hao, An Wang, Steve Benno, Tv Lakshman | 2025-04-28 | Download | Low-latency applications like AR/VR and online gaming need fast, stable connections. New technologies such as V2X, LEO satellites, and 6G bring unique challenges in mobility management. |
| Network-Aware Scheduling for Remote Gate Execution in Quantum Data Centers | Shahrooz Pouryousef, Reza Nejabati, Don Towsley, Ramana Kompella, Eneet Kaur | 2025-04-28 | Download | Modular quantum computing provides a scalable approach to overcome the limitations of monolithic quantum architectures by interconnecting multiple Quantum Processing Units (QPUs) through a quantum net... |
| Mixture of Experts for Decentralized Generative AI and Reinforcement Learning in Wireless Networks: A Comprehensive Survey | Yunting Xu, Jiacheng Wang, Ruichen Zhang, Changyuan Zhao, Dusit Niyato, Jiawen Kang, Zehui Xiong, Bo Qian, Haibo Zhou, Shiwen Mao, Abbas Jamalipour, Xuemin Shen, Dong In Kim | 2025-04-28 | Download | Mixture of Experts (MoE) has emerged as a promising paradigm for scaling model capacity while preserving computational efficiency, particularly in large-scale machine learning architectures such as la... |
| Automatic Configuration Protocols for Optical Quantum Networks | Amin Taherkhani, Andrew Todd, Kentaro Teramoto, Rodney Van Meter, Shota Nagayama | 2025-04-28 | Download | Before quantum networks can scale up to practical sizes, there are many deployment and configuration tasks that must be automated. Currently, quantum networking testbeds are largely manually configure... |
| Lifecycle Management of Optical Networks with Dynamic-Updating Digital Twin: A Hybrid Data-Driven and Physics-Informed Approach | Yuchen Song, Min Zhang, Yao Zhang, Yan Shi, Shikui Shen, Xiongyan Tang, Shanguo Huang, Danshi Wang | 2025-04-28 | Download | Digital twin (DT) techniques have been proposed for the autonomous operation and lifecycle management of next-generation optical networks. To fully utilize potential capacity and accommodate dynamic s... |
| Graph Reinforcement Learning for QoS-Aware Load Balancing in Open Radio Access Networks | Omid Semiari, Hosein Nikopour, Shilpa Talwar | 2025-04-28 | Download | Next-generation wireless cellular networks are expected to provide unparalleled Quality-of-Service (QoS) for emerging wireless applications, necessitating strict performance guarantees, e.g. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Ariel OS: An Embedded Rust Operating System for Networked Sensors & Multi-Core Microcontrollers | Elena Frank, Kaspar Schleiser, Romain Fouquet, Koen Zandberg, Christian Amsüss, Emmanuel Baccelli | 2025-04-28 | Download | Large swaths of low-level system software building blocks originally implemented in C/C++ are currently being swapped for equivalent rewrites in Rust, a relatively more secure and dependable programmi... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Network-Aware Scheduling for Remote Gate Execution in Quantum Data Centers | Shahrooz Pouryousef, Reza Nejabati, Don Towsley, Ramana Kompella, Eneet Kaur | 2025-04-28 | Download | Modular quantum computing provides a scalable approach to overcome the limitations of monolithic quantum architectures by interconnecting multiple Quantum Processing Units (QPUs) through a quantum net... |