
2026-04-12

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| The xPU-athalon: Quantifying the Competition of AI Acceleration | Alicia Golden, Carole-Jean Wu, Gu-Yeon Wei, David Brooks | 2026-04-12 | Download | The push for greater efficiency in AI computation has given rise to an array of accelerator architectures that increasingly challenge the GPU's long-standing dominance. |
| Harnessing Photonics for Machine Intelligence | Hanqing Zhu, Shupeng Ning, Hongjian Zhou, Ziang Yin, Ray T. Chen, Jiaqi Gu, David Z. Pan | 2026-04-12 | Download | The exponential growth of machine-intelligence workloads is colliding with the power, memory, and interconnect limits of the post-Moore era, motivating compute substrates that scale beyond transistor ... |
| EMSpice 3: Full-chip Temperature-Aware Multiphysics Electromigration and IR-Drop Analysis | Haotian Lu, Sheldon X.-D. Tan | 2026-04-12 | Download | In this work, we present EMSpice 3, a full-chip temperature-aware multiphysics framework for coupled electromigration (EM), thermomigration (TM), and IR-drop analysis of power-grid networks. |
| L-PCN: A Point Cloud Accelerator Exploiting Spatial Locality through Octree-based Islandization | Yiming Gao, Jieming Yin, Yuxiang Wang, Xiangru Chen, Zhilei Chai, Bowen Jiang, Jiliang Zhang, Herman Lam | 2026-04-12 | Download | Existing Point Cloud Networks (PCNs) have proven to achieve great success in many point cloud tasks such as object part segmentation, shape classification, and so on. |
| From Characterization to Microarchitecture: Designing an Elegant and Reliable BFP-Based NPU | Jie Zhang, Jiapeng Guan, Hao Zhou, Xiaomeng Han, Tinglue Wang, Ran Wei, Zhe Jiang | 2026-04-12 | Download | Block Floating-Point (BFP) is emerging as an attractive data format for edge Neural Processing Units (NPUs), combining wide dynamic range with high hardware efficiency. |
| Strix: Re-thinking NPU Reliability from a System Perspective | Jiapeng Guan, Jie Zhang, Hao Zhou, Ran Wei, Dean You, Hui Wang, Yingquan Wang, Tinglue Wang, Xudong Zhao, Jing Li, Zhe Jiang | 2026-04-12 | Download | DNNs and LLMs increasingly rely on hardware accelerators, including in safety-critical domains, while technology scaling and growing model complexity make hardware faults more frequent. |
| LLM-PRISM: Characterizing Silent Data Corruption from Permanent GPU Faults in LLM Training | Abhishek Tyagi, Saurabh Hukerikar, Nirmal Saxena, Yanxiang Huang, Philip Shirvani, Chung-Hsuan Tung, Yuhao Zhu | 2026-04-12 | Download | Large-scale LLM training is increasingly susceptible to hardware defects stemming from manufacturing escapes and silicon aging. These defects manifest as Silent Data Corruption (SDC) that perturb grad... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Understanding Communication Backends in Cross-Silo Federated Learning | Amir Ziashahabi, Chaoyang He, Salman Avestimehr | 2026-04-12 | Download | Federated learning (FL) has emerged as a practical means for privacy-preserving distributed machine learning. FL's versatile design makes it suitable for various training settings, from IoT edge devic... |
| Workload composition smooths aggregate power demand while sustaining short-horizon ramps in AI data centers | Subir Majumder, Minlan Yu, Le Xie | 2026-04-12 | Download | Artificial intelligence (AI) is driving rapid growth in electricity demand, yet the grid-facing power dynamics of AI data centers remain poorly understood. |
| Bipartite matching under communication constraints | Moonmoon Mohanty, Gautham Bolar, Preetam Patil, Ayalvadi Ganesh, Jean-Francois Chamberland, Parimal Parag | 2026-04-12 | Download | In modern data center networks, thousands of hosts contend for shared link capacity; the scale of these systems makes centralized scheduling impractical. |
| COD-ssi: Enforcing Mutual Privacy for Credential Oblivious Disclosure in Self Sovereign Identity | Elia Onofri, Andrea De Salve, Paolo Mori, Laura Emilia Maria Ricci, Roberto Di Pietro | 2026-04-12 | Download | The Self-Sovereign Identity (SSI) paradigm is instrumental for decentralised identity management, allowing an entity to create, manage, and present their digital credentials without relying on central... |
| FEDBUD: Joint Incentive and Privacy Optimization for Resource-Constrained Federated Learning | Tao Liu, Xuehe Wang | 2026-04-12 | Download | Federated learning has become a popular paradigm for privacy protection and edge-based machine learning. However, defending against differential attacks and devising incentive strategies remain signif... |
| CIR: Lightweight Container Image for Cross-Platform Deployment | Fengzhi Li, Xiaohui Peng, Qingru Xu, Qisong Shi, Tuo Zhou, Yongxuan Dai, Yifan Wang, Ninghui Sun, Zhiwei Xu | 2026-04-12 | Download | In modern cloud and heterogeneous distributed infrastructures, container images are widely used as the deployment unit for machine learning applications. |
| Leveraging Mathematical Reasoning of LLMs for Efficient GPU Thread Mapping | Jose Maureira, Cristóbal A. Navarro, Hector Ferrada, Luis Veas-Castillo | 2026-04-12 | Download | Mapping parallel threads onto non-box-shaped domains is a known challenge in GPU computing that, if done efficiently, can prevent severe performance penalties from allocating unnecessary computational... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Bipartite matching under communication constraints | Moonmoon Mohanty, Gautham Bolar, Preetam Patil, Ayalvadi Ganesh, Jean-Francois Chamberland, Parimal Parag | 2026-04-12 | Download | In modern data center networks, thousands of hosts contend for shared link capacity; the scale of these systems makes centralized scheduling impractical. |
