2026-03-13
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL | Amos Goldman, Nimrod Boker, Maayan Sheraizin, Nimrod Admoni, Artem Polyakov, Subhadeep Bhattacharya, Fan Yu, Kai Sun, Georgios Theodorakis, Hsin-Chun Yin, Peter-Jan Gootzen, Aamir Shafi, Assaf Ravid, Salvatore Di Girolamo, James Dinan, Xiaofan Li, Manjunath Gorentla Venkata, Gil Bloch | 2026-03-13 | Download | Mixture-of-Experts (MoE) architectures have become essential for scaling large language models, driving the development of specialized device-initiated communication libraries such as DeepEP, Hybrid-E... |
| Breaking the Tuning Barrier: Zero-Hyperparameters Yield Multi-Corner Analysis Via Learned Priors | Wei W. Xing, Kaiqi Huang, Jiazhan Liu, Hong Qiu, Shan Shen | 2026-03-13 | Download | Yield Multi-Corner Analysis validates circuits across 25+ Process-Voltage-Temperature corners, resulting in a combinatorial simulation cost that grows with the number of corners... |
| DRCY: Agentic Hardware Design Reviews | Kyle Dumont, Nicholas Herbert, Hayder Tirmazi, Shrikanth Upadhayaya | 2026-03-13 | Download | Hardware design errors discovered after fabrication require costly physical respins that can delay products by months. Existing electronic design automation (EDA) tools enforce structural connectivity... |
| OpenACMv2: An Accuracy-Constrained Co-Optimization Framework for Approximate DCiM | Yiqi Zhou, Yue Yuan, Yikai Wang, Bohao Liu, Qinxin Mei, Zhuohua Liu, Shan Shen, Wei Xing, Daying Sun, Li Li, Guozhu Liu | 2026-03-13 | Download | Digital Compute-in-Memory (DCiM) accelerates neural networks by reducing data movement. Approximate DCiM can further improve power-performance-area (PPA), but demands accuracy-constrained co-optimizat... |
| CellE: Automated Standard Cell Library Extension via Equality Saturation | Yi Ren, Yukun Wang, Xiang Meng, Guoyao Cheng, Baokang Peng, Lining Zhang, Yibo Lin, Runsheng Wang, Guangyu Sun | 2026-03-13 | Download | Automated standard cell library extension is crucial for maximizing Quality of Results (QoR) in modern VLSI design. We introduce CellE, a novel framework that leverages formal methods to achieve exhau... |
| SRAM-Based Compute-in-Memory Accelerator for Linear-decay Spiking Neural Networks | Hongyang Shang, Shuai Dong, Yahan Yang, Junyi Yang, Peng Zhou, Arindam Basu | 2026-03-13 | Download | Spiking Neural Networks (SNNs) have emerged as a biologically inspired alternative to conventional deep networks, offering event-driven and energy-efficient computation. |
| ExpanderGraph-128: A Novel Graph-Theoretic Block Cipher with Formal Security Analysis and Hardware Implementation | W. A. Susantha Wijesinghe | 2026-03-13 | Download | Lightweight block cipher design has largely focused on incremental optimization of established paradigms such as substitution-permutation networks, Feistel structures, and ARX constructions, where se... |
| Dynamic Sparse Attention: Access Patterns and Architecture | Noam Levy | 2026-03-13 | Download | Dynamic sparse attention (DSA) reduces the per-token attention bandwidth by restricting computation to a top-k subset of cached key-value (KV) entries, but its token-dependent selection pattern introd... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL | Amos Goldman, Nimrod Boker, Maayan Sheraizin, Nimrod Admoni, Artem Polyakov, Subhadeep Bhattacharya, Fan Yu, Kai Sun, Georgios Theodorakis, Hsin-Chun Yin, Peter-Jan Gootzen, Aamir Shafi, Assaf Ravid, Salvatore Di Girolamo, James Dinan, Xiaofan Li, Manjunath Gorentla Venkata, Gil Bloch | 2026-03-13 | Download | Mixture-of-Experts (MoE) architectures have become essential for scaling large language models, driving the development of specialized device-initiated communication libraries such as DeepEP, Hybrid-E... |
| A common parallel framework for LLP combinatorial problems | David Ribeiro Alves, Vijay K. Garg | 2026-03-13 | Download | Traditional lock-free parallel algorithms for combinatorial optimization problems, such as shortest paths, stable matching, and job scheduling require programmers to write problem-specific routines an... |
| Federated Few-Shot Learning on Neuromorphic Hardware: An Empirical Study Across Physical Edge Nodes | Steven Motta, Gioele Nanni | 2026-03-13 | Download | Federated learning on neuromorphic hardware remains unexplored because on-chip spike-timing-dependent plasticity (STDP) produces binary weight updates rather than the floating-point gradients assumed ... |
| ARL-Tangram: Unleash the Resource Efficiency in Agentic Reinforcement Learning | Bangjun Xiao, Yihao Zhao, Xiangwei Deng, Shihua Yu, Yuxing Xiang, Huaqiu Liu, Qiying Wang, Liang Zhao, Hailin Zhang, Xuanzhe Liu, Xin Jin, Fuli Luo | 2026-03-13 | Download | Agentic reinforcement learning (RL) has emerged as a transformative workload in cloud clusters, enabling large language models (LLMs) to solve complex problems through interactions with the real world. |
| A New Kernel Regularity Condition for Distributed Mirror Descent: Broader Coverage and Simpler Analysis | Junwen Qiu, Ziyang Zeng, Leilei Mei, Junyu Zhang | 2026-03-13 | Download | Existing convergence analyses of distributed optimization methods in non-Euclidean geometries typically rely on kernel assumptions: (i) global Lipschitz smoothness and (ii) bi-convexity of the associated Bregm... |
| Serving Hybrid LLM Loads with SLO Guarantees Using CPU-GPU Attention Piggybacking | Zizhao Mo, Junlin Chen, Huanle Xu, Chengzhong Xu | 2026-03-13 | Download | Service providers often deploy multiple types of LLM services within shared clusters. While service colocation improves resource utilization, it introduces significant interference risks... |
| Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity | Donglin Yu | 2026-03-13 | Download | Multimodal large language model (MLLM) inference splits into two phases with opposing hardware demands: vision encoding is compute-bound, while language generation is memory-bandwidth-bound. |
| Federated Hierarchical Clustering with Automatic Selection of Optimal Cluster Numbers | Yue Zhang, Chuanlong Qiu, Xinfa Liao, Yiqun Zhang | 2026-03-13 | Download | Federated Clustering (FC) is an emerging and promising solution for exploring data distribution patterns in distributed, privacy-protected data in an unsupervised manner. |
| Streaming REST APIs for Large Financial Transaction Exports from Relational Databases | Abhiram Kandiraju | 2026-03-13 | Download | Financial platforms and enterprise systems frequently provide transaction export capabilities to support reporting, reconciliation, auditing, and regulatory compliance workflows. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Linnaeus: A Hierarchical, Multi-Label Framework for Autonomous System Classification | Marcos Piotto, Ignacio Schuemer, Santiago T. Torres, Mariano G. Beiró, Esteban Carisimo, Fabián E. Bustamante | 2026-03-13 | Download | Autonomous systems (ASes) play diverse roles in today's Internet, from community and research backbones to hyperscale content providers and submarine-cable operators. |
| PLUME: Building a Network-Native Foundation Model for Wireless Traces via Protocol-Aware Tokenization | Swadhin Pradhan, Shazal Irshad, Jerome Henry | 2026-03-13 | Download | Foundation models succeed when they learn in the native structure of a modality, whether morphology-respecting tokens in language or pixels in vision. |
| SAIL: Unsupervised Spatial-Angular Interpretable Feature Learning for RF Map Synthesis | Sopan Sarkar, Marwan Krunz | 2026-03-13 | Download | In wireless networks, radio-frequency (RF) maps are critical for tasks such as capacity planning, coverage estimation, and localization. Traditional approaches for obtaining RF maps, including site su... |
| AI-Enabled Bit-Mapping Medium Access Control Protocol for Intelligent and Energy-Efficient IoT Networks | Jesmine Damilola Omonori, Iyanu Tomiwa Durotola, Godspower Paul Osilama | 2026-03-13 | Download | Energy-efficient medium access control (MAC) protocols remain critical in resource-constrained Wireless Sensor Networks (WSNs) and IoT deployments, especially under mixed traffic patterns that combine... |
| End-to-End O-RAN Testbed for Edge-AI-Enabled 5G/6G Connected Industrial Robotics | Sasa Talosi, Vladimir Vincan, Srdjan Sobot, Goran Martic, Vladimir Morosev, Vukan Ninkovic, Dragisa Miskovic, Dejan Vukobratovic | 2026-03-13 | Download | Connected robotics is one of the principal use cases driving the transition towards more intelligent and capable 6G mobile cellular networks. Replacing wired connections with highly reliable, high-thr... |
| Goal-Oriented Learning at the Edge: Graph Neural Networks Over-the-Air for Blockage Prediction | Lorenzo Mario Amorosa, Zhan Gao, Tony Chahoud, Yiqun Wu, Lukas Eller, Marco Skocaj, Roberto Verdone | 2026-03-13 | Download | Sixth-generation (6G) wireless networks evolve from connecting devices to connecting intelligence. The focus turns to Goal-Oriented Communications, where the effectiveness of communication is assessed... |
| Semantic-Aware 6G Network Management through Knowledge-Defined Networking | Tuğçe Bilen, Ian F. Akyildiz | 2026-03-13 | Download | Semantic communication is emerging as a key paradigm for 6G networks, where the goal is not to perfectly reconstruct bits but to preserve the meaning that matters for a given task. |
| HyGra: Accelerating Network-State Simulation for LLM Training in DCNs via Adaptive Packet-Flow Granularity | Wenyi Wang, Zheng Wu, Yanmeng Wang, Haolin Mao, Lei Han, Gaogang Xie, Fu Xiao | 2026-03-13 | Download | In recent years, large language models (LLMs) have driven substantial intelligent transformation across diverse industries. Commercial LLM training is typically performed over data center networks (DC... |
| Evaluation of TCP Congestion Control for Public High-Performance Wide-Area Networks | Fatih Berkay Sarpkaya, Andrea Francini, Bilgehan Erman, Shivendra Panwar | 2026-03-13 | Download | Practitioners of a growing number of scientific and artificial-intelligence (AI) applications use High-Performance Wide-Area Networks (HP-WANs) for moving massive data sets between remote facilities. |
| A Standards-Aligned Coordination Framework for Edge-Enhanced Collaborative Healthcare in 6G Networks | Liuwang Kang, Fan Wang, Yuzhang Huang, Shang Yan, Jianbin Zheng, Wenbin Lei, Konstantin Yakovlev, Jie Tang, Shaoshan Liu | 2026-03-13 | Download | Mission-critical healthcare applications including real-time intensive care monitoring, ambulance-to-hospital orchestration, and distributed medical imaging inference require workflow-level, time-boun... |
| OFDM Waveform for Monostatic ISAC in 6G: Vision, Approach, and Research Directions | Huacheng Zeng, Kunzhe Song, Geo Jie Zhou, Ruxin Lin | 2026-03-13 | Download | Integrated sensing and communication (ISAC) is widely regarded as a key enabling technology for 6G wireless networks. While extensive research has explored the coexistence of sensing and communication... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| AgentRM: An OS-Inspired Resource Manager for LLM Agent Systems | Jianshu She | 2026-03-13 | Download | Large Language Model (LLM) agent systems have experienced rapid adoption across diverse domains, yet they suffer from critical user experience problems that limit their practical deployment. |