2026-03-13

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL | Amos Goldman, Nimrod Boker, Maayan Sheraizin, Nimrod Admoni, Artem Polyakov, Subhadeep Bhattacharya, Fan Yu, Kai Sun, Georgios Theodorakis, Hsin-Chun Yin, Peter-Jan Gootzen, Aamir Shafi, Assaf Ravid, Salvatore Di Girolamo, James Dinan, Xiaofan Li, Manjunath Gorentla Venkata, Gil Bloch | 2026-03-13 | Download | Mixture-of-Experts (MoE) architectures have become essential for scaling large language models, driving the development of specialized device-initiated communication libraries such as DeepEP, Hybrid-E... |
| Breaking the Tuning Barrier: Zero-Hyperparameters Yield Multi-Corner Analysis Via Learned Priors | Wei W. Xing, Kaiqi Huang, Jiazhan Liu, Hong Qiu, Shan Shen | 2026-03-13 | Download | Yield Multi-Corner Analysis validates circuits across 25+ Process-Voltage-Temperature corners, resulting in a combinatorial simulation cost of $O(K \times N)$ where $K$ denotes corners and $N$ exceeds... |
| DRCY: Agentic Hardware Design Reviews | Kyle Dumont, Nicholas Herbert, Hayder Tirmazi, Shrikanth Upadhayaya | 2026-03-13 | Download | Hardware design errors discovered after fabrication require costly physical respins that can delay products by months. Existing electronic design automation (EDA) tools enforce structural connectivity... |
| OpenACMv2: An Accuracy-Constrained Co-Optimization Framework for Approximate DCiM | Yiqi Zhou, Yue Yuan, Yikai Wang, Bohao Liu, Qinxin Mei, Zhuohua Liu, Shan Shen, Wei Xing, Daying Sun, Li Li, Guozhu Liu | 2026-03-13 | Download | Digital Compute-in-Memory (DCiM) accelerates neural networks by reducing data movement. Approximate DCiM can further improve power-performance-area (PPA), but demands accuracy-constrained co-optimizat... |
| CellE: Automated Standard Cell Library Extension via Equality Saturation | Yi Ren, Yukun Wang, Xiang Meng, Guoyao Cheng, Baokang Peng, Lining Zhang, Yibo Lin, Runsheng Wang, Guangyu Sun | 2026-03-13 | Download | Automated standard cell library extension is crucial for maximizing Quality of Results (QoR) in modern VLSI design. We introduce CellE, a novel framework that leverages formal methods to achieve exhau... |
| SRAM-Based Compute-in-Memory Accelerator for Linear-decay Spiking Neural Networks | Hongyang Shang, Shuai Dong, Yahan Yang, Junyi Yang, Peng Zhou, Arindam Basu | 2026-03-13 | Download | Spiking Neural Networks (SNNs) have emerged as a biologically inspired alternative to conventional deep networks, offering event-driven and energy-efficient computation. |
| ExpanderGraph-128: A Novel Graph-Theoretic Block Cipher with Formal Security Analysis and Hardware Implementation | W. A. Susantha Wijesinghe | 2026-03-13 | Download | Lightweight block cipher design has largely focused on incremental optimization of established paradigms such as substitution-permutation networks, Feistel structures, and ARX constructions, where se... |
| Dynamic Sparse Attention: Access Patterns and Architecture | Noam Levy | 2026-03-13 | Download | Dynamic sparse attention (DSA) reduces the per-token attention bandwidth by restricting computation to a top-k subset of cached key-value (KV) entries, but its token-dependent selection pattern introd... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL | Amos Goldman, Nimrod Boker, Maayan Sheraizin, Nimrod Admoni, Artem Polyakov, Subhadeep Bhattacharya, Fan Yu, Kai Sun, Georgios Theodorakis, Hsin-Chun Yin, Peter-Jan Gootzen, Aamir Shafi, Assaf Ravid, Salvatore Di Girolamo, James Dinan, Xiaofan Li, Manjunath Gorentla Venkata, Gil Bloch | 2026-03-13 | Download | Mixture-of-Experts (MoE) architectures have become essential for scaling large language models, driving the development of specialized device-initiated communication libraries such as DeepEP, Hybrid-E... |
| A common parallel framework for LLP combinatorial problems | David Ribeiro Alves, Vijay K. Garg | 2026-03-13 | Download | Traditional lock-free parallel algorithms for combinatorial optimization problems, such as shortest paths, stable matching, and job scheduling require programmers to write problem-specific routines an... |
| Federated Few-Shot Learning on Neuromorphic Hardware: An Empirical Study Across Physical Edge Nodes | Steven Motta, Gioele Nanni | 2026-03-13 | Download | Federated learning on neuromorphic hardware remains unexplored because on-chip spike-timing-dependent plasticity (STDP) produces binary weight updates rather than the floating-point gradients assumed ... |
| ARL-Tangram: Unleash the Resource Efficiency in Agentic Reinforcement Learning | Bangjun Xiao, Yihao Zhao, Xiangwei Deng, Shihua Yu, Yuxing Xiang, Huaqiu Liu, Qiying Wang, Liang Zhao, Hailin Zhang, Xuanzhe Liu, Xin Jin, Fuli Luo | 2026-03-13 | Download | Agentic reinforcement learning (RL) has emerged as a transformative workload in cloud clusters, enabling large language models (LLMs) to solve complex problems through interactions with real world. |
| A New Kernel Regularity Condition for Distributed Mirror Descent: Broader Coverage and Simpler Analysis | Junwen Qiu, Ziyang Zeng, Leilei Mei, Junyu Zhang | 2026-03-13 | Download | Existing convergence of distributed optimization methods in non-Euclidean geometries typically rely on kernel assumptions: (i) global Lipschitz smoothness and (ii) bi-convexity of the associated Bregm... |
| Serving Hybrid LLM Loads with SLO Guarantees Using CPU-GPU Attention Piggybacking | Zizhao Mo, Junlin Chen, Huanle Xu, Chengzhong Xu | 2026-03-13 | Download | Nowadays, service providers often deploy multiple types of LLM services within shared clusters. While the service colocation improves resource utilization, it introduces significant interference risks... |
| Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity | Donglin Yu | 2026-03-13 | Download | Multimodal large language model (MLLM) inference splits into two phases with opposing hardware demands: vision encoding is compute-bound, while language generation is memory-bandwidth-bound. |
| Federated Hierarchical Clustering with Automatic Selection of Optimal Cluster Numbers | Yue Zhang, Chuanlong Qiu, Xinfa Liao, Yiqun Zhang | 2026-03-13 | Download | Federated Clustering (FC) is an emerging and promising solution in exploring data distribution patterns from distributed and privacy-protected data in an unsupervised manner. |
| Streaming REST APIs for Large Financial Transaction Exports from Relational Databases | Abhiram Kandiraju | 2026-03-13 | Download | Financial platforms and enterprise systems frequently provide transaction export capabilities to support reporting, reconciliation, auditing, and regulatory compliance workflows. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Linnaeus: A Hierarchical, Multi-Label Framework for Autonomous System Classification | Marcos Piotto, Ignacio Schuemer, Santiago T. Torres, Mariano G. Beiró, Esteban Carisimo, Fabián E. Bustamante | 2026-03-13 | Download | Autonomous systems (ASes) play diverse roles in today's Internet, from community and research backbones to hyperscale content providers and submarine-cable operators. |
| PLUME: Building a Network-Native Foundation Model for Wireless Traces via Protocol-Aware Tokenization | Swadhin Pradhan, Shazal Irshad, Jerome Henry | 2026-03-13 | Download | Foundation models succeed when they learn in the native structure of a modality, whether morphology-respecting tokens in language or pixels in vision. |
| SAIL: Unsupervised Spatial-Angular Interpretable Feature Learning for RF Map Synthesis | Sopan Sarkar, Marwan Krunz | 2026-03-13 | Download | In wireless networks, radio-frequency (RF) maps are critical for tasks such as capacity planning, coverage estimation, and localization. Traditional approaches for obtaining RF maps, including site su... |
| AI-Enabled Bit-Mapping Medium Access Control Protocol for Intelligent and Energy-Efficient IoT Networks | Jesmine Damilola Omonori, Iyanu Tomiwa Durotola, Godspower Paul Osilama | 2026-03-13 | Download | Energy-efficient medium access control (MAC) protocols remain critical in resource-constrained Wireless Sensor Networks (WSNs) and IoT deployments, especially under mixed traffic patterns that combine... |
| End-to-End O-RAN Testbed for Edge-AI-Enabled 5G/6G Connected Industrial Robotics | Sasa Talosi, Vladimir Vincan, Srdjan Sobot, Goran Martic, Vladimir Morosev, Vukan Ninkovic, Dragisa Miskovic, Dejan Vukobratovic | 2026-03-13 | Download | Connected robotics is one of the principal use cases driving the transition towards more intelligent and capable 6G mobile cellular networks. Replacing wired connections with highly reliable, high-thr... |
| Goal-Oriented Learning at the Edge: Graph Neural Networks Over-the-Air for Blockage Prediction | Lorenzo Mario Amorosa, Zhan Gao, Tony Chahoud, Yiqun Wu, Lukas Eller, Marco Skocaj, Roberto Verdone | 2026-03-13 | Download | Sixth-generation (6G) wireless networks evolve from connecting devices to connecting intelligence. The focus turns to Goal-Oriented Communications, where the effectiveness of communication is assessed... |
| Semantic-Aware 6G Network Management through Knowledge-Defined Networking | Tuğçe Bilen, Ian F. Akyildiz | 2026-03-13 | Download | Semantic communication is emerging as a key paradigm for 6G networks, where the goal is not to perfectly reconstruct bits but to preserve the meaning that matters for a given task. |
| HyGra: Accelerating Network-State Simulation for LLM Training in DCNs via Adaptive Packet-Flow Granularity | Wenyi Wang, Zheng Wu, Yanmeng Wang, Haolin Mao, Lei Han, Gaogang Xie, Fu Xiao | 2026-03-13 | Download | In recent years, large language models (LLMs) have driven substantial intelligent transformation across diverse industries. Commercial LLM training is typically performed over data center networks (DC... |
| Evaluation of TCP Congestion Control for Public High-Performance Wide-Area Networks | Fatih Berkay Sarpkaya, Andrea Francini, Bilgehan Erman, Shivendra Panwar | 2026-03-13 | Download | Practitioners of a growing number of scientific and artificial-intelligence (AI) applications use High-Performance Wide-Area Networks (HP-WANs) for moving massive data sets between remote facilities. |
| A Standards-Aligned Coordination Framework for Edge-Enhanced Collaborative Healthcare in 6G Networks | Liuwang Kang, Fan Wang, Yuzhang Huang, Shang Yan, Jianbin Zheng, Wenbin Lei, Konstantin Yakovlev, Jie Tang, Shaoshan Liu | 2026-03-13 | Download | Mission-critical healthcare applications including real-time intensive care monitoring, ambulance-to-hospital orchestration, and distributed medical imaging inference require workflow-level, time-boun... |
| OFDM Waveform for Monostatic ISAC in 6G: Vision, Approach, and Research Directions | Huacheng Zeng, Kunzhe Song, Geo Jie Zhou, Ruxin Lin | 2026-03-13 | Download | Integrated sensing and communication (ISAC) is widely regarded as a key enabling technology for 6G wireless networks. While extensive research has explored the coexistence of sensing and communication... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| AgentRM: An OS-Inspired Resource Manager for LLM Agent Systems | Jianshu She | 2026-03-13 | Download | Large Language Model (LLM) agent systems have experienced rapid adoption across diverse domains, yet they suffer from critical user experience problems that limit their practical deployment. |