
2026-01-27

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| How Much Progress Has There Been in NVIDIA Datacenter GPUs? | Emanuele Del Sozzo, Martin Fleming, Kenneth Flamm, Neil Thompson | 2026-01-27 | Download | Graphics Processing Units (GPUs) are the state-of-the-art architecture for essential tasks, ranging from rendering 2D/3D graphics to accelerating workloads in supercomputing centers and, of course, Ar... |
| A Paradigm for Generalized Multi-Level Priority Encoders | Maxwell Phillips, Firas Hassan, Ahmed Ammar | 2026-01-27 | Download | Priority encoders are typically considered expensive hardware components in terms of complexity, especially at high bit precisions or input lengths (e.g., above 512 bits). |
| Primitive-Driven Acceleration of Hyperdimensional Computing for Real-Time Image Classification | Dhruv Parikh, Jebacyril Arockiaraj, Viktor Prasanna | 2026-01-27 | Download | Hyperdimensional Computing (HDC) represents data using extremely high-dimensional, low-precision vectors, termed hypervectors (HVs), and performs learning and inference through lightweight, noise-tole... |
| Veri-Sure: A Contract-Aware Multi-Agent Framework with Temporal Tracing and Formal Verification for Correct RTL Code Generation | Jiale Liu, Taiyu Zhou, Tianqi Jiang | 2026-01-27 | Download | In the rapidly evolving field of Electronic Design Automation (EDA), the deployment of Large Language Models (LLMs) for Register-Transfer Level (RTL) design has emerged as a promising direction. |
| GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping | Julien Eudine, Chu Li, Zhuo Cheng, Renzo Andri, Can Firtina, Mohammad Sadrosadati, Nika Mansouri Ghiasi, Konstantina Koliogeorgi, Anirban Nag, Arash Tavakkol, Haiyu Mao, Onur Mutlu, Shai Bergman, Ji Zhang | 2026-01-27 | Download | Genome sequencing has become a central focus in computational biology. A genome study typically begins with sequencing, which produces millions to billions of short DNA fragments known as reads. |
| A Reconfigurable Framework for AI-FPGA Agent Integration and Acceleration | Aybars Yunusoglu, Talha Coskun, Hiruna Vishwamith, Murat Isik, I. Can Dikmen | 2026-01-27 | Download | Artificial intelligence (AI) is increasingly deployed in real-time and energy-constrained environments, driving demand for hardware platforms that can deliver high performance and power efficiency. |
| M2XFP: A Metadata-Augmented Microscaling Data Format for Efficient Low-bit Quantization | Weiming Hu, Zihan Zhang, Haoyan Zhang, Chen Zhang, Cong Guo, Yu Feng, Tianchi Hu, Guanglin Li, Guipeng Hu, Junsong Wang, Jingwen Leng | 2026-01-27 | Download | Existing low-bit Microscaling (MX) formats, such as MXFP4, often suffer from substantial accuracy degradation due to the use of a shared scaling factor with the Power-of-Two format. |
| In-Network Collective Operations: Game Changer or Challenge for AI Workloads? | Torsten Hoefler, Mikhail Khalilov, Josiah Clark, Surendra Anubolu, Mohan Kalkunte, Karen Schramm, Eric Spada, Duncan Roweth, Keith Underwood, Adrian Caulfield, Abdul Kabbani, Amirreza Rastegari | 2026-01-27 | Download | This paper summarizes the opportunities of in-network collective operations (INC) for accelerated collective operations in AI workloads. We provide sufficient detail to make this important field acces... |
| Probabilistic Sensing: Intelligence in Data Sampling | Ibrahim Albulushi, Saleh Bunaiyan, Suraj S. Cheema, Hesham ElSawy, Feras Al-Dirini | 2026-01-27 | Download | Extending the intelligence of sensors to the data-acquisition process - deciding whether to sample or not - can result in transformative energy-efficiency gains. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A Data-Informed Local Subspaces Method for Error-Bounded Lossy Compression of Large-Scale Scientific Datasets | Arshan Khan, Rohit Deshmukh, Ben O'Neill | 2026-01-27 | Download | The growing volume of scientific simulation data presents a significant challenge for storage and transfer. Error-bounded lossy compression has emerged as a critical solution for mitigating these chal... |
| Mapping Gemma3 onto an Edge Dataflow Architecture | Shouyu Du, Miaoxiang Yu, Zhenyu Xu, Zhiheng Ni, Jillian Cai, Qing Yang, Tao Wei | 2026-01-27 | Download | We present the first end-to-end deployment of the Gemma3 family of large language and vision models on a tiled edge dataflow architecture (AMD Ryzen AI NPU). |
| Delta Fair Sharing: Performance Isolation for Multi-Tenant Storage Systems | Tyler Griggs, Soujanya Ponnapalli, Dev Bali, Wenjie Ma, James DeLoye, Audrey Cheng, Jaewan Hong, Natacha Crooks, Scott Shenker, Ion Stoica, Matei Zaharia | 2026-01-27 | Download | Modern storage systems, often deployed to support multiple tenants in the cloud, must provide performance isolation. Unfortunately, traditional approaches such as fair sharing do not provide performan... |
| Enabling SSI-Compliant Use of EUDI Wallet Credentials through Trusted Execution Environment and Zero-Knowledge Proof | Nacereddine Sitouah, Francesco Bruschi, Stefano De Cillis | 2026-01-27 | Download | The passing of the eIDAS amendment marks an important milestone for EU countries and changes how they must manage digital credentials for both public services and businesses. |
| Self-Sovereign Identity and eIDAS 2.0: An Analysis of Control, Privacy, and Legal Implications | Nacereddine Sitouah, Marco Esposito, Francesco Bruschi | 2026-01-27 | Download | European digital identity initiatives are grounded in regulatory frameworks designed to ensure interoperability and robust, harmonized security standards. |
| Knowledge-Aware Evolution for Streaming Federated Continual Learning with Category Overlap and without Task Identifiers | Sixing Tan, Xianmin Liu | 2026-01-27 | Download | Federated Continual Learning (FCL) leverages inter-client collaboration to balance new knowledge acquisition and prior knowledge retention in non-stationary data. |
| Accelerating radio astronomy imaging with RICK: a step towards SKA-Mid and SKA-Low | Giovanni Lacopo, Emanuele De Rubeis, Claudio Gheller, Giuliano Taffoni, Luca Tornatore | 2026-01-27 | Download | The data volumes generated by modern radio interferometers, such as the SKA precursors, present significant computational challenges for imaging pipelines. |
| Convex Hull 3D Filtering with GPU Ray Tracing and Tensor Cores | Roberto Carrasco, Enzo Meneses, Hector Ferrada, Cristobal A. Navarro, Nancy Hitschfeld | 2026-01-27 | Download | In recent years, applications such as real-time simulations, autonomous systems, and video games increasingly demand the processing of complex geometric models under stringent time constraints. |
| Modular Foundation Model Inference at the Edge: Network-Aware Microservice Optimization | Juan Zhu, Zixin Wang, Shenghui Song, Jun Zhang, Khaled Ben Letaief | 2026-01-27 | Download | Foundation models (FMs) unlock unprecedented multimodal and multitask intelligence, yet their cloud-centric deployment precludes real-time responsiveness and compromises user privacy. |
| NET4EXA: Pioneering the Future of Interconnects for Supercomputing and AI | Michele Martinelli, Roberto Ammendola, Andrea Biagioni, Carlotta Chiarini, Ottorino Frezza, Francesca Lo Cicero, Alessandro Lonardo, Pier Stanislao Paolucci, Elena Pastorelli, Pierpaolo Perticaroli, Luca Pontisso, Cristian Rossi, Francesco Simula, Piero Vicini, David Colin, Grégoire Pichon, Alexandre Louvet, John Gliksberg, Claire Chen, Matteo Turisini, Andrea Monterubbiano, Jean-Philippe Nominé, Denis Dutoit, Hugo Taboada, Lilia Zaourar, Mohamed Benazouz, Angelos Bilas, Fabien Chaix, Manolis Katevenis, Nikolaos Chrysos, Evangelos Mageiropoulos, Christos Kozanitis, Thomas Moen, Steffen Persvold, Einar Rustad, Sandro Fiore, Fabrizio Granelli, Simone Pezzuto, Raffaello Potestio, Luca Tubiana, Philippe Velha, Flavio Vella, Daniele De Sensi, Salvatore Pontarelli | 2026-01-27 | Download | NET4EXA aims to develop a next-generation high-performance interconnect for HPC and AI systems, addressing the increasing demands of large-scale infrastructures, such as those required for training La... |
| Decentralized Nonsmooth Nonconvex Optimization with Client Sampling | Xinyan Chen, Weiguo Gao, Luo Luo | 2026-01-27 | Download | This paper considers decentralized nonsmooth nonconvex optimization problem with Lipschitz continuous local functions. We propose an efficient stochastic first-order method with client sampling, achie... |
| Revisiting Parameter Server in LLM Post-Training | Xinyi Wan, Penghui Qi, Guangxing Huang, Chaoyi Ruan, Min Lin, Jialin Li | 2026-01-27 | Download | Modern data parallel (DP) training favors collective communication over parameter servers (PS) for its simplicity and efficiency under balanced workloads. |
| KUBEDIRECT: Unleashing the Full Power of the Cluster Manager for Serverless Computing | Sheng Qi, Zhiquan Zhang, Xuanzhe Liu, Xin Jin | 2026-01-27 | Download | FaaS platforms rely on cluster managers like Kubernetes for resource management. Kubernetes is popular due to its state-centric APIs that decouple the control plane into modular controllers. |
| Native LLM and MLLM Inference at Scale on Apple Silicon | Wayner Barrios | 2026-01-27 | Download | The growing adoption of Apple Silicon for machine learning development has created demand for efficient inference solutions that leverage its unique unified memory architecture. |
| Axe: A Simple Unified Layout Abstraction for Machine Learning Compilers | Bohan Hou, Hongyi Jin, Guanjie Wang, Jinqi Chen, Yaxing Cai, Lijie Yang, Zihao Ye, Yaoyao Ding, Ruihang Lai, Tianqi Chen | 2026-01-27 | Download | Scaling modern deep learning workloads demands coordinated placement of data and compute across device meshes, memory hierarchies, and heterogeneous accelerators. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Quantum Takes Flight: Two-Stage Resilient Topology Optimization for UAV Networks | Huixiang Zhang, Mahzabeen Emu, Octavia A. Dobre | 2026-01-27 | Download | Next-generation Unmanned Aerial Vehicle (UAV) communication networks must maintain reliable connectivity under rapid topology changes, fluctuating link quality, and time-critical data exchange. |
| An Agentic AI Control Plane for 6G Network Slice Orchestration, Monitoring, and Trading | Eranga Bandara, Ross Gore, Sachin Shetty, Ravi Mukkamala, Tharaka Hewa, Abdul Rahman, Xueping Liang, Safdar H. Bouk, Amin Hass, Peter Foytik, Ng Wee Keong, Kasun De Zoysa | 2026-01-27 | Download | 6G networks are expected to be AI-native, intent-driven, and economically programmable, requiring fundamentally new approaches to network slice orchestration. |
| NET4EXA: Pioneering the Future of Interconnects for Supercomputing and AI | Michele Martinelli, Roberto Ammendola, Andrea Biagioni, Carlotta Chiarini, Ottorino Frezza, Francesca Lo Cicero, Alessandro Lonardo, Pier Stanislao Paolucci, Elena Pastorelli, Pierpaolo Perticaroli, Luca Pontisso, Cristian Rossi, Francesco Simula, Piero Vicini, David Colin, Grégoire Pichon, Alexandre Louvet, John Gliksberg, Claire Chen, Matteo Turisini, Andrea Monterubbiano, Jean-Philippe Nominé, Denis Dutoit, Hugo Taboada, Lilia Zaourar, Mohamed Benazouz, Angelos Bilas, Fabien Chaix, Manolis Katevenis, Nikolaos Chrysos, Evangelos Mageiropoulos, Christos Kozanitis, Thomas Moen, Steffen Persvold, Einar Rustad, Sandro Fiore, Fabrizio Granelli, Simone Pezzuto, Raffaello Potestio, Luca Tubiana, Philippe Velha, Flavio Vella, Daniele De Sensi, Salvatore Pontarelli | 2026-01-27 | Download | NET4EXA aims to develop a next-generation high-performance interconnect for HPC and AI systems, addressing the increasing demands of large-scale infrastructures, such as those required for training La... |
| Bridging Visual and Wireless Sensing: A Unified Radiation Field for 3D Radio Map Construction | Chaozheng Wen, Jingwen Tong, Zehong Lin, Chenghong Bian, Jun Zhang | 2026-01-27 | Download | The emerging applications of next-generation wireless networks (e.g., immersive 3D communication, low-altitude networks, and integrated sensing and communication) necessitate high-fidelity environment... |
| Enabling SLO-Aware 5G Multi-Access Edge Computing with SMEC | Xiao Zhang, Daehyeok Kim | 2026-01-27 | Download | Multi-access edge computing (MEC) promises to enable latency-critical applications by bringing computational power closer to mobile devices, but our measurements on commercial MEC deployments reveal f... |
| In-Network Collective Operations: Game Changer or Challenge for AI Workloads? | Torsten Hoefler, Mikhail Khalilov, Josiah Clark, Surendra Anubolu, Mohan Kalkunte, Karen Schramm, Eric Spada, Duncan Roweth, Keith Underwood, Adrian Caulfield, Abdul Kabbani, Amirreza Rastegari | 2026-01-27 | Download | This paper summarizes the opportunities of in-network collective operations (INC) for accelerated collective operations in AI workloads. We provide sufficient detail to make this important field acces... |
| UAV-Mounted Aerial Relays in Military Communications: A Comprehensive Survey | Faisal Al-Kamali, Francois Chan, Hussein A. Ammar, James H. Bayes, Claude D'Amours | 2026-01-27 | Download | Relays are pivotal in military communication networks, expanding coverage and ensuring reliable connectivity in challenging operational environments. |
| FTA-NTN: Fairness and Throughput Assurance in Non-Terrestrial Networks | Sachin Ravikant Trankatwar, Heiko Straulino, Petar Djukic, Burak Kantarci | 2026-01-27 | Download | Designing optimal non-terrestrial network (NTN) constellations is essential for maximizing throughput and ensuring fair resource distribution. |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| In-Network Collective Operations: Game Changer or Challenge for AI Workloads? | Torsten Hoefler, Mikhail Khalilov, Josiah Clark, Surendra Anubolu, Mohan Kalkunte, Karen Schramm, Eric Spada, Duncan Roweth, Keith Underwood, Adrian Caulfield, Abdul Kabbani, Amirreza Rastegari | 2026-01-27 | Download | This paper summarizes the opportunities of in-network collective operations (INC) for accelerated collective operations in AI workloads. We provide sufficient detail to make this important field acces... |
