2026-02-11

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Metastable Dynamical Computing with Energy Landscapes: A Primer | Christian Z. Pratt, Kyle J. Ray, James P. Crutchfield | 2026-02-11 | Download | Smartphones, laptops, and data centers are CMOS-based technologies that ushered our world into the information age of the 21st century. Despite their advantages for scalable computing, their implement... |
| A 16 nm 1.60TOPS/W High Utilization DNN Accelerator with 3D Spatial Data Reuse and Efficient Shared Memory Access | Xiaoling Yi, Ryan Antonio, Yunhao Deng, Fanchen Kong, Joren Dumoulin, Jun Yin, Marian Verhelst | 2026-02-11 | Download | Achieving high compute utilization across a wide range of AI workloads is crucial for the efficiency of versatile DNN accelerators. This paper presents the Voltra chip and its utilization-optimised DN... |
| HiFloat4 Format for Language Model Inference | Yuanyong Luo, Jing Huang, Yu Cheng, Ziwei Yu, Kaihua Tang, Xinda Ma, Xin Wang, Anping Tong, Guipeng Hu, Yun Xu, Mehran Taghian, Peng Wu, Guanglin Li, Yunke Peng, Tianchi Hu, Minqi Chen, Michael Bi Mi, Hu Liu, Xiping Zhou, Junsong Wang, Qiang Lin, Heng Liao | 2026-02-11 | Download | This paper introduces HiFloat4 (HiF4), a block floating-point data format tailored for deep learning. Each HiF4 unit packs 64 4-bit elements with 32 bits of shared scaling metadata, averaging 4. |
| Reed-Muller Error-Correction Code Encoder for SFQ-to-CMOS Interface Circuits | Yerzhan Mustafa, Berker Peköz, Selçuk Köse | 2026-02-11 | Download | Data transmission from superconducting digital electronics such as single flux quantum (SFQ) logic to semiconductor (CMOS) circuits is subject to bit errors due to, e.g. |
| Vulnerabilities in Partial TEE-Shielded LLM Inference with Precomputed Noise | Abhishek Saini, Haolin Jiang, Hang Liu | 2026-02-11 | Download | The deployment of large language models (LLMs) on third-party devices requires new ways to protect model intellectual property. While Trusted Execution Environments (TEEs) offer a promising solution, ... |
| From Buffers to Registers: Unlocking Fine-Grained FlashAttention with Hybrid-Bonded 3D NPU Co-Design | Jinxin Yu, Yudong Pan, Mengdi Wang, Huawei Li, Yinhe Han, Xiaowei Li, Ying Wang | 2026-02-11 | Download | Transformer-based models dominate modern AI workloads but exacerbate memory bottlenecks due to their quadratic attention complexity and ever-growing model sizes. |
| Fault Tolerant Design of IGZO-based Binary Search ADCs | Paula Carolina Lozano Duarte, Sule Ozev, Mehdi Tahoori | 2026-02-11 | Download | Thin-film technologies such as Indium Gallium Zinc Oxide (IGZO) enable Flexible Electronics (FE) for emerging applications in wearable sensing, personal health monitoring, and large-area systems. |
| LOREN: Low Rank-Based Code-Rate Adaptation in Neural Receivers | Bram Van Bolderik, Vlado Menkovski, Sonia Heemstra de Groot, Manil Dev Gomony | 2026-02-11 | Download | Neural network based receivers have recently demonstrated superior system-level performance compared to traditional receivers. However, their practicality is limited by high memory and power requireme... |
| DRAMPyML: A Formal Description of DRAM Protocols with Timed Petri Nets | Derek Christ, Thomas Zimmermann, Philippe Barbie, Dmitri Saberi, Yao Yin, Matthias Jung | 2026-02-11 | Download | The JEDEC committee defines various domain-specific DRAM standards. These standards feature increasingly complex and evolving protocol specifications, which are detailed in timing diagrams and command... |
| Scaling Routers with In-Package Optics and High-Bandwidth Memories | Isaac Keslassy, Ilay Yavlovich, Jose Yallouz, Tzu-Chien Hsueh, Yeshaiahu Fainman, Bill Lin | 2026-02-11 | Download | This paper aims to apply two major scaling transformations from the computing packaging industry to internet routers: the heterogeneous integration of high-bandwidth memories (HBMs) and chiplets, as w... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Real Life Is Uncertain. Consensus Should Be Too! | Reginald Frank, Soujanya Ponnapalli, Octavio Lomeli, Neil Giridharan, Marcos K Aguilera, Natacha Crooks | 2026-02-11 | Download | Modern distributed systems rely on consensus protocols to build a fault-tolerant-core upon which they can build applications. Consensus protocols are correct under a specific failure model, where up t... |
| Min-Sum Uniform Coverage Problem by Autonomous Mobile Robots | Animesh Maiti, Abhinav Chakraborty, Bibhuti Das, Subhash Bhagat, Krishnendu Mukhopadhyaya | 2026-02-11 | Download | We study the min-sum uniform coverage problem for a swarm of n mobile robots on a given finite line segment and on a circle having finite positive radius, where the circle is given as an in... |
| Fine-Tuning GPT-5 for GPU Kernel Generation | Ali Tehrani, Yahya Emara, Essam Wissam, Wojciech Paluch, Waleed Atallah, Łukasz Dudziak, Mohamed S. Abdelfattah | 2026-02-11 | Download | Developing efficient GPU kernels is essential for scaling modern AI systems, yet it remains a complex task due to intricate hardware architectures and the need for specialized optimization expertise. |
| Ask the Expert: Collaborative Inference for Vision Transformers with Near-Edge Accelerators | Hao Liu, Suhaib A. Fahmy | 2026-02-11 | Download | Deploying Vision Transformers on edge devices is challenging due to their high computational complexity, while full offloading to cloud resources presents significant latency overheads. |
| BOute: Cost-Efficient LLM Serving with Heterogeneous LLMs and GPUs via Multi-Objective Bayesian Optimization | Youhe Jiang, Fangcheng Fu, Eiko Yoneki | 2026-02-11 | Download | The rapid growth of large language model (LLM) deployments has made cost-efficient serving systems essential. Recent efforts to enhance system cost-efficiency adopt two main perspectives: (i) An algor... |
| Computing Least Fixed Points with Overwrite Semantics in Parallel and Distributed Systems | Vijay K. Garg, Rohan Garg | 2026-02-11 | Download | We present methods to compute least fixed points of multiple monotone inflationary functions in parallel and distributed settings. While the classic Knaster-Tarski theorem addresses a single function ... |
| Authenticated Workflows: A Systems Approach to Protecting Agentic AI | Mohan Rajagopalan, Vinay Rao | 2026-02-11 | Download | Agentic AI systems automate enterprise workflows but existing defenses--guardrails, semantic filters--are probabilistic and routinely bypassed. |
| Chamfer-Linkage for Hierarchical Agglomerative Clustering | Kishen N Gowda, Willem Fletcher, MohammadHossein Bateni, Laxman Dhulipala, D Ellis Hershkowitz, Rajesh Jayaram, Jakub Łącki | 2026-02-11 | Download | Hierarchical Agglomerative Clustering (HAC) is a widely-used clustering method based on repeatedly merging the closest pair of clusters, where inter-cluster distances are determined by a linkage funct... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Multi Layer Protection Against Low Rate DDoS Attacks in Containerized Systems | Ahmad Fareed, Bilal Al Habib, Anne Pepita Francis | 2026-02-11 | Download | Low rate Distributed Denial of Service DDoS attacks have emerged as a major threat to containerized cloud infrastructures. Due to their low traffic volumes, these attacks can be difficult to detect an... |
| WHEREIS: IP Address Registration Geo-Consistency | Robert Beverly, Amreesh Phokeer, Oliver Gasser | 2026-02-11 | Download | The five Regional Internet Registries (RIRs) provide the critical function of IP address resource delegation and registration. The accuracy of registration data directly impacts Internet operation, m... |
| A Robust Optimization Approach for Regenerator Placement in Fault-Tolerant Networks Under Discrete Cost Uncertainty | Mohammad Khosravi, Setareh Maghsudi | 2026-02-11 | Download | We focus on robust, survivable communication networks, where network links and nodes are affected by an uncertainty set. In this sense, any network links might fail. |
| AI Infrastructure Sovereignty | Sergio Cruzes | 2026-02-11 | Download | Artificial intelligence has shifted from a software-centric discipline to an infrastructure-driven system. Large-scale training and inference increasingly depend on tightly coupled data centers, high-... |
| Less is More: The Dilution Effect in Multi-Link Wireless Sensing | Karim Khamaisi, Bruno Rodrigues | 2026-02-11 | Download | Wireless sensing approaches promise to transform smart infrastructures into privacy-preserving motion detectors, yet commercial adoption remains limited. |
| Security, Privacy and System-Level Resillience of 6G End-to-End System: Hexa-X-II Perspective | Pawani Porambage, Diego Lopez, Antonio Pastor, Bin Han, José María Jorquera Valero, Manuel Gil Pérez, Noelia Pérez Palma, Antonio Skarmeta, Prajnamaya Dass, Stefan Köpsell, Sonika Ujjwal, Javier José Díaz Rivera, Pol Alemany, Raul Muñoz, Jafar Mohammadi, Chaitanya Aggarwal, Betul Guvenc Paltun, Ferhat Karakoc | 2026-02-11 | Download | The sixth generation (6G) of mobile networks are being developed to overcome limitations in previous generations and meet emerging user demands. |
| Supercharging Packet-level Network Simulation of Large Model Training via Memoization and Fast-Forwarding | Fei Long, Kaihui Gao, Li Chen, Dan Li, Yiwei Zhang, Fei Gui, Yitao Xing, Wenjia Wei, Bingyang Liu | 2026-02-11 | Download | Packet-level discrete-event simulation (PLDES) is a prevalent tool for evaluating detailed performance of large model training. Although PLDES offers high fidelity and generality, its slow performance... |
| SplitCom: Communication-efficient Split Federated Fine-tuning of LLMs via Temporal Compression | Tao Li, Yulin Tang, Yiyang Song, Cong Wu, Xihui Liu, Pan Li, Xianhao Chen | 2026-02-11 | Download | Federated fine-tuning of on-device large language models (LLMs) mitigates privacy concerns by preventing raw data sharing. However, the intensive computational and memory demands pose significant chal... |
| Predictive-State Communication: Innovation Coding and Reconciliation under Delay | Ozgur Ercetin, Mohaned Chraiti | 2026-02-11 | Download | Shannon theory models communication as the reliable transfer of symbol sequences, with performance governed by capacity and rate-distortion limits. |
| Scaling Routers with In-Package Optics and High-Bandwidth Memories | Isaac Keslassy, Ilay Yavlovich, Jose Yallouz, Tzu-Chien Hsueh, Yeshaiahu Fainman, Bill Lin | 2026-02-11 | Download | This paper aims to apply two major scaling transformations from the computing packaging industry to internet routers: the heterogeneous integration of high-bandwidth memories (HBMs) and chiplets, as w... |
| To Reconfigure or Not to Reconfigure: Optimizing All-to-All Collectives in Circuit-Switched Photonic Interconnects | Anchengcheng Zhou, Vamsi Addanki, Maria Apostolaki | 2026-02-11 | Download | All-to-all collective communication is a core primitive in distributed machine learning and high-performance computing. At the server scale, the communication demands of these workloads are increasing... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Hardening the OSv Unikernel with Efficient Address Randomization: Design and Performance Evaluation | Alex Wollman, John Hastings | 2026-02-11 | Download | Unikernels are single-purpose library operating systems that run the kernel and application in one address space, but often omit security mitigations such as address space layout randomization (ASLR). |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Resource-Efficient RGB-Only Action Recognition for Edge Deployment | Dongsik Yoon, Jongeun Kim, Dayeon Lee | 2026-02-11 | Download | Action recognition on edge devices poses stringent constraints on latency, memory, storage, and power consumption. While auxiliary modalities such as skeleton and depth information can enhance recogni... |
| Supercharging Packet-level Network Simulation of Large Model Training via Memoization and Fast-Forwarding | Fei Long, Kaihui Gao, Li Chen, Dan Li, Yiwei Zhang, Fei Gui, Yitao Xing, Wenjia Wei, Bingyang Liu | 2026-02-11 | Download | Packet-level discrete-event simulation (PLDES) is a prevalent tool for evaluating detailed performance of large model training. Although PLDES offers high fidelity and generality, its slow performance... |