2026-03-05

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Scalable Digital Compute-in-Memory Ising Machines for Robustness Verification of Binary Neural Networks	Madhav Vadlamani, Rahul Singh, Yuyao Kong, Zheng Zhang, Shimeng Yu	2026-03-05	Download	Verification of binary neural network (BNN) robustness is NP-hard, as it can be formulated as a combinatorial search for an adversarial perturbation that induces misclassification.
NL2GDS: LLM-aided interface for Open Source Chip Design	Max Eland, Jeyan Thiyagalingam, Dinesh Pamunuwa, Roshan Weerasekera	2026-03-05	Download	The growing complexity of hardware design and the widening gap between high-level specifications and register-transfer level (RTL) implementation hinder rapid prototyping and system design.
Network Design for Wafer-Scale Systems with Wafer-on-Wafer Hybrid Bonding	Patrick Iff, Tommaso Bonato, Maciej Besta, Luca Benini, Torsten Hoefler	2026-03-05	Download	Transformer-based large language models are increasingly constrained by data movement as communication bandwidth drops sharply beyond the chip boundary.
AI+HW 2035: Shaping the Next Decade	Deming Chen, Jason Cong, Azalia Mirhoseini, Christos Kozyrakis, Subhasish Mitra, Jinjun Xiong, Cliff Young, Anima Anandkumar, Michael Littman, Aron Kirschen, Sophia Shao, Serge Leef, Naresh Shanbhag, Dejan Milojicic, Michael Schulte, Gert Cauwenberghs, Jerry M. Chow, Tri Dao, Kailash Gopalakrishnan, Richard Ho, Hoshik Kim, Kunle Olukotun, David Z. Pan, Mark Ren, Dan Roth, Aarti Singh, Yizhou Sun, Yusu Wang, Yann LeCun, Ruchir Puri	2026-03-05	Download	Artificial intelligence (AI) and hardware (HW) are advancing at unprecedented rates, yet their trajectories have become inseparably intertwined.
Diagnosing FP4 inference: a layer-wise and block-wise sensitivity analysis of NVFP4 and MXFP4	Musa Cim, Burak Topcu, Mahmut Taylan Kandemir	2026-03-05	Download	Quantization addresses the high resource demand for large language models (LLMs) by alleviating memory pressure and bandwidth congestion and providing significantly scaled compute power with a tolerab...
MCEL: Margin-Based Cross-Entropy Loss for Error-Tolerant Quantized Neural Networks	Mikail Yayla, Akash Kumar	2026-03-05	Download	Robustness to bit errors is a key requirement for the reliable use of neural networks (NNs) on emerging approximate computing platforms and error-prone memory technologies.
VMXDOTP: A RISC-V Vector ISA Extension for Efficient Microscaling (MX) Format Acceleration	Max Wipfli, Gamze İslamoğlu, Navaneeth Kunhi Purayil, Angelo Garofalo, Luca Benini	2026-03-05	Download	Compared to the first generation of deep neural networks, dominated by regular, compute-intensive kernels such as matrix multiplications (MatMuls) and convolutions, modern decoder-based transformers i...
Hardware-Software Co-design for 3D-DRAM-based LLM Serving Accelerator	Cong Li, Yihan Yin, Chenhao Xue, Zhao Wang, Fujun Bai, Yixin Guo, Xiping Jiang, Qiang Wu, Yuan Xie, Guangyu Sun	2026-03-05	Download	Large language models (LLMs) have been widely deployed for online generative services, where numerous LLM instances jointly handle workloads with fluctuating request arrival rates and variable request...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
A Lock-Free Work-Stealing Algorithm for Bulk Operations	Raja Sai Nandhan Yadav Kataru, Danial Davarnia, Ali Jannesari	2026-03-05	Download	Work-stealing is a widely used technique for balancing irregular parallel workloads, and most modern runtime systems adopt lock-free work-stealing deques to reduce contention and improve scalability.
Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks	Burak Topcu, Musa Oguzhan Cim, Poovaiah Palangappa, Meena Arunachalam, Mahmut Taylan Kandemir	2026-03-05	Download	Breakthroughs in the generative AI domain have fueled an explosion of large language model (LLM)-powered applications, whose workloads fundamentally consist of sequences of inferences through transfor...
Why Ethereum Needs Fairness Mechanisms that Do Not Depend on Participant Altruism	Patrick Spiesberger, Nils Henrik Beyer, Hannes Hartenstein	2026-03-05	Download	Ethereum's ideals of decentralization and censorship resistance are undermined in practice, motivating ongoing efforts to reestablish these properties.
The Need for Quantitative Resilience Models and Metrics in Classical-Quantum Computing Systems	Santiago Núñez-Corrales	2026-03-05	Download	Increasingly deeper integration of HPC resources and QPUs unveils new challenges in computer architecture and engineering. As a consequence, dependability arises again as a concern encompassing resili...
Radiation Hydrodynamics at Scale: Comparing MPI and Asynchronous Many-Task Runtimes with FleCSI	Alexander Strack, Hartmut Kaiser, Dirk Pflüger	2026-03-05	Download	Writing efficient distributed code remains a labor-intensive and complex endeavor. To simplify application development, the Flexible Computational Science Infrastructure (FleCSI) framework offers a us...
A monitoring system for collecting and aggregating metrics from distributed clouds	Tamara Ranković, Mateja Rilak, Janko Rakonjac, Miloš Simić	2026-03-05	Download	Applications requiring real-time processing of large volumes of data have been the main driver for rethinking the traditional cloud, giving rise to novel cloud models.
Scaling Real-Time Traffic Analytics on Edge-Cloud Fabrics for City-Scale Camera Networks	Akash Sharma, Pranjal Naman, Roopkatha Banerjee, Priyanshu Pansari, Sankalp Gawali, Mayank Arya, Sharath Chandra, Arun Josephraj, Rakshit Ramesh, Punit Rathore, Anirban Chakraborty, Raghu Krishnapuram, Vijay Kovvali, Yogesh Simmhan	2026-03-05	Download	Real-time city-scale traffic analytics requires processing 100s-1000s of CCTV streams under strict latency, bandwidth, and compute limits. We present a scalable AI-driven Intelligent Transportation Sy...
Leveraging Structural Knowledge for Solving Election in Anonymous Networks with Shared Randomness	Jérémie Chalopin, Emmanuel Godard	2026-03-05	Download	We study the classical Election problem in anonymous networks, where solutions can rely on the use of random bits, which may be either shared or unshared among nodes.
PromptTuner: SLO-Aware Elastic System for LLM Prompt Tuning	Wei Gao, Peng Sun, Dmitrii Ustiugov, Tianwei Zhang, Yonggang Wen	2026-03-05	Download	Prompt tuning has become a prominent strategy for enhancing the performance of Large Language Models (LLMs) on downstream tasks. Many IT enterprises now offer Prompt-Tuning-as-a-Service to fulfill the...
FluxSieve: Unifying Streaming and Analytical Data Planes for Scalable Cloud Observability	Adriano Vogel, Sören Henning, Otmar Ertl	2026-03-05	Download	Despite many advances in query optimization, indexing techniques, and data storage, modern data platforms still face difficulties in delivering robust query performance under high concurrency and comp...
The Semantic Arrow of Time, Part V: The Leibniz Bridge -- Toward a Unified Theory of Semantic Time	Paul Borrill	2026-03-05	Download	This is the final paper in the five-part series The Semantic Arrow of Time. Part I identified the FITO category mistake -- treating forward temporal flow as sufficient for establishing meaning.
The Semantic Arrow of Time, Part IV: Why Transactions Fail	Paul Borrill	2026-03-05	Download	This is the fourth of five papers comprising The Semantic Arrow of Time. Parts I-III established that computing's hidden arrow of time is semantic rather than thermodynamic, that bilateral transaction...
Unlocking Python's Cores: Hardware Usage and Energy Implications of Removing the GIL	José Daniel Montoya Salazar	2026-03-05	Download	Python's Global Interpreter Lock prevents execution on more than one CPU core at the same time, even when multiple threads are used. However, starting with Python 3.
The Semantic Arrow of Time, Part III: RDMA and the Completion Fallacy	Paul Borrill	2026-03-05	Download	This is the third of five papers comprising The Semantic Arrow of Time. Parts I and II identified computing's hidden semantic arrow of time, the FITO category mistake, and presented the constructive a...
SLO-Aware Compute Resource Allocation for Prefill-Decode Disaggregated LLM Inference	Luchang Li, Dongfang Li, Bozhao Gong, Yu Zhang	2026-03-05	Download	Prefill-Decode (P/D) disaggregation has emerged as a widely adopted optimization strategy for Large Language Model (LLM) inference. However, there currently exists no well-established methodology for ...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Tureis: Transformer-based Unified Resilience for IoT Devices in Smart Homes	Alireza Borhani, Vafa Andalibi, Bahar Asgari	2026-03-05	Download	Smart-home IoT systems rely on heterogeneous sensor networks whose correctness shapes application behavior and the physical environment. However, these low-cost, resource-constrained sensors are highl...
V2N-Based Algorithm and Communication Protocol for Autonomous Non-Stop Intersections	Lorenzo Farina, Lorenzo Mario Amorosa, Marco Rapelli, Barbara Maví Masini, Claudio Casetti, Alessandro Bazzi	2026-03-05	Download	Intersections are critical areas for road safety and traffic efficiency, accounting for a significant portion of vehicle crashes and fatalities.
Analysis of Proactive Uncoordinated Techniques to Mitigate Interference in FMCW Automotive Radars	Alessandro Bazzi, Francesco Miccoli, Fabrizio Cuccoli, Luca Facheris, Vincent Martinez	2026-03-05	Download	Modern vehicles increasingly rely on advanced driver-assistance systems (ADAS), with radars playing a key role due to their cost-effectiveness and reliable performance.
U-Parking: Distributed UWB-Assisted Autonomous Parking System with Robust Localization and Intelligent Planning	Yiang Wu, Qiong Wu, Pingyi Fan, Kezhi Wang, Wen Chen, Guoqiang Mao, Khaled B. Letaief	2026-03-05	Download	This demonstration presents U-Parking, a distributed Ultra-Wideband (UWB)-assisted autonomous parking system. By integrating Large Language Models (LLMs)-assisted planning with robust fusion localizat...
Adaptive Personalized Federated Reinforcement Learning for RIS-Assisted Aerial Relays in SAGINs with Fluid Antennas	Yuxuan Yang, Bin Lyu, Abbas Jamalipour	2026-03-05	Download	Space-air-ground integrated networks (SAGINs) interconnect satellites, uncrewed aerial vehicles (UAVs), and ground devices to enable flexible and ubiquitous wireless services.
Selfish Cooperation Towards Low-Altitude Economy: Integrated Multi-Service Deployment with Resilient Federated Reinforcement Learning	Yuxuan Yang, Bin Lyu, Abbas Jamalipour	2026-03-05	Download	The low-altitude economy (LAE) is a rapidly emerging paradigm that builds a service-centric economic ecosystem through large-scale and sustainable uncrewed aerial vehicle (UAV)-enabled service provisi...
Body-scale NFC for wearables: human-centric body-scale NFC networking for ultra-low-power wearable devices (Demo of UTokyo Kawahara Lab 2025)	Hideaki Yamamoto, Yifan Li, Wakako Yukita, Tomoyuki Yokota, Takao Someya, Ryo Takahashi, Yoshihiro Kawahara	2026-03-05	Download	Near Field Communication (NFC) is a promising technology for ultra-low-power wearables, yet its short communication range limits its use to narrow-area, point-to-point interactions.
Quantum Algorithms for Network Signal Coordination	Vinayak Dixit, Richard Pech	2026-03-05	Download	There has been increasing interest in developing efficient quantum algorithms for hard classical problems. The Network Signal Coordination (NSC) problem is one such problem known to be NP complete.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks	Burak Topcu, Musa Oguzhan Cim, Poovaiah Palangappa, Meena Arunachalam, Mahmut Taylan Kandemir	2026-03-05	Download	Breakthroughs in the generative AI domain have fueled an explosion of large language model (LLM)-powered applications, whose workloads fundamentally consist of sequences of inferences through transfor...
FluxSieve: Unifying Streaming and Analytical Data Planes for Scalable Cloud Observability	Adriano Vogel, Sören Henning, Otmar Ertl	2026-03-05	Download	Despite many advances in query optimization, indexing techniques, and data storage, modern data platforms still face difficulties in delivering robust query performance under high concurrency and comp...
Rethinking Temporal Models for TinyML: LSTM versus 1D-CNN in Resource-Constrained Devices	Bidyut Saha, Riya Samanta	2026-03-05	Download	Time series classification underpins applications such as human activity recognition, healthcare monitoring, and gesture detection in the IoT domain.
Unlocking Python's Cores: Hardware Usage and Energy Implications of Removing the GIL	José Daniel Montoya Salazar	2026-03-05	Download	Python's Global Interpreter Lock prevents execution on more than one CPU core at the same time, even when multiple threads are used. However, starting with Python 3.