2026-03-05
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Scalable Digital Compute-in-Memory Ising Machines for Robustness Verification of Binary Neural Networks | Madhav Vadlamani, Rahul Singh, Yuyao Kong, Zheng Zhang, Shimeng Yu | 2026-03-05 | Download | Verification of binary neural network (BNN) robustness is NP-hard, as it can be formulated as a combinatorial search for an adversarial perturbation that induces misclassification. |
| NL2GDS: LLM-aided interface for Open Source Chip Design | Max Eland, Jeyan Thiyagalingam, Dinesh Pamunuwa, Roshan Weerasekera | 2026-03-05 | Download | The growing complexity of hardware design and the widening gap between high-level specifications and register-transfer level (RTL) implementation hinder rapid prototyping and system design. |
| Network Design for Wafer-Scale Systems with Wafer-on-Wafer Hybrid Bonding | Patrick Iff, Tommaso Bonato, Maciej Besta, Luca Benini, Torsten Hoefler | 2026-03-05 | Download | Transformer-based large language models are increasingly constrained by data movement as communication bandwidth drops sharply beyond the chip boundary. |
| AI+HW 2035: Shaping the Next Decade | Deming Chen, Jason Cong, Azalia Mirhoseini, Christos Kozyrakis, Subhasish Mitra, Jinjun Xiong, Cliff Young, Anima Anandkumar, Michael Littman, Aron Kirschen, Sophia Shao, Serge Leef, Naresh Shanbhag, Dejan Milojicic, Michael Schulte, Gert Cauwenberghs, Jerry M. Chow, Tri Dao, Kailash Gopalakrishnan, Richard Ho, Hoshik Kim, Kunle Olukotun, David Z. Pan, Mark Ren, Dan Roth, Aarti Singh, Yizhou Sun, Yusu Wang, Yann LeCun, Ruchir Puri | 2026-03-05 | Download | Artificial intelligence (AI) and hardware (HW) are advancing at unprecedented rates, yet their trajectories have become inseparably intertwined. |
| Diagnosing FP4 inference: a layer-wise and block-wise sensitivity analysis of NVFP4 and MXFP4 | Musa Cim, Burak Topcu, Mahmut Taylan Kandemir | 2026-03-05 | Download | Quantization addresses the high resource demand for large language models (LLMs) by alleviating memory pressure and bandwidth congestion and providing significantly scaled compute power with a tolerab... |
| MCEL: Margin-Based Cross-Entropy Loss for Error-Tolerant Quantized Neural Networks | Mikail Yayla, Akash Kumar | 2026-03-05 | Download | Robustness to bit errors is a key requirement for the reliable use of neural networks (NNs) on emerging approximate computing platforms and error-prone memory technologies. |
| VMXDOTP: A RISC-V Vector ISA Extension for Efficient Microscaling (MX) Format Acceleration | Max Wipfli, Gamze İslamoğlu, Navaneeth Kunhi Purayil, Angelo Garofalo, Luca Benini | 2026-03-05 | Download | Compared to the first generation of deep neural networks, dominated by regular, compute-intensive kernels such as matrix multiplications (MatMuls) and convolutions, modern decoder-based transformers i... |
| Hardware-Software Co-design for 3D-DRAM-based LLM Serving Accelerator | Cong Li, Yihan Yin, Chenhao Xue, Zhao Wang, Fujun Bai, Yixin Guo, Xiping Jiang, Qiang Wu, Yuan Xie, Guangyu Sun | 2026-03-05 | Download | Large language models (LLMs) have been widely deployed for online generative services, where numerous LLM instances jointly handle workloads with fluctuating request arrival rates and variable request... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A Lock-Free Work-Stealing Algorithm for Bulk Operations | Raja Sai Nandhan Yadav Kataru, Danial Davarnia, Ali Jannesari | 2026-03-05 | Download | Work-stealing is a widely used technique for balancing irregular parallel workloads, and most modern runtime systems adopt lock-free work-stealing deques to reduce contention and improve scalability. |
| Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks | Burak Topcu, Musa Oguzhan Cim, Poovaiah Palangappa, Meena Arunachalam, Mahmut Taylan Kandemir | 2026-03-05 | Download | Breakthroughs in the generative AI domain have fueled an explosion of large language model (LLM)-powered applications, whose workloads fundamentally consist of sequences of inferences through transfor... |
| Why Ethereum Needs Fairness Mechanisms that Do Not Depend on Participant Altruism | Patrick Spiesberger, Nils Henrik Beyer, Hannes Hartenstein | 2026-03-05 | Download | Ethereum's ideals of decentralization and censorship resistance are undermined in practice, motivating ongoing efforts to reestablish these properties. |
| The Need for Quantitative Resilience Models and Metrics in Classical-Quantum Computing Systems | Santiago Núñez-Corrales | 2026-03-05 | Download | Increasingly deeper integration of HPC resources and QPUs unveils new challenges in computer architecture and engineering. As a consequence, dependability arises again as a concern encompassing resili... |
| Radiation Hydrodynamics at Scale: Comparing MPI and Asynchronous Many-Task Runtimes with FleCSI | Alexander Strack, Hartmut Kaiser, Dirk Pflüger | 2026-03-05 | Download | Writing efficient distributed code remains a labor-intensive and complex endeavor. To simplify application development, the Flexible Computational Science Infrastructure (FleCSI) framework offers a us... |
| A monitoring system for collecting and aggregating metrics from distributed clouds | Tamara Ranković, Mateja Rilak, Janko Rakonjac, Miloš Simić | 2026-03-05 | Download | Applications requiring real-time processing of large volumes of data have been the main driver for rethinking the traditional cloud, giving rise to novel cloud models. |
| Scaling Real-Time Traffic Analytics on Edge-Cloud Fabrics for City-Scale Camera Networks | Akash Sharma, Pranjal Naman, Roopkatha Banerjee, Priyanshu Pansari, Sankalp Gawali, Mayank Arya, Sharath Chandra, Arun Josephraj, Rakshit Ramesh, Punit Rathore, Anirban Chakraborty, Raghu Krishnapuram, Vijay Kovvali, Yogesh Simmhan | 2026-03-05 | Download | Real-time city-scale traffic analytics requires processing 100s-1000s of CCTV streams under strict latency, bandwidth, and compute limits. We present a scalable AI-driven Intelligent Transportation Sy... |
| Leveraging Structural Knowledge for Solving Election in Anonymous Networks with Shared Randomness | Jérémie Chalopin, Emmanuel Godard | 2026-03-05 | Download | We study the classical Election problem in anonymous networks, where solutions can rely on the use of random bits, which may be either shared or unshared among nodes. |
| PromptTuner: SLO-Aware Elastic System for LLM Prompt Tuning | Wei Gao, Peng Sun, Dmitrii Ustiugov, Tianwei Zhang, Yonggang Wen | 2026-03-05 | Download | Prompt tuning has become a prominent strategy for enhancing the performance of Large Language Models (LLMs) on downstream tasks. Many IT enterprises now offer Prompt-Tuning-as-a-Service to fulfill the... |
| FluxSieve: Unifying Streaming and Analytical Data Planes for Scalable Cloud Observability | Adriano Vogel, Sören Henning, Otmar Ertl | 2026-03-05 | Download | Despite many advances in query optimization, indexing techniques, and data storage, modern data platforms still face difficulties in delivering robust query performance under high concurrency and comp... |
| The Semantic Arrow of Time, Part V: The Leibniz Bridge -- Toward a Unified Theory of Semantic Time | Paul Borrill | 2026-03-05 | Download | This is the final paper in the five-part series The Semantic Arrow of Time. Part I identified the FITO category mistake -- treating forward temporal flow as sufficient for establishing meaning. |
| The Semantic Arrow of Time, Part IV: Why Transactions Fail | Paul Borrill | 2026-03-05 | Download | This is the fourth of five papers comprising The Semantic Arrow of Time. Parts I-III established that computing's hidden arrow of time is semantic rather than thermodynamic, that bilateral transaction... |
| Unlocking Python's Cores: Hardware Usage and Energy Implications of Removing the GIL | José Daniel Montoya Salazar | 2026-03-05 | Download | Python's Global Interpreter Lock prevents execution on more than one CPU core at the same time, even when multiple threads are used. However, starting with Python 3. |
| The Semantic Arrow of Time, Part III: RDMA and the Completion Fallacy | Paul Borrill | 2026-03-05 | Download | This is the third of five papers comprising The Semantic Arrow of Time. Parts I and II identified computing's hidden semantic arrow of time, the FITO category mistake, and presented the constructive a... |
| SLO-Aware Compute Resource Allocation for Prefill-Decode Disaggregated LLM Inference | Luchang Li, Dongfang Li, Bozhao Gong, Yu Zhang | 2026-03-05 | Download | Prefill-Decode (P/D) disaggregation has emerged as a widely adopted optimization strategy for Large Language Model (LLM) inference. However, there currently exists no well-established methodology for ... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Tureis: Transformer-based Unified Resilience for IoT Devices in Smart Homes | Alireza Borhani, Vafa Andalibi, Bahar Asgari | 2026-03-05 | Download | Smart-home IoT systems rely on heterogeneous sensor networks whose correctness shapes application behavior and the physical environment. However, these low-cost, resource-constrained sensors are highl... |
| V2N-Based Algorithm and Communication Protocol for Autonomous Non-Stop Intersections | Lorenzo Farina, Lorenzo Mario Amorosa, Marco Rapelli, Barbara Maví Masini, Claudio Casetti, Alessandro Bazzi | 2026-03-05 | Download | Intersections are critical areas for road safety and traffic efficiency, accounting for a significant portion of vehicle crashes and fatalities. |
| Analysis of Proactive Uncoordinated Techniques to Mitigate Interference in FMCW Automotive Radars | Alessandro Bazzi, Francesco Miccoli, Fabrizio Cuccoli, Luca Facheris, Vincent Martinez | 2026-03-05 | Download | Modern vehicles increasingly rely on advanced driver-assistance systems (ADAS), with radars playing a key role due to their cost-effectiveness and reliable performance. |
| U-Parking: Distributed UWB-Assisted Autonomous Parking System with Robust Localization and Intelligent Planning | Yiang Wu, Qiong Wu, Pingyi Fan, Kezhi Wang, Wen Chen, Guoqiang Mao, Khaled B. Letaief | 2026-03-05 | Download | This demonstration presents U-Parking, a distributed Ultra-Wideband (UWB)-assisted autonomous parking system. By integrating Large Language Models (LLMs)-assisted planning with robust fusion localizat... |
| Adaptive Personalized Federated Reinforcement Learning for RIS-Assisted Aerial Relays in SAGINs with Fluid Antennas | Yuxuan Yang, Bin Lyu, Abbas Jamalipour | 2026-03-05 | Download | Space-air-ground integrated networks (SAGINs) interconnect satellites, uncrewed aerial vehicles (UAVs), and ground devices to enable flexible and ubiquitous wireless services. |
| Selfish Cooperation Towards Low-Altitude Economy: Integrated Multi-Service Deployment with Resilient Federated Reinforcement Learning | Yuxuan Yang, Bin Lyu, Abbas Jamalipour | 2026-03-05 | Download | The low-altitude economy (LAE) is a rapidly emerging paradigm that builds a service-centric economic ecosystem through large-scale and sustainable uncrewed aerial vehicle (UAV)-enabled service provisi... |
| Body-scale NFC for wearables: human-centric body-scale NFC networking for ultra-low-power wearable devices (Demo of UTokyo Kawahara Lab 2025) | Hideaki Yamamoto, Yifan Li, Wakako Yukita, Tomoyuki Yokota, Takao Someya, Ryo Takahashi, Yoshihiro Kawahara | 2026-03-05 | Download | Near Field Communication (NFC) is a promising technology for ultra-low-power wearables, yet its short communication range limits its use to narrow-area, point-to-point interactions. |
| Quantum Algorithms for Network Signal Coordination | Vinayak Dixit, Richard Pech | 2026-03-05 | Download | There has been increasing interest in developing efficient quantum algorithms for hard classical problems. The Network Signal Coordination (NSC) problem is one such problem known to be NP-complete. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Parallelization Strategies for Dense LLM Deployment: Navigating Through Application-Specific Tradeoffs and Bottlenecks | Burak Topcu, Musa Oguzhan Cim, Poovaiah Palangappa, Meena Arunachalam, Mahmut Taylan Kandemir | 2026-03-05 | Download | Breakthroughs in the generative AI domain have fueled an explosion of large language model (LLM)-powered applications, whose workloads fundamentally consist of sequences of inferences through transfor... |
| FluxSieve: Unifying Streaming and Analytical Data Planes for Scalable Cloud Observability | Adriano Vogel, Sören Henning, Otmar Ertl | 2026-03-05 | Download | Despite many advances in query optimization, indexing techniques, and data storage, modern data platforms still face difficulties in delivering robust query performance under high concurrency and comp... |
| Rethinking Temporal Models for TinyML: LSTM versus 1D-CNN in Resource-Constrained Devices | Bidyut Saha, Riya Samanta | 2026-03-05 | Download | Time series classification underpins applications such as human activity recognition, healthcare monitoring, and gesture detection in the IoT domain. |
| Unlocking Python's Cores: Hardware Usage and Energy Implications of Removing the GIL | José Daniel Montoya Salazar | 2026-03-05 | Download | Python's Global Interpreter Lock prevents execution on more than one CPU core at the same time, even when multiple threads are used. However, starting with Python 3. |