2026-03-27

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Efficient CMOS Invertible Logic Using Stochastic Computing	Sean C. Smithson, Naoya Onizawa, Brett H. Meyer, Warren J. Gross, Takahiro Hanyu	2026-03-27	Download	Invertible logic can operate in one of two modes: 1) a forward mode, in which inputs are presented and a single, correct output is produced, and 2) a reverse mode, in which the output is fixed and the...
Who Checks the Checker? Enhancing Component-level Architectural SEU Fault Tolerance for End-to-End SoC Protection	Michael Rogenmoser, Philippe Sauter, Chen Wu, Angelo Garofalo, Luca Benini	2026-03-27	Download	Single-event upset (SEU) fault tolerance for systems-on-chip (SoCs) in radiation-heavy environments is often addressed by architectural fault-tolerance approaches protecting individual SoC components ...
A Lightweight High-Throughput Collective-Capable NoC for Large-Scale ML Accelerators	Luca Colagrande, Lorenzo Leone, Chen Wu, Tim Fischer, Raphael Roth, Luca Benini	2026-03-27	Download	The exponential increase in Machine Learning (ML) model size and complexity has driven unprecedented demand for high-performance acceleration systems.
Wattchmen: Watching the Wattchers -- High Fidelity, Flexible GPU Energy Modeling	Brandon Tran, Matthias Maiterth, Woong Shin, Matthew D. Sinclair, Shivaram Venkataraman	2026-03-27	Download	Modern GPU-rich HPC systems are increasingly becoming energy-constrained. Thus, understanding an application's energy consumption becomes essential.
VolTune: A Fine-Grained Runtime Voltage Control Architecture for FPGA Systems	Akram Ben Ahmed, Takahiro Hirofuchi, Takaaki Fukai	2026-03-27	Download	The rapid emergence of edge computing platforms and large-scale data centers has made power efficiency a primary design constraint, particularly for data-intensive and AI-driven workloads.
IBEX: Internal Bandwidth-Efficient Compression Architecture for Scalable CXL Memory Expansion	Younghoon Ko, Hyemin Park, Hyuk-Jae Lee, Hyokeun Lee	2026-03-27	Download	As the memory channel count is confined by physical dimensions, memory expanders appear to be a promising approach to extending memory capacity and channels by augmenting the existing I/O interface (e...
RAGnaroX: A Secure, Local-Hosted ChatOps Assistant Using Small Language Models	Benedikt Dornauer, Mircea-Cristian Racasan	2026-03-27	Download	This paper introduces RAGnaroX, a resource-efficient ChatOps assistant that operates entirely on commodity hardware. Unlike existing solutions that often rely on external providers such as Azure or Op...
Per-Bank Memory Bandwidth Regulation for Predictable and Performant Real-Time System	Connor Rudy Sullivan, Amin Mamandipoor, Cole Ridge Strickler, Heechul Yun	2026-03-27	Download	Modern multicore system-on-chips (SoCs) share off-chip DRAM across cores, where bank-level interference can significantly degrade performance and threaten real-time guarantees.
Data Gravity and the Energy Limits of Computation	Wonsuk Lee, Jehoshua Bruck	2026-03-27	Download	Unlike the von Neumann architecture, which separates computation from memory, the brain tightly integrates them, an organization that large language models increasingly resemble.
VeRA+: Vector-Based Lightweight Digital Compensation for Drift-Resilient RRAM In-Memory Computing	Weirong Dong, Kai Zhou, Zhen Kong, Zhengke Yang, Quan Cheng, Haoyuan Li, Junkai Huang, Jun Lan, Yida Li, Masanori Hashimoto, Longyang Lin	2026-03-27	Download	RRAM-based in-memory computing (IMC) offers high energy efficiency but suffers from conductance drift that severely degrades long-term accuracy.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
HFIPay: Privacy-Preserving, Cross-Chain Cryptocurrency Payments to Human-Friendly Identifiers	Jian Sheng Wang	2026-03-27	Download	Sending cryptocurrency to an email address or phone number should be as simple as a bank transfer, yet naive schemes that map identifiers directly to blockchain addresses expose the recipient's balanc...
Fast Topology-Aware Lossy Data Compression with Full Preservation of Critical Points and Local Order	Alex Fallin, Nathaniel Gorski, Tripti Agarwal, Bei Wang, Ganesh Gopalakrishnan, Martin Burtscher	2026-03-27	Download	Many scientific codes and instruments generate large amounts of floating-point data at high rates that must be compressed before they can be stored.
Efficiently Reproducing Distributed Workflows in Notebook-based Systems	Talha Azaz, Raza Ahmad, Md Saiful Islam, Douglas Thain, Tanu Malik	2026-03-27	Download	Notebooks provide an author-friendly environment for iterative development, modular execution, and easy sharing. Distributed workflows are increasingly being authored and executed in notebooks, yet sh...
Hardware-Agnostic and Insightful Efficiency Metrics for Accelerated Systems: Definition and Implementation within TALP	Ghazal Rahimi, Victor Lopez, Marc Clascà, Joan Vinyals Ylla Català, Jesus Labarta, Marta Garcia-Gasulla	2026-03-27	Download	The increasing adoption of heterogeneous platforms that combine CPUs with accelerators such as GPUs in high-performance computing (HPC) introduces new challenges for performance analysis and optimizat...
Rocks, Pebbles and Sand: Modality-aware Scheduling for Multimodal Large Language Model Inference	Konstantinos Papaioannou, Thaleia Dimitra Doudali	2026-03-27	Download	Multimodal Large Language Models (MLLMs) power platforms like ChatGPT, Gemini, and Copilot, enabling richer interactions with text, images, and videos.
UNIFERENCE: A Discrete Event Simulation Framework for Developing Distributed AI Models	Doğaç Eldenk, Stephen Xia	2026-03-27	Download	Developing and evaluating distributed inference algorithms remains difficult due to the lack of standardized tools for modeling heterogeneous devices and networks.
A Lightweight High-Throughput Collective-Capable NoC for Large-Scale ML Accelerators	Luca Colagrande, Lorenzo Leone, Chen Wu, Tim Fischer, Raphael Roth, Luca Benini	2026-03-27	Download	The exponential increase in Machine Learning (ML) model size and complexity has driven unprecedented demand for high-performance acceleration systems.
Wattchmen: Watching the Wattchers -- High Fidelity, Flexible GPU Energy Modeling	Brandon Tran, Matthias Maiterth, Woong Shin, Matthew D. Sinclair, Shivaram Venkataraman	2026-03-27	Download	Modern GPU-rich HPC systems are increasingly becoming energy-constrained. Thus, understanding an application's energy consumption becomes essential.
ParaQAOA: Efficient Parallel Divide-and-Conquer QAOA for Large-Scale Max-Cut Problems Beyond 10,000 Vertices	Po-Hsuan Huang, Xie-Ru Li, Chi Chuang, Chia-Heng Tu, Shih-Hao Hung	2026-03-27	Download	Quantum Approximate Optimization Algorithm (QAOA) has emerged as a promising solution for combinatorial optimization problems using a hybrid quantum-classical framework.
Distributed Quantum Discrete Logarithm Algorithm	Renjie Xu, Daowen Qiu, Ligang Xiao, Le Luo, Xu Zhou	2026-03-27	Download	Solving the discrete logarithm problem (DLP) with quantum computers is a fundamental task with important implications. Beyond Shor's algorithm, many researchers have proposed alternative solutions in ...
DarwinNet: An Evolutionary Network Architecture for Agent-Driven Protocol Synthesis	Jinliang Xu, Bingqi Li	2026-03-27	Download	Traditional network architectures suffer from severe protocol ossification and structural fragility due to their reliance on static, human-defined rules that fail to adapt to the emergent edge cases a...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
ML-Enabled Open RAN: A Comprehensive Survey of Architectures, Challenges, and Opportunities	Mira Chandra Kirana, Patatchona Keyela, Fatemeh Rostamian, Deemah H. Tashman, Soumaya Cherkaoui	2026-03-27	Download	As wireless communication systems become more advanced, Open Radio Access Networks (O-RAN) stand out as a notable framework that promotes interoperability and cost-effectiveness.
Trustworthy AI-Driven Dynamic Hybrid RIS: Joint Optimization and Reward Poisoning-Resilient Control in Cognitive MISO Networks	Deemah H. Tashman, Soumaya Cherkaoui	2026-03-27	Download	Cognitive radio networks (CRNs) are a key mechanism for alleviating spectrum scarcity by enabling secondary users (SUs) to opportunistically access licensed frequency bands without harmful interferenc...
Innovation Discovery System for Networking Research	Mengrui Zhang, Bang Huang, Yunxin Xu, Haiying Huang, Luxi Zhao, Mochun Long, Qingyu Song, Qiao Xiang, Xue Liu, Jiwu Shu	2026-03-27	Download	As networking systems become increasingly complex, achieving disruptive innovation grows more challenging. At the same time, recent progress in Large Language Models (LLMs) has shown strong potential ...
DarwinNet: An Evolutionary Network Architecture for Agent-Driven Protocol Synthesis	Jinliang Xu, Bingqi Li	2026-03-27	Download	Traditional network architectures suffer from severe protocol ossification and structural fragility due to their reliance on static, human-defined rules that fail to adapt to the emergent edge cases a...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Hardware-Agnostic and Insightful Efficiency Metrics for Accelerated Systems: Definition and Implementation within TALP	Ghazal Rahimi, Victor Lopez, Marc Clascà, Joan Vinyals Ylla Català, Jesus Labarta, Marta Garcia-Gasulla	2026-03-27	Download	The increasing adoption of heterogeneous platforms that combine CPUs with accelerators such as GPUs in high-performance computing (HPC) introduces new challenges for performance analysis and optimizat...
ParaQAOA: Efficient Parallel Divide-and-Conquer QAOA for Large-Scale Max-Cut Problems Beyond 10,000 Vertices	Po-Hsuan Huang, Xie-Ru Li, Chi Chuang, Chia-Heng Tu, Shih-Hao Hung	2026-03-27	Download	Quantum Approximate Optimization Algorithm (QAOA) has emerged as a promising solution for combinatorial optimization problems using a hybrid quantum-classical framework.
Optimization Trade-offs in Asynchronous Federated Learning: A Stochastic Networks Approach	Abdelkrim Alahyane, Céline Comte, Matthieu Jonckheere	2026-03-27	Download	Synchronous federated learning scales poorly due to the straggler effect. Asynchronous algorithms increase the update throughput by processing updates upon arrival, but they introduce two fundamental ...
Throughput Optimization as a Strategic Lever in Large-Scale AI Systems: Evidence from Dataloader and Memory Profiling Innovations	Mayank Jha	2026-03-27	Download	The development of large-scale foundation models, particularly Large Language Models (LLMs), is constrained by significant computational and memory bottlenecks.