2026-03-27
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Efficient CMOS Invertible Logic Using Stochastic Computing | Sean C. Smithson, Naoya Onizawa, Brett H. Meyer, Warren J. Gross, Takahiro Hanyu | 2026-03-27 | Download | Invertible logic can operate in one of two modes: 1) a forward mode, in which inputs are presented and a single, correct output is produced, and 2) a reverse mode, in which the output is fixed and the... |
| Who Checks the Checker? Enhancing Component-level Architectural SEU Fault Tolerance for End-to-End SoC Protection | Michael Rogenmoser, Philippe Sauter, Chen Wu, Angelo Garofalo, Luca Benini | 2026-03-27 | Download | Single-event upset (SEU) fault tolerance for systems-on-chip (SoCs) in radiation-heavy environments is often addressed by architectural fault-tolerance approaches protecting individual SoC components ... |
| A Lightweight High-Throughput Collective-Capable NoC for Large-Scale ML Accelerators | Luca Colagrande, Lorenzo Leone, Chen Wu, Tim Fischer, Raphael Roth, Luca Benini | 2026-03-27 | Download | The exponential increase in Machine Learning (ML) model size and complexity has driven unprecedented demand for high-performance acceleration systems. |
| Wattchmen: Watching the Wattchers -- High Fidelity, Flexible GPU Energy Modeling | Brandon Tran, Matthias Maiterth, Woong Shin, Matthew D. Sinclair, Shivaram Venkataraman | 2026-03-27 | Download | Modern GPU-rich HPC systems are increasingly becoming energy-constrained. Thus, understanding an application's energy consumption becomes essential. |
| VolTune: A Fine-Grained Runtime Voltage Control Architecture for FPGA Systems | Akram Ben Ahmed, Takahiro Hirofuchi, Takaaki Fukai | 2026-03-27 | Download | The rapid emergence of edge computing platforms and large-scale data centers has made power efficiency a primary design constraint, particularly for data-intensive and AI-driven workloads. |
| IBEX: Internal Bandwidth-Efficient Compression Architecture for Scalable CXL Memory Expansion | Younghoon Ko, Hyemin Park, Hyuk-Jae Lee, Hyokeun Lee | 2026-03-27 | Download | As the memory channel count is confined by physical dimensions, memory expanders appear to be a promising approach to extending memory capacity and channels by augmenting the existing I/O interface (e... |
| RAGnaroX: A Secure, Local-Hosted ChatOps Assistant Using Small Language Models | Benedikt Dornauer, Mircea-Cristian Racasan | 2026-03-27 | Download | This paper introduces RAGnaroX, a resource-efficient ChatOps assistant that operates entirely on commodity hardware. Unlike existing solutions that often rely on external providers such as Azure or Op... |
| Per-Bank Memory Bandwidth Regulation for Predictable and Performant Real-Time System | Connor Rudy Sullivan, Amin Mamandipoor, Cole Ridge Strickler, Heechul Yun | 2026-03-27 | Download | Modern multicore system-on-chips (SoCs) share off-chip DRAM across cores, where bank-level interference can significantly degrade performance and threaten real-time guarantees. |
| Data Gravity and the Energy Limits of Computation | Wonsuk Lee, Jehoshua Bruck | 2026-03-27 | Download | Unlike the von Neumann architecture, which separates computation from memory, the brain tightly integrates them, an organization that large language models increasingly resemble. |
| VeRA+: Vector-Based Lightweight Digital Compensation for Drift-Resilient RRAM In-Memory Computing | Weirong Dong, Kai Zhou, Zhen Kong, Zhengke Yang, Quan Cheng, Haoyuan Li, Junkai Huang, Jun Lan, Yida Li, Masanori Hashimoto, Longyang Lin | 2026-03-27 | Download | RRAM-based in-memory computing (IMC) offers high energy efficiency but suffers from conductance drift that severely degrades long-term accuracy. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| HFIPay: Privacy-Preserving, Cross-Chain Cryptocurrency Payments to Human-Friendly Identifiers | Jian Sheng Wang | 2026-03-27 | Download | Sending cryptocurrency to an email address or phone number should be as simple as a bank transfer, yet naive schemes that map identifiers directly to blockchain addresses expose the recipient's balanc... |
| Fast Topology-Aware Lossy Data Compression with Full Preservation of Critical Points and Local Order | Alex Fallin, Nathaniel Gorski, Tripti Agarwal, Bei Wang, Ganesh Gopalakrishnan, Martin Burtscher | 2026-03-27 | Download | Many scientific codes and instruments generate large amounts of floating-point data at high rates that must be compressed before they can be stored. |
| Efficiently Reproducing Distributed Workflows in Notebook-based Systems | Talha Azaz, Raza Ahmad, Md Saiful Islam, Douglas Thain, Tanu Malik | 2026-03-27 | Download | Notebooks provide an author-friendly environment for iterative development, modular execution, and easy sharing. Distributed workflows are increasingly being authored and executed in notebooks, yet sh... |
| Hardware-Agnostic and Insightful Efficiency Metrics for Accelerated Systems: Definition and Implementation within TALP | Ghazal Rahimi, Victor Lopez, Marc Clascà, Joan Vinyals Ylla Català, Jesus Labarta, Marta Garcia-Gasulla | 2026-03-27 | Download | The increasing adoption of heterogeneous platforms that combine CPUs with accelerators such as GPUs in high-performance computing (HPC) introduces new challenges for performance analysis and optimizat... |
| Rocks, Pebbles and Sand: Modality-aware Scheduling for Multimodal Large Language Model Inference | Konstantinos Papaioannou, Thaleia Dimitra Doudali | 2026-03-27 | Download | Multimodal Large Language Models (MLLMs) power platforms like ChatGPT, Gemini, and Copilot, enabling richer interactions with text, images, and videos. |
| UNIFERENCE: A Discrete Event Simulation Framework for Developing Distributed AI Models | Doğaç Eldenk, Stephen Xia | 2026-03-27 | Download | Developing and evaluating distributed inference algorithms remains difficult due to the lack of standardized tools for modeling heterogeneous devices and networks. |
| A Lightweight High-Throughput Collective-Capable NoC for Large-Scale ML Accelerators | Luca Colagrande, Lorenzo Leone, Chen Wu, Tim Fischer, Raphael Roth, Luca Benini | 2026-03-27 | Download | The exponential increase in Machine Learning (ML) model size and complexity has driven unprecedented demand for high-performance acceleration systems. |
| Wattchmen: Watching the Wattchers -- High Fidelity, Flexible GPU Energy Modeling | Brandon Tran, Matthias Maiterth, Woong Shin, Matthew D. Sinclair, Shivaram Venkataraman | 2026-03-27 | Download | Modern GPU-rich HPC systems are increasingly becoming energy-constrained. Thus, understanding an application's energy consumption becomes essential. |
| ParaQAOA: Efficient Parallel Divide-and-Conquer QAOA for Large-Scale Max-Cut Problems Beyond 10,000 Vertices | Po-Hsuan Huang, Xie-Ru Li, Chi Chuang, Chia-Heng Tu, Shih-Hao Hung | 2026-03-27 | Download | Quantum Approximate Optimization Algorithm (QAOA) has emerged as a promising solution for combinatorial optimization problems using a hybrid quantum-classical framework. |
| Distributed Quantum Discrete Logarithm Algorithm | Renjie Xu, Daowen Qiu, Ligang Xiao, Le Luo, Xu Zhou | 2026-03-27 | Download | Solving the discrete logarithm problem (DLP) with quantum computers is a fundamental task with important implications. Beyond Shor's algorithm, many researchers have proposed alternative solutions in ... |
| DarwinNet: An Evolutionary Network Architecture for Agent-Driven Protocol Synthesis | Jinliang Xu, Bingqi Li | 2026-03-27 | Download | Traditional network architectures suffer from severe protocol ossification and structural fragility due to their reliance on static, human-defined rules that fail to adapt to the emergent edge cases a... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ML-Enabled Open RAN: A Comprehensive Survey of Architectures, Challenges, and Opportunities | Mira Chandra Kirana, Patatchona Keyela, Fatemeh Rostamian, Deemah H. Tashman, Soumaya Cherkaoui | 2026-03-27 | Download | As wireless communication systems become more advanced, Open Radio Access Networks (O-RAN) stand out as a notable framework that promotes interoperability and cost-effectiveness. |
| Trustworthy AI-Driven Dynamic Hybrid RIS: Joint Optimization and Reward Poisoning-Resilient Control in Cognitive MISO Networks | Deemah H. Tashman, Soumaya Cherkaoui | 2026-03-27 | Download | Cognitive radio networks (CRNs) are a key mechanism for alleviating spectrum scarcity by enabling secondary users (SUs) to opportunistically access licensed frequency bands without harmful interferenc... |
| Innovation Discovery System for Networking Research | Mengrui Zhang, Bang Huang, Yunxin Xu, Haiying Huang, Luxi Zhao, Mochun Long, Qingyu Song, Qiao Xiang, Xue Liu, Jiwu Shu | 2026-03-27 | Download | As networking systems become increasingly complex, achieving disruptive innovation grows more challenging. At the same time, recent progress in Large Language Models (LLMs) has shown strong potential ... |
| DarwinNet: An Evolutionary Network Architecture for Agent-Driven Protocol Synthesis | Jinliang Xu, Bingqi Li | 2026-03-27 | Download | Traditional network architectures suffer from severe protocol ossification and structural fragility due to their reliance on static, human-defined rules that fail to adapt to the emergent edge cases a... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Hardware-Agnostic and Insightful Efficiency Metrics for Accelerated Systems: Definition and Implementation within TALP | Ghazal Rahimi, Victor Lopez, Marc Clascà, Joan Vinyals Ylla Català, Jesus Labarta, Marta Garcia-Gasulla | 2026-03-27 | Download | The increasing adoption of heterogeneous platforms that combine CPUs with accelerators such as GPUs in high-performance computing (HPC) introduces new challenges for performance analysis and optimizat... |
| ParaQAOA: Efficient Parallel Divide-and-Conquer QAOA for Large-Scale Max-Cut Problems Beyond 10,000 Vertices | Po-Hsuan Huang, Xie-Ru Li, Chi Chuang, Chia-Heng Tu, Shih-Hao Hung | 2026-03-27 | Download | Quantum Approximate Optimization Algorithm (QAOA) has emerged as a promising solution for combinatorial optimization problems using a hybrid quantum-classical framework. |
| Optimization Trade-offs in Asynchronous Federated Learning: A Stochastic Networks Approach | Abdelkrim Alahyane, Céline Comte, Matthieu Jonckheere | 2026-03-27 | Download | Synchronous federated learning scales poorly due to the straggler effect. Asynchronous algorithms increase the update throughput by processing updates upon arrival, but they introduce two fundamental ... |
| Throughput Optimization as a Strategic Lever in Large-Scale AI Systems: Evidence from Dataloader and Memory Profiling Innovations | Mayank Jha | 2026-03-27 | Download | The development of large-scale foundation models, particularly Large Language Models (LLMs), is constrained by significant computational and memory bottlenecks. |