2025-11-21

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Optimized Memory Tagging on AmpereOne Processors	Shiv Kaushik, Mahesh Madhav, Nagi Aboulenein, Jason Bessette, Sandeep Brahmadathan, Ben Chaffin, Matthew Erler, Stephan Jourdan, Thomas Maciukenas, Ramya Masti, Jon Perry, Massimo Sutera, Scott Tetrick, Bret Toll, David Turley, Carl Worth, Atiq Bajwa	2025-11-21	Download	Memory-safety escapes continue to form the launching pad for a wide range of security attacks, especially for the substantial base of deployed software that is coded in pointer-based languages such as...
Pre-cache: A Microarchitectural Solution to prevent Meltdown and Spectre	Subhash Sethumurugan, Hari Cherupalli, Kangjie Lu, John Sartori	2025-11-21	Download	Recent work has shown that out-of-order and speculative execution mechanisms used to increase performance in the majority of processors expose the processors to critical attacks.
End-to-End Transformer Acceleration Through Processing-in-Memory Architectures	Xiaoxuan Yang, Peilin Chen, Tergel Molom-Ochir, Yiran Chen	2025-11-21	Download	Transformers have become central to natural language processing and large language models, but their deployment at scale faces three major challenges.
MemIntelli: A Generic End-to-End Simulation Framework for Memristive Intelligent Computing	Houji Zhou, Ling Yang, Zhiwei Zhou, Yi Li, Xiangshui Miao	2025-11-21	Download	Memristive in-memory computing (IMC) has emerged as a promising solution for addressing the bottleneck in the Von Neumann architecture. However, the coupling between the circuit and algorithm in IMC makes ...
DISCA: A Digital In-memory Stochastic Computing Architecture Using A Compressed Bent-Pyramid Format	Shady Agwa, Yikang Shen, Shiwei Wang, Themis Prodromakis	2025-11-21	Download	Nowadays, we are witnessing an Artificial Intelligence revolution that dominates the technology landscape in various application domains, such as healthcare, robotics, automotive, security, and defens...
NX-CGRA: A Programmable Hardware Accelerator for Core Transformer Algorithms on Edge Devices	Rohit Prasad	2025-11-21	Download	The increasing diversity and complexity of transformer workloads at the edge present significant challenges in balancing performance, energy efficiency, and architectural flexibility.
Layer-wise Weight Selection for Power-Efficient Neural Network Acceleration	Jiaxun Fang, Grace Li Zhang, Shaoyi Huang	2025-11-21	Download	Systolic array accelerators execute CNNs with energy dominated by the switching activity of multiply accumulate (MAC) units. Although prior work exploits weight dependent MAC power for compression, ex...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Temperature in SLMs: Impact on Incident Categorization in On-Premises Environments	Marcio Pohlmann, Alex Severo, Gefté Almeida, Diego Kreutz, Tiago Heinrich, Lourenço Pereira	2025-11-21	Download	SOCs and CSIRTs face increasing pressure to automate incident categorization, yet the use of cloud-based LLMs introduces costs, latency, and confidentiality risks.
Urban Buildings Energy Consumption Estimation Using HPC: A Case Study of Bologna	Aldo Canfora, Eleonora Bergamaschi, Riccardo Mioli, Federico Battini, Mirko Degli Esposti, Giorgio Pedrazzi, Chiara Dellacasa	2025-11-21	Download	Urban Building Energy Modeling (UBEM) plays a central role in understanding and forecasting energy consumption at the city scale. In this work, we present a UBEM pipeline that integrates EnergyPlus si...
Formal Specification for Fast ACS: Low-Latency File-Based Ordered Message Delivery at Scale	Sushant Kumar Gupta, Anil Raghunath Iyer, Chang Yu, Neel Bagora, Olivier Pomerleau, Vivek Kumar, Prunthaban Kanthakumar	2025-11-21	Download	Low-latency message delivery is crucial for real-time systems. Data originating from a producer must be delivered to consumers, potentially distributed in clusters across metropolitan and continental ...
Systemic approach for modeling a generic smart grid	Sofiane Ben Amor, Guillaume Guerard, Loup-Noé Levy	2025-11-21	Download	Smart grid technological advances present a recent class of complex interdisciplinary modeling and increasingly difficult simulation problems to solve using traditional computational methods.
Training Foundation Models on a Full-Stack AMD Platform: Compute, Networking, and System Design	Quentin Anthony, Yury Tokpanov, Skyler Szot, Srivatsan Rajagopal, Praneeth Medepalli, Anna Golubeva, Vasu Shyam, Robert Washbourne, Rishi Iyer, Ansh Chaurasia, Tomas Figliolia, Xiao Yang, Abhinav Sarje, Drew Thorstensen, Amartey Pearson, Zack Grossbart, Jason van Patten, Emad Barsoum, Zhenyu Gu, Yao Fu, Beren Millidge	2025-11-21	Download	We report on the first large-scale mixture-of-experts (MoE) pretraining study on pure AMD hardware, utilizing both MI300X GPUs and Pollara networking.
Modeling Anomaly Detection in Cloud Services: Analysis of the Properties that Impact Latency and Resource Consumption	Gabriel Job Antunes Grabher, Fumio Machida, Thomas Ropars	2025-11-21	Download	Detecting and resolving performance anomalies in Cloud services is crucial for maintaining desired performance objectives. Scaling actions triggered by an anomaly detector help achieve target latency ...
SparOA: Sparse and Operator-aware Hybrid Scheduling for Edge DNN Inference	Ziyang Zhang, Jie Liu, Luca Mottola	2025-11-21	Download	The resource demands of deep neural network (DNN) models introduce significant performance challenges, especially when deployed on resource-constrained edge devices.
Optimizing PyTorch Inference with LLM-Based Multi-Agent Systems	Kirill Nagaitsev, Luka Grbcic, Samuel Williams, Costin Iancu	2025-11-21	Download	Maximizing performance on available GPU hardware is an ongoing challenge for modern AI inference systems. Traditional approaches include writing custom GPU kernels and using specialized model compiler...
Fine-grained MoE Load Balancing with Linear Programming	Chenqi Zhao, Wenfei Wu, Linhai Song, Yuchen Xu, Yitao Yuan	2025-11-21	Download	Mixture-of-Experts (MoE) has emerged as a promising approach to scale up deep learning models due to its significant reduction in computational resources.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Group Equivariant Convolutional Networks for Pathloss Estimation	Ziyue Yang, Feng Liu, Yifei Jin, Konstantinos Vandikas	2025-11-21	Download	This paper presents RadioGUNet, a UNet-based deep learning framework for pathloss estimation in wireless communication. Unlike other frameworks, it leverages group equivariant convolutional networks, ...
QoS-Aware Dynamic CU Selection in O-RAN with Graph-Based Reinforcement Learning	Sebastian Racedo, Brigitte Jaumard, Oscar Delgado, Meysam Masoudi	2025-11-21	Download	Open Radio Access Network (O-RAN) disaggregates conventional RAN into interoperable components, enabling flexible resource allocation, energy savings, and agile architectural design.
How Feasible are Passive Network Attacks on 5G Networks and Beyond? A Survey	Atmane Ayoub Mansour Bahar, Andrés Alayón Glazunov, Romaric Duvignau	2025-11-21	Download	Privacy concerns around 5G, the latest generation of mobile networks, are growing, with fears that its deployment may increase exposure to privacy risks.
Asymptotic critical transmission radii in random geometry graphs over three-dimensional regions	Jie Ding, Shuai Ma, Xiang Wei, Xiaohua Xu, Xinshan Zhu	2025-11-21	Download	This article presents the precise asymptotical distribution of two types of critical transmission radii, defined in terms of k-connectivity and the minimum vertex degree, for random geometry graphs di...
One Walk is All You Need: Data-Efficient 3D RF Scene Reconstruction with Human Movements	Yiheng Bian, Zechen Li, Lanqing Yang, Hao Pan, Yezhou Wang, Longyuan Ge, Jeffery Wu, Ruiheng Liu, Yongjian Fu, Yichao Chen, Guangtao Xue	2025-11-21	Download	Reconstructing 3D Radiance Field (RF) scenes through opaque obstacles is a long-standing goal, yet it is fundamentally constrained by a laborious data acquisition process requiring thousands of static...
Adaptive Receiver-Side Scheduling for Smooth Interactive Delivery	Michael Luby	2025-11-21	Download	Interactive applications such as cloud gaming, XR streaming, and real-time inference depend on data objects arriving at a steady cadence. In practice, network delay variation and recovery dynamics at ...
An Introductory Study on the Power Consumption Overhead of ERC-4337 Bundlers	Andrei Arusoaie, Claudiu-Nicu Bărbieru, Oana-Otilia Captarencu, Paul-Flavian Diac, Emanuel Onica, Cosmin-Nicolae Vârlan	2025-11-21	Download	Ethereum is currently the main blockchain ecosystem providing decentralised trust guarantees for applications ranging from finance to e-government.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Temperature in SLMs: Impact on Incident Categorization in On-Premises Environments	Marcio Pohlmann, Alex Severo, Gefté Almeida, Diego Kreutz, Tiago Heinrich, Lourenço Pereira	2025-11-21	Download	SOCs and CSIRTs face increasing pressure to automate incident categorization, yet the use of cloud-based LLMs introduces costs, latency, and confidentiality risks.
DISCA: A Digital In-memory Stochastic Computing Architecture Using A Compressed Bent-Pyramid Format	Shady Agwa, Yikang Shen, Shiwei Wang, Themis Prodromakis	2025-11-21	Download	Nowadays, we are witnessing an Artificial Intelligence revolution that dominates the technology landscape in various application domains, such as healthcare, robotics, automotive, security, and defens...
An Introductory Study on the Power Consumption Overhead of ERC-4337 Bundlers	Andrei Arusoaie, Claudiu-Nicu Bărbieru, Oana-Otilia Captarencu, Paul-Flavian Diac, Emanuel Onica, Cosmin-Nicolae Vârlan	2025-11-21	Download	Ethereum is currently the main blockchain ecosystem providing decentralised trust guarantees for applications ranging from finance to e-government.