
2025-11-21

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Optimized Memory Tagging on AmpereOne Processors | Shiv Kaushik, Mahesh Madhav, Nagi Aboulenein, Jason Bessette, Sandeep Brahmadathan, Ben Chaffin, Matthew Erler, Stephan Jourdan, Thomas Maciukenas, Ramya Masti, Jon Perry, Massimo Sutera, Scott Tetrick, Bret Toll, David Turley, Carl Worth, Atiq Bajwa | 2025-11-21 | Download | Memory-safety escapes continue to form the launching pad for a wide range of security attacks, especially for the substantial base of deployed software that is coded in pointer-based languages such as... |
| Pre-cache: A Microarchitectural Solution to prevent Meltdown and Spectre | Subhash Sethumurugan, Hari Cherupalli, Kangjie Lu, John Sartori | 2025-11-21 | Download | Recent work has shown that out-of-order and speculative execution mechanisms used to increase performance in the majority of processors expose the processors to critical attacks. |
| End-to-End Transformer Acceleration Through Processing-in-Memory Architectures | Xiaoxuan Yang, Peilin Chen, Tergel Molom-Ochir, Yiran Chen | 2025-11-21 | Download | Transformers have become central to natural language processing and large language models, but their deployment at scale faces three major challenges. |
| MemIntelli: A Generic End-to-End Simulation Framework for Memristive Intelligent Computing | Houji Zhou, Ling Yang, Zhiwei Zhou, Yi Li, Xiangshui Miao | 2025-11-21 | Download | Memristive in-memory computing (IMC) has emerged as a promising solution for addressing the bottleneck in the Von Neumann architecture. However, the coupling between the circuit and algorithm in IMC makes ... |
| DISCA: A Digital In-memory Stochastic Computing Architecture Using A Compressed Bent-Pyramid Format | Shady Agwa, Yikang Shen, Shiwei Wang, Themis Prodromakis | 2025-11-21 | Download | Nowadays, we are witnessing an Artificial Intelligence revolution that dominates the technology landscape in various application domains, such as healthcare, robotics, automotive, security, and defens... |
| NX-CGRA: A Programmable Hardware Accelerator for Core Transformer Algorithms on Edge Devices | Rohit Prasad | 2025-11-21 | Download | The increasing diversity and complexity of transformer workloads at the edge present significant challenges in balancing performance, energy efficiency, and architectural flexibility. |
| Layer-wise Weight Selection for Power-Efficient Neural Network Acceleration | Jiaxun Fang, Grace Li Zhang, Shaoyi Huang | 2025-11-21 | Download | Systolic array accelerators execute CNNs with energy dominated by the switching activity of multiply accumulate (MAC) units. Although prior work exploits weight dependent MAC power for compression, ex... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Temperature in SLMs: Impact on Incident Categorization in On-Premises Environments | Marcio Pohlmann, Alex Severo, Gefté Almeida, Diego Kreutz, Tiago Heinrich, Lourenço Pereira | 2025-11-21 | Download | SOCs and CSIRTs face increasing pressure to automate incident categorization, yet the use of cloud-based LLMs introduces costs, latency, and confidentiality risks. |
| Urban Buildings Energy Consumption Estimation Using HPC: A Case Study of Bologna | Aldo Canfora, Eleonora Bergamaschi, Riccardo Mioli, Federico Battini, Mirko Degli Esposti, Giorgio Pedrazzi, Chiara Dellacasa | 2025-11-21 | Download | Urban Building Energy Modeling (UBEM) plays a central role in understanding and forecasting energy consumption at the city scale. In this work, we present a UBEM pipeline that integrates EnergyPlus si... |
| Formal Specification for Fast ACS: Low-Latency File-Based Ordered Message Delivery at Scale | Sushant Kumar Gupta, Anil Raghunath Iyer, Chang Yu, Neel Bagora, Olivier Pomerleau, Vivek Kumar, Prunthaban Kanthakumar | 2025-11-21 | Download | Low-latency message delivery is crucial for real-time systems. Data originating from a producer must be delivered to consumers, potentially distributed in clusters across metropolitan and continental ... |
| Systemic approach for modeling a generic smart grid | Sofiane Ben Amor, Guillaume Guerard, Loup-Noé Levy | 2025-11-21 | Download | Smart grid technological advances present a recent class of complex interdisciplinary modeling and increasingly difficult simulation problems to solve using traditional computational methods. |
| Training Foundation Models on a Full-Stack AMD Platform: Compute, Networking, and System Design | Quentin Anthony, Yury Tokpanov, Skyler Szot, Srivatsan Rajagopal, Praneeth Medepalli, Anna Golubeva, Vasu Shyam, Robert Washbourne, Rishi Iyer, Ansh Chaurasia, Tomas Figliolia, Xiao Yang, Abhinav Sarje, Drew Thorstensen, Amartey Pearson, Zack Grossbart, Jason van Patten, Emad Barsoum, Zhenyu Gu, Yao Fu, Beren Millidge | 2025-11-21 | Download | We report on the first large-scale mixture-of-experts (MoE) pretraining study on pure AMD hardware, utilizing both MI300X GPUs and Pollara networking. |
| Modeling Anomaly Detection in Cloud Services: Analysis of the Properties that Impact Latency and Resource Consumption | Gabriel Job Antunes Grabher, Fumio Machida, Thomas Ropars | 2025-11-21 | Download | Detecting and resolving performance anomalies in Cloud services is crucial for maintaining desired performance objectives. Scaling actions triggered by an anomaly detector help achieve target latency ... |
| SparOA: Sparse and Operator-aware Hybrid Scheduling for Edge DNN Inference | Ziyang Zhang, Jie Liu, Luca Mottola | 2025-11-21 | Download | The resource demands of deep neural network (DNN) models introduce significant performance challenges, especially when deployed on resource-constrained edge devices. |
| Optimizing PyTorch Inference with LLM-Based Multi-Agent Systems | Kirill Nagaitsev, Luka Grbcic, Samuel Williams, Costin Iancu | 2025-11-21 | Download | Maximizing performance on available GPU hardware is an ongoing challenge for modern AI inference systems. Traditional approaches include writing custom GPU kernels and using specialized model compiler... |
| Fine-grained MoE Load Balancing with Linear Programming | Chenqi Zhao, Wenfei Wu, Linhai Song, Yuchen Xu, Yitao Yuan | 2025-11-21 | Download | Mixture-of-Experts (MoE) has emerged as a promising approach to scale up deep learning models due to its significant reduction in computational resources. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Group Equivariant Convolutional Networks for Pathloss Estimation | Ziyue Yang, Feng Liu, Yifei Jin, Konstantinos Vandikas | 2025-11-21 | Download | This paper presents RadioGUNet, a UNet-based deep learning framework for pathloss estimation in wireless communication. Unlike other frameworks, it leverages group equivariant convolutional networks, ... |
| QoS-Aware Dynamic CU Selection in O-RAN with Graph-Based Reinforcement Learning | Sebastian Racedo, Brigitte Jaumard, Oscar Delgado, Meysam Masoudi | 2025-11-21 | Download | Open Radio Access Network (O-RAN) disaggregates conventional RAN into interoperable components, enabling flexible resource allocation, energy savings, and agile architectural design. |
| How Feasible are Passive Network Attacks on 5G Networks and Beyond? A Survey | Atmane Ayoub Mansour Bahar, Andrés Alayón Glazunov, Romaric Duvignau | 2025-11-21 | Download | Privacy concerns around 5G, the latest generation of mobile networks, are growing, with fears that its deployment may increase exposure to privacy risks. |
| Asymptotic critical transmission radii in random geometry graphs over three-dimensional regions | Jie Ding, Shuai Ma, Xiang Wei, Xiaohua Xu, Xinshan Zhu | 2025-11-21 | Download | This article presents the precise asymptotical distribution of two types of critical transmission radii, defined in terms of k-connectivity and the minimum vertex degree, for random geometry graphs di... |
| One Walk is All You Need: Data-Efficient 3D RF Scene Reconstruction with Human Movements | Yiheng Bian, Zechen Li, Lanqing Yang, Hao Pan, Yezhou Wang, Longyuan Ge, Jeffery Wu, Ruiheng Liu, Yongjian Fu, Yichao Chen, Guangtao Xue | 2025-11-21 | Download | Reconstructing 3D Radiance Field (RF) scenes through opaque obstacles is a long-standing goal, yet it is fundamentally constrained by a laborious data acquisition process requiring thousands of static... |
| Adaptive Receiver-Side Scheduling for Smooth Interactive Delivery | Michael Luby | 2025-11-21 | Download | Interactive applications such as cloud gaming, XR streaming, and real-time inference depend on data objects arriving at a steady cadence. In practice, network delay variation and recovery dynamics at ... |
| An Introductory Study on the Power Consumption Overhead of ERC-4337 Bundlers | Andrei Arusoaie, Claudiu-Nicu Bărbieru, Oana-Otilia Captarencu, Paul-Flavian Diac, Emanuel Onica, Cosmin-Nicolae Vârlan | 2025-11-21 | Download | Ethereum is currently the main blockchain ecosystem providing decentralised trust guarantees for applications ranging from finance to e-government. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Temperature in SLMs: Impact on Incident Categorization in On-Premises Environments | Marcio Pohlmann, Alex Severo, Gefté Almeida, Diego Kreutz, Tiago Heinrich, Lourenço Pereira | 2025-11-21 | Download | SOCs and CSIRTs face increasing pressure to automate incident categorization, yet the use of cloud-based LLMs introduces costs, latency, and confidentiality risks. |
| DISCA: A Digital In-memory Stochastic Computing Architecture Using A Compressed Bent-Pyramid Format | Shady Agwa, Yikang Shen, Shiwei Wang, Themis Prodromakis | 2025-11-21 | Download | Nowadays, we are witnessing an Artificial Intelligence revolution that dominates the technology landscape in various application domains, such as healthcare, robotics, automotive, security, and defens... |
| An Introductory Study on the Power Consumption Overhead of ERC-4337 Bundlers | Andrei Arusoaie, Claudiu-Nicu Bărbieru, Oana-Otilia Captarencu, Paul-Flavian Diac, Emanuel Onica, Cosmin-Nicolae Vârlan | 2025-11-21 | Download | Ethereum is currently the main blockchain ecosystem providing decentralised trust guarantees for applications ranging from finance to e-government. |
