2025-09-26

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Enhanced Hybrid Temporal Computing Using Deterministic Summations for Ultra-Low-Power Accelerators	Sachin Sachdeva, Jincong Lu, Wantong Li, Sheldon X.-D. Tan	2025-09-26	Download	This paper presents an accuracy-enhanced Hybrid Temporal Computing (E-HTC) framework for ultra-low-power hardware accelerators with deterministic additions.
CryptoSRAM: Enabling High-Throughput Cryptography on MCUs via In-SRAM Computing	Jingyao Zhang, Elaheh Sadredini	2025-09-26	Download	Secure communication is a critical requirement for Internet of Things (IoT) devices, which are often based on Microcontroller Units (MCUs). Current cryptographic solutions, which rely on software libr...
No One-Size-Fits-All: A Workload-Driven Characterization of Bit-Parallel vs. Bit-Serial Data Layouts for Processing-using-Memory	Jingyao Zhang, Elaheh Sadredini	2025-09-26	Download	Processing-in-Memory (PIM) is a promising approach to overcoming the memory-wall bottleneck. However, the PIM community has largely treated its two fundamental data layouts, Bit-Parallel (BP) and Bit-...
Latency Based Tiling	Jack Cashman	2025-09-26	Download	Latency Based Tiling provides a systems-based approach to deriving an approximate tiling solution that maximizes locality while maintaining a fast compile time.
AxLLM: accelerator architecture for large language models with computation reuse capability	Soroush Ahadi, Mehdi Modarressi, Masoud Daneshtalab	2025-09-26	Download	Large language models demand massive computational power and memory resources, posing significant challenges for efficient deployment. While quantization has been widely explored to reduce model size ...
NeuroScalar: A Deep Learning Framework for Fast, Accurate, and In-the-Wild Cycle-Level Performance Prediction	Shayne Wadle, Yanxin Zhang, Vikas Singh, Karthikeyan Sankaralingam	2025-09-26	Download	The evaluation of new microprocessor designs is constrained by slow, cycle-accurate simulators that rely on unrepresentative benchmark traces.
SAHM: State-Aware Heterogeneous Multicore for Single-Thread Performance	Shayne Wadle, Karthikeyan Sankaralingam	2025-09-26	Download	Improving single-thread performance remains a critical challenge in modern processor design, as conventional approaches such as deeper speculation, wider pipelines, and complex out-of-order execution ...
Privacy-Preserving Performance Profiling of In-The-Wild GPUs	Ian McDougall, Michael Davies, Rahul Chatterjee, Somesh Jha, Karthikeyan Sankaralingam	2025-09-26	Download	GPUs are the dominant platform for many important applications today, including deep learning, accelerated computing, and scientific simulation.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
OptimES: Optimizing Federated Learning Using Remote Embeddings for Graph Neural Networks	Pranjal Naman, Yogesh Simmhan	2025-09-26	Download	Graph Neural Networks (GNNs) have experienced rapid advancements in recent years due to their ability to learn meaningful representations from graph data structures.
Ringleader ASGD: The First Asynchronous SGD with Optimal Time Complexity under Data Heterogeneity	Artavazd Maranjyan, Peter Richtárik	2025-09-26	Download	Asynchronous stochastic gradient methods are central to scalable distributed optimization, particularly when devices differ in computational capabilities.
Efficient Fine-Grained GPU Performance Modeling for Distributed Deep Learning of LLM	Biyao Zhang, Mingkai Zheng, Debargha Ganguly, Xuecen Zhang, Vikash Singh, Vipin Chaudhary, Zhao Zhang	2025-09-26	Download	Training Large Language Models (LLMs) is one of the most compute-intensive tasks in high-performance computing. Predicting end-to-end training time for multi-billion parameter models distributed across...
Agora: Bridging the GPU Cloud Resource-Price Disconnect	Ian McDougall, Noah Scott, Joon Huh, Kirthevasan Kandasamy, Karthikeyan Sankaralingam	2025-09-26	Download	The historic trend of Moore's Law, which predicted exponential growth in computational performance per dollar, has diverged for modern Graphics Processing Units (GPUs).
Role-Aware Multi-modal federated learning system for detecting phishing webpages	Bo Wang, Imran Khan, Martin White, Natalia Beloff	2025-09-26	Download	We present a federated, multi-modal phishing website detector that supports URL, HTML, and IMAGE inputs without binding clients to a fixed modality at inference: any client can invoke any modality hea...
Orientation does not help with 3-coloring a grid in online-LOCAL	Thomas Boudier, Filippo Casagrande, Avinandan Das, Massimo Equi, Henrik Lievonen, Augusto Modanese, Ronja Stimpert	2025-09-26	Download	The online-LOCAL and SLOCAL models are extensions of the LOCAL model where nodes are processed in a sequential but potentially adversarial order.
The AI_INFN Platform: Artificial Intelligence Development in the Cloud	Lucio Anderlini, Giulio Bianchini, Diego Ciangottini, Stefano Dal Pra, Diego Michelotto, Rosa Petrini, Daniele Spiga	2025-09-26	Download	Machine Learning (ML) is profoundly reshaping the way researchers create, implement, and operate data-intensive software. Its adoption, however, introduces notable challenges for computing infrastruct...
Code once, Run Green: Automated Green Code Translation in Serverless Computing	Sebastian Werner, Mathis Kähler, Alireza Hakamian	2025-09-26	Download	The rapid digitization and the increasing use of emerging technologies such as AI models have significantly contributed to the emissions of computing infrastructure.
VibeCodeHPC: An Agent-Based Iterative Prompting Auto-Tuner for HPC Code Generation Using LLMs	Shun-ichiro Hayashi, Koki Morita, Daichi Mukunoki, Tetsuya Hoshino, Takahiro Katagiri	2025-09-26	Download	In this study, we propose VibeCodeHPC, a multi-agent system based on large language models (LLMs) for the automatic tuning of high-performance computing (HPC) programs on supercomputers.
Zeppelin: Balancing Variable-length Workloads in Data Parallel Large Model Training	Chang Chen, Tiancheng Chen, Jiangfei Duan, Qianchao Zhu, Zerui Wang, Qinghao Hu, Peng Sun, Xiuhong Li, Chao Yang, Torsten Hoefler	2025-09-26	Download	Training large language models (LLMs) with increasingly long and varying sequence lengths introduces severe load imbalance challenges in large-scale data-parallel training.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Bridging Language Models and Formal Methods for Intent-Driven Optical Network Design	Anis Bekri, Amar Abane, Abdella Battou, Saddek Bensalem	2025-09-26	Download	Intent-Based Networking (IBN) aims to simplify network management by enabling users to specify high-level goals that drive automated network design and configuration.
Bridging Technical Capability and User Accessibility: Off-grid Civilian Emergency Communication	Karim Khamaisi, Oliver Kamer, Bruno Rodrigues, Jan von der Assen, Burkhard Stiller	2025-09-26	Download	During large-scale crises disrupting cellular and Internet infrastructure, civilians lack reliable methods for communication, aid coordination, and access to trustworthy information.
Extreme Value Theory-enhanced Radio Maps for Handovers in Ultra-reliable Communications	Dian Echevarría Pérez, Onel L. Alcaraz López, Hirley Alves	2025-09-26	Download	Efficient handover (HO) strategies are essential for maintaining the stringent performance requirements of ultra-reliable communication (URC) systems.
Red Teaming Quantum-Resistant Cryptographic Standards: A Penetration Testing Framework Integrating AI and Quantum Security	Petar Radanliev	2025-09-26	Download	This study presents a structured approach to evaluating vulnerabilities within quantum cryptographic protocols, focusing on the BB84 quantum key distribution method and National Institute of Standards...
LLM Agent Communication Protocol (LACP) Requires Urgent Standardization: A Telecom-Inspired Protocol is Necessary	Xin Li, Mengbing Liu, Chau Yuen	2025-09-26	Download	This position paper argues that the field of LLM agents requires a unified, telecom-inspired communication protocol to ensure safety, interoperability, and scalability, especially within the context o...
Evaluating Open-Source Large Language Models for Technical Telecom Question Answering	Arina Caraus, Alessio Buscemi, Sumit Kumar, Ion Turcanu	2025-09-26	Download	Large Language Models (LLMs) have shown remarkable capabilities across various fields. However, their performance in technical domains such as telecommunications remains underexplored.
Leveraging Wireless Sensor Networks for Real-Time Monitoring and Control of Industrial Environments	Muhammad Junaid Asif, Abdul Rehman, Asim Mehmood, Rana Fayyaz Ahmad, Shazia Saqib	2025-09-26	Download	This research proposes an extensive technique for monitoring and controlling industrial parameters using Internet of Things (IoT) technology based on wireless communication.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Secure and Efficient Access Control for Computer-Use Agents via Context Space	Haochen Gong, Chenxiao Li, Rui Chang, Wenbo Shen	2025-09-26	Download	Large language model (LLM)-based computer-use agents represent a convergence of AI and OS capabilities, enabling natural language to control system- and application-level functions.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Tiny-QMoE	Jack Cashman, Jiaqi Nie	2025-09-26	Download	The QMoE model provides a practical approach for compression of massive Mixture-of-Experts (MoE) models. QMoE offers a solution geared towards memory limitations that often reach terabyte scales, and ...
Latency Based Tiling	Jack Cashman	2025-09-26	Download	Latency Based Tiling provides a systems-based approach to deriving an approximate tiling solution that maximizes locality while maintaining a fast compile time.
SAHM: State-Aware Heterogeneous Multicore for Single-Thread Performance	Shayne Wadle, Karthikeyan Sankaralingam	2025-09-26	Download	Improving single-thread performance remains a critical challenge in modern processor design, as conventional approaches such as deeper speculation, wider pipelines, and complex out-of-order execution ...
Less is More: Faster Maximum Clique Search by Work-Avoidance	Hans Vandierendonck	2025-09-26	Download	The maximum clique (MC) problem is a challenging graph mining problem which, due to its NP-hard nature, can take a substantial amount of execution time.
Light Differentiable Logic Gate Networks	Lukas Rüttgers, Till Aczel, Andreas Plesner, Roger Wattenhofer	2025-09-26	Download	Differentiable logic gate networks (DLGNs) exhibit extraordinary efficiency at inference while sustaining competitive accuracy. But vanishing gradients, discretization errors, and high training cost i...