2025-10-02

cs.AR - Hardware Architecture

Title	Authors	Published	PDF	Abstract
In-memory Training on Analog Devices with Limited Conductance States via Multi-tile Residual Learning	Jindan Li, Zhaoxian Wu, Gaowen Liu, Tayfun Gokmen, Tianyi Chen	2025-10-02	Download	Analog in-memory computing (AIMC) accelerators enable efficient deep neural network computation directly within memory using resistive crossbar arrays, where model parameters are represented by the co...
Rigorous Evaluation of Microarchitectural Side-Channels with Statistical Model Checking	Weihang Li, Pete Crowley, Arya Tschand, Yu Wang, Miroslav Pajic, Daniel Sorin	2025-10-02	Download	Rigorous quantitative evaluation of microarchitectural side channels is challenging for two reasons. First, the processors, attacks, and defenses often exhibit probabilistic behaviors.
Multiplier-free In-Memory Vector-Matrix Multiplication Using Distributed Arithmetic	Felix Zeller, John Reuben, Dietmar Fey	2025-10-02	Download	Vector-Matrix Multiplication (VMM) is the fundamental and frequently required computation in inference of Neural Networks (NN). Due to the large data movement required during inference, VMM can benefi...
Edge GPU Aware Multiple AI Model Pipeline for Accelerated MRI Reconstruction and Analysis	Ashiyana Abdul Majeed, Mahmoud Meribout, Safa Mohammed Sali	2025-10-02	Download	Advancements in AI have greatly enhanced the medical imaging process, making it quicker to diagnose patients. However, very few have investigated the optimization of a multi-model system with hardware...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
ElasticMoE: An Efficient Auto Scaling Method for Mixture-of-Experts Models	Gursimran Singh, Timothy Yu, Haley Li, Cheng Chen, Hanieh Sadri, Qintao Zhang, Yu Zhang, Ying Xiong, Yong Zhang, Zhenan Fan	2025-10-02	Download	Mixture-of-Experts (MoE) models promise efficient scaling of large language models (LLMs) by activating only a small subset of experts per token, but their parallelized inference pipelines make elasti...
Accuracy vs Performance: An abstraction model for deadline constrained offloading at the mobile-edge	Jamie Cotter, Ignacio Castineiras, Victor Cionca	2025-10-02	Download	In this paper, we present a solution for low-latency deadline-constrained DNN offloading on mobile edge devices. We design a scheduling algorithm with lightweight network state representation, conside...
Percepta: High Performance Stream Processing at the Edge	Clarisse Sousa, Tiago Fonseca, Luis Lino Ferreira, Ricardo Venâncio, Ricardo Severino	2025-10-02	Download	The rise of real-time data and the proliferation of Internet of Things (IoT) devices have highlighted the limitations of cloud-centric solutions, particularly regarding latency, bandwidth, and privacy...
Exponential Quantum Advantage for Message Complexity in Distributed Algorithms	François Le Gall, Maël Luce, Joseph Marchand, Mathieu Roget	2025-10-02	Download	We investigate how much quantum distributed algorithms can outperform classical distributed algorithms with respect to the message complexity (the overall amount of communication used by the algorithm...
Semantic-Aware Scheduling for GPU Clusters with Large Language Models	Zerui Wang, Qinghao Hu, Ana Klimovic, Tianwei Zhang, Yonggang Wen, Peng Sun, Dahua Lin	2025-10-02	Download	Deep learning (DL) schedulers are pivotal in optimizing resource allocation in GPU clusters, but operate with a critical limitation: they are largely blind to the semantic context of the jobs they man...
TetriServe: Efficient DiT Serving for Heterogeneous Image Generation	Runyu Lu, Shiqi He, Wenxuan Tan, Shenggui Li, Ruofan Wu, Jeff J. Ma, Ang Chen, Mosharaf Chowdhury	2025-10-02	Download	Diffusion Transformer (DiT) models excel at generating high-quality images through iterative denoising steps, but serving them under strict Service Level Objectives (SLOs) is challenging due to their ...
Efficient Tree-Structured Deep Research with Adaptive Resource Allocation	Lunyiu Nie, Nedim Lipka, Ryan A. Rossi, Swarat Chaudhuri	2025-10-02	Download	Deep research agents, which synthesize information across diverse sources, are significantly constrained by the sequential nature of reasoning.
QScale: Probabilistic Chained Consensus for Moderate-Scale Systems	Hasan Heydari, Alysson Bessani, Kartik Nayak	2025-10-02	Download	Existing distributed ledger protocols either incur a high communication complexity and are thus suited to systems with a small number of processes (e.g.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
AURA: Adaptive Unified Reasoning and Automation with LLM-Guided MARL for NextG Cellular Networks	Narjes Nourzad, Mingyu Zong, Bhaskar Krishnamachari	2025-10-02	Download	Next-generation (NextG) cellular networks are expected to manage dynamic traffic while sustaining high performance. Large language models (LLMs) provide strategic reasoning for 6G planning, but their ...
TLoRa: Implementing TLS Over LoRa for Secure HTTP Communication in IoT	Atonu Ghosh, Akhilesh Mohanasundaram, Srishivanth R F, Sudip Misra	2025-10-02	Download	We present TLoRa, an end-to-end architecture for HTTPS communication over LoRa by integrating TCP tunneling and a complete TLS 1.3 handshake. It enables a seamless and secure communication channel bet...
Interplay between Security, Privacy and Trust in 6G-enabled Intelligent Transportation Systems	Ahmed Danladi Abdullahi, Erfan Bahrami, Tooska Dargahi, Mohammed Al-Khalidi, Mohammad Hammoudeh	2025-10-02	Download	The advancement of 6G technology has the potential to revolutionize the transportation sector and significantly improve how we travel. 6G-enabled Intelligent Transportation Systems (ITS) promise to of...
How to Combat Reactive and Dynamic Jamming Attacks with Reinforcement Learning	Yalin E. Sagduyu, Tugba Erpek, Kemal Davaslioglu, Sastry Kompella	2025-10-02	Download	This paper studies the problem of mitigating reactive jamming, where a jammer adopts a dynamic policy of selecting channels and sensing thresholds to detect and jam ongoing transmissions.
Causal Intervention Sequence Analysis for Fault Tracking in Radio Access Networks	Chenhua Shi, Joji Philip, Subhadip Bandyopadhyay, Jayanta Choudhury	2025-10-02	Download	To keep modern Radio Access Networks (RAN) running smoothly, operators need to spot the real-world triggers behind Service-Level Agreement (SLA) breaches well before customers feel them.
Accuracy vs Performance: An abstraction model for deadline constrained offloading at the mobile-edge	Jamie Cotter, Ignacio Castineiras, Victor Cionca	2025-10-02	Download	In this paper, we present a solution for low-latency deadline-constrained DNN offloading on mobile edge devices. We design a scheduling algorithm with lightweight network state representation, conside...
ReSSFormer: A Recursive Sparse Structured Transformer for Scalable and Long-Context Reasoning	Haochen You, Baojing Liu	2025-10-02	Download	While Transformer architectures have demonstrated impressive scalability across domains, they continue to face challenges in long-context reasoning, computational efficiency, and structural generaliza...
MMGaP: Multi-User MIMO Detection and Precoding using GPU-assisted Physics-inspired Computation	Abhishek Kumar Singh, Kyle Jamieson	2025-10-02	Download	Physics-inspired and quantum compute based methods for processing in the physical layer of next-generation cellular radio access networks have demonstrated theoretical advances in spectral efficiency ...