2025-10-02

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| In-memory Training on Analog Devices with Limited Conductance States via Multi-tile Residual Learning | Jindan Li, Zhaoxian Wu, Gaowen Liu, Tayfun Gokmen, Tianyi Chen | 2025-10-02 | Download | Analog in-memory computing (AIMC) accelerators enable efficient deep neural network computation directly within memory using resistive crossbar arrays, where model parameters are represented by the co... |
| Rigorous Evaluation of Microarchitectural Side-Channels with Statistical Model Checking | Weihang Li, Pete Crowley, Arya Tschand, Yu Wang, Miroslav Pajic, Daniel Sorin | 2025-10-02 | Download | Rigorous quantitative evaluation of microarchitectural side channels is challenging for two reasons. First, the processors, attacks, and defenses often exhibit probabilistic behaviors. |
| Multiplier-free In-Memory Vector-Matrix Multiplication Using Distributed Arithmetic | Felix Zeller, John Reuben, Dietmar Fey | 2025-10-02 | Download | Vector-Matrix Multiplication (VMM) is the fundamental and frequently required computation in inference of Neural Networks (NN). Due to the large data movement required during inference, VMM can benefi... |
| Edge GPU Aware Multiple AI Model Pipeline for Accelerated MRI Reconstruction and Analysis | Ashiyana Abdul Majeed, Mahmoud Meribout, Safa Mohammed Sali | 2025-10-02 | Download | Advancements in AI have greatly enhanced the medical imaging process, making it quicker to diagnose patients. However, very few have investigated the optimization of a multi-model system with hardware... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| ElasticMoE: An Efficient Auto Scaling Method for Mixture-of-Experts Models | Gursimran Singh, Timothy Yu, Haley Li, Cheng Chen, Hanieh Sadri, Qintao Zhang, Yu Zhang, Ying Xiong, Yong Zhang, Zhenan Fan | 2025-10-02 | Download | Mixture-of-Experts (MoE) models promise efficient scaling of large language models (LLMs) by activating only a small subset of experts per token, but their parallelized inference pipelines make elasti... |
| Accuracy vs Performance: An abstraction model for deadline constrained offloading at the mobile-edge | Jamie Cotter, Ignacio Castineiras, Victor Cionca | 2025-10-02 | Download | In this paper, we present a solution for low-latency deadline-constrained DNN offloading on mobile edge devices. We design a scheduling algorithm with lightweight network state representation, conside... |
| Percepta: High Performance Stream Processing at the Edge | Clarisse Sousa, Tiago Fonseca, Luis Lino Ferreira, Ricardo Venâncio, Ricardo Severino | 2025-10-02 | Download | The rise of real-time data and the proliferation of Internet of Things (IoT) devices have highlighted the limitations of cloud-centric solutions, particularly regarding latency, bandwidth, and privacy... |
| Exponential Quantum Advantage for Message Complexity in Distributed Algorithms | François Le Gall, Maël Luce, Joseph Marchand, Mathieu Roget | 2025-10-02 | Download | We investigate how much quantum distributed algorithms can outperform classical distributed algorithms with respect to the message complexity (the overall amount of communication used by the algorithm... |
| Semantic-Aware Scheduling for GPU Clusters with Large Language Models | Zerui Wang, Qinghao Hu, Ana Klimovic, Tianwei Zhang, Yonggang Wen, Peng Sun, Dahua Lin | 2025-10-02 | Download | Deep learning (DL) schedulers are pivotal in optimizing resource allocation in GPU clusters, but operate with a critical limitation: they are largely blind to the semantic context of the jobs they man... |
| TetriServe: Efficient DiT Serving for Heterogeneous Image Generation | Runyu Lu, Shiqi He, Wenxuan Tan, Shenggui Li, Ruofan Wu, Jeff J. Ma, Ang Chen, Mosharaf Chowdhury | 2025-10-02 | Download | Diffusion Transformer (DiT) models excel at generating high-quality images through iterative denoising steps, but serving them under strict Service Level Objectives (SLOs) is challenging due to their ... |
| Efficient Tree-Structured Deep Research with Adaptive Resource Allocation | Lunyiu Nie, Nedim Lipka, Ryan A. Rossi, Swarat Chaudhuri | 2025-10-02 | Download | Deep research agents, which synthesize information across diverse sources, are significantly constrained by the sequential nature of reasoning. |
| QScale: Probabilistic Chained Consensus for Moderate-Scale Systems | Hasan Heydari, Alysson Bessani, Kartik Nayak | 2025-10-02 | Download | Existing distributed ledger protocols either incur a high communication complexity and are thus suited to systems with a small number of processes (e.g. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| AURA: Adaptive Unified Reasoning and Automation with LLM-Guided MARL for NextG Cellular Networks | Narjes Nourzad, Mingyu Zong, Bhaskar Krishnamachari | 2025-10-02 | Download | Next-generation (NextG) cellular networks are expected to manage dynamic traffic while sustaining high performance. Large language models (LLMs) provide strategic reasoning for 6G planning, but their ... |
| TLoRa: Implementing TLS Over LoRa for Secure HTTP Communication in IoT | Atonu Ghosh, Akhilesh Mohanasundaram, Srishivanth R F, Sudip Misra | 2025-10-02 | Download | We present TLoRa, an end-to-end architecture for HTTPS communication over LoRa by integrating TCP tunneling and a complete TLS 1.3 handshake. It enables a seamless and secure communication channel bet... |
| Interplay between Security, Privacy and Trust in 6G-enabled Intelligent Transportation Systems | Ahmed Danladi Abdullahi, Erfan Bahrami, Tooska Dargahi, Mohammed Al-Khalidi, Mohammad Hammoudeh | 2025-10-02 | Download | The advancement of 6G technology has the potential to revolutionize the transportation sector and significantly improve how we travel. 6G-enabled Intelligent Transportation Systems (ITS) promise to of... |
| How to Combat Reactive and Dynamic Jamming Attacks with Reinforcement Learning | Yalin E. Sagduyu, Tugba Erpek, Kemal Davaslioglu, Sastry Kompella | 2025-10-02 | Download | This paper studies the problem of mitigating reactive jamming, where a jammer adopts a dynamic policy of selecting channels and sensing thresholds to detect and jam ongoing transmissions. |
| Causal Intervention Sequence Analysis for Fault Tracking in Radio Access Networks | Chenhua Shi, Joji Philip, Subhadip Bandyopadhyay, Jayanta Choudhury | 2025-10-02 | Download | To keep modern Radio Access Networks (RAN) running smoothly, operators need to spot the real-world triggers behind Service-Level Agreement (SLA) breaches well before customers feel them. |
| Accuracy vs Performance: An abstraction model for deadline constrained offloading at the mobile-edge | Jamie Cotter, Ignacio Castineiras, Victor Cionca | 2025-10-02 | Download | In this paper, we present a solution for low-latency deadline-constrained DNN offloading on mobile edge devices. We design a scheduling algorithm with lightweight network state representation, conside... |
| ReSSFormer: A Recursive Sparse Structured Transformer for Scalable and Long-Context Reasoning | Haochen You, Baojing Liu | 2025-10-02 | Download | While Transformer architectures have demonstrated impressive scalability across domains, they continue to face challenges in long-context reasoning, computational efficiency, and structural generaliza... |
| MMGaP: Multi-User MIMO Detection and Precoding using GPU-assisted Physics-inspired Computation | Abhishek Kumar Singh, Kyle Jamieson | 2025-10-02 | Download | Physics-inspired and quantum compute based methods for processing in the physical layer of next-generation cellular radio access networks have demonstrated theoretical advances in spectral efficiency ... |
