2025-01-16

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Managed-Retention Memory: A New Class of Memory for the AI Era	Sergey Legtchenko, Ioan Stefanovici, Richard Black, Antony Rowstron, Junyi Liu, Paolo Costa, Burcu Canakci, Dushyanth Narayanan, Xingbo Wu	2025-01-16	Download	AI clusters today are one of the major uses of High Bandwidth Memory (HBM). However, HBM is suboptimal for AI workloads for several reasons. Analysis shows HBM is overprovisioned on write performance,...
Atleus: Accelerating Transformers on the Edge Enabled by 3D Heterogeneous Manycore Architectures	Pratyush Dhingra, Janardhan Rao Doppa, Partha Pratim Pande	2025-01-16	Download	Transformer architectures have become the standard neural network model for various machine learning applications including natural language processing and computer vision.
MOGNET: A Mux-residual quantized Network leveraging Online-Generated weights	Van Thien Nguyen, William Guicquero, Gilles Sicard	2025-01-16	Download	This paper presents a compact model architecture called MOGNET, compatible with a resource-limited hardware. MOGNET uses a streamlined Convolutional factorization block based on a combination of 2 poi...
Holistic Optimization Framework for FPGA Accelerators	Stéphane Pouget, Michael Lo, Louis-Noël Pouchet, Jason Cong	2025-01-16	Download	Customized accelerators have revolutionized modern computing by delivering substantial gains in energy efficiency and performance through hardware specialization.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Serverless Computing: Architecture, Concepts, and Applications	Mohsen Ghorbian, Mostafa Ghobaei-Arani	2025-01-16	Download	Recently, serverless computing has gained recognition as a leading cloud computing method. Providing a solution that does not require direct server and infrastructure management, this technology has a...
The Streaming Batch Model for Efficient and Fault-Tolerant Heterogeneous Execution	Frank Sifei Luan, Ron Yifeng Wang, Yile Gu, Ziming Mao, Charlotte Lin, Amog Kamsetty, Hao Chen, Cheng Su, Balaji Veeramani, Scott Lee, SangBin Cho, Clark Zinzow, Eric Liang, Ion Stoica, Stephanie Wang	2025-01-16	Download	While ML model training and inference are both GPU-intensive, CPU-based data processing is often the bottleneck. Distributed data processing systems based on the batch or stream processing models assu...
Managed-Retention Memory: A New Class of Memory for the AI Era	Sergey Legtchenko, Ioan Stefanovici, Richard Black, Antony Rowstron, Junyi Liu, Paolo Costa, Burcu Canakci, Dushyanth Narayanan, Xingbo Wu	2025-01-16	Download	AI clusters today are one of the major uses of High Bandwidth Memory (HBM). However, HBM is suboptimal for AI workloads for several reasons. Analysis shows HBM is overprovisioned on write performance,...
Cloud abstractions for AI workloads	Marco Canini, Theophilus A. Benson, Ricardo Bianchini, Íñigo Goiri, Dejan Kostić, Peter Pietzuch, Simon Peter	2025-01-16	Download	AI workloads, often hosted in multi-tenant cloud environments, require vast computational resources but suffer inefficiencies due to limited tenant-provider coordination.
Core Hours and Carbon Credits: Incentivizing Sustainability in HPC	Alok Kamatar, Maxime Gonthier, Valerie Hayot-Sasson, Andre Bauer, Marcin Copik, Torsten Hoefler, Raul Castro Fernandez, Kyle Chard, Ian Foster	2025-01-16	Download	Realizing a shared responsibility between providers and consumers is critical to manage the sustainability of HPC. However, while cost may motivate efficiency improvements by infrastructure operators,...
RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection	Jianrui Shi, Yong Zhao, Zeyang Cui, Xiaoming Shen, Minhang Zeng, Xiaojie Liu	2025-01-16	Download	Object detection plays a crucial role in smart video analysis, with applications ranging from autonomous driving and security to smart cities.
Boosting Performance of Iterative Applications on GPUs: Kernel Batching with CUDA Graphs	Jonah Ekelund, Stefano Markidis, Ivy Peng	2025-01-16	Download	Graphics Processing Units (GPUs) have become the standard in accelerating scientific applications on heterogeneous systems. However, as GPUs are getting faster, one potential performance bottleneck wi...
PICE: A Semantic-Driven Progressive Inference System for LLM Serving in Cloud-Edge Networks	Huiyou Zhan, Xuan Zhang, Haisheng Tan, Han Tian, Dongping Yong, Junyang Zhang, Xiang-Yang Li	2025-01-16	Download	Large language models (LLMs), while driving a new wave of interactive AI applications across numerous domains, suffer from high inference costs and heavy cloud dependency.
Jodes: Efficient Oblivious Join in the Distributed Setting	Yilei Wang, Xiangdong Zeng, Sheng Wang, Feifei Li	2025-01-16	Download	Trusted execution environment (TEE) has provided an isolated and secure environment for building cloud-based analytic systems, but it still suffers from access pattern leakages caused by side-channel ...
PATCHEDSERVE: A Patch Management Framework for SLO-Optimized Hybrid Resolution Diffusion Serving	Desen Sun, Zepeng Zhao, Yuke Wang	2025-01-16	Download	The Text-to-Image (T2I) diffusion model has emerged as one of the most widely adopted generative models. However, serving diffusion models at the granularity of entire images introduces significant ch...
Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores	Haisha Zhao, San Li, Jiaheng Wang, Chunbao Zhou, Jue Wang, Zhikuang Xin, Shunde Li, Zhiqiang Liang, Zhijie Pan, Fang Liu, Yan Zeng, Yangang Wang, Xuebin Chi	2025-01-16	Download	General-purpose Sparse Matrix-Matrix Multiplication (SpMM) is a fundamental kernel in scientific computing and deep learning. The emergence of new matrix computation units such as Tensor Cores (TCs) b...
Split Fine-Tuning for Large Language Models in Wireless Networks	Songge Zhang, Guoliang Cheng, Xinyu Huang, Zuguang Li, Wen Wu, Lingyang Song, Xuemin Shen	2025-01-16	Download	Fine-tuning is the process of adapting the pre-trained large language models (LLMs) for downstream tasks. Due to substantial parameters, fine-tuning LLMs on mobile devices demands considerable memory ...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Complex-Valued Neural Networks for Ultra-Reliable Massive MIMO	Pedro Benevenuto Valadares, Jonathan Aguiar Soares, Kayol Mayer, Dalton Soares Arantes	2025-01-16	Download	In the evolving landscape of 5G and 6G networks, the demands extend beyond high data rates, ultra-low latency, and extensive coverage, increasingly emphasizing the need for reliability.
pFedWN: A Personalized Federated Learning Framework for D2D Wireless Networks with Heterogeneous Data	Zhou Ni, Masoud Ghazikor, Morteza Hashemi	2025-01-16	Download	Traditional Federated Learning (FL) approaches often struggle with data heterogeneity across clients, leading to suboptimal model performance for individual clients.
Ruling the Unruly: Designing Effective, Low-Noise Network Intrusion Detection Rules for Security Operations Centers	Koen T. W. Teuwen, Tom Mulders, Emmanuele Zambon, Luca Allodi	2025-01-16	Download	Many Security Operations Centers (SOCs) today still heavily rely on signature-based Network Intrusion Detection Systems (NIDS) such as Suricata.
Parallel multi-objective metaheuristics for smart communications in vehicular networks	Jamal Toutouh, Enrique Alba	2025-01-16	Download	This article analyzes the use of two parallel multi-objective soft computing algorithms to automatically search for high-quality settings of the Ad hoc On Demand Vector routing protocol for vehicular ...
Intelligent OLSR Routing Protocol Optimization for VANETs	Jamal Toutouh, José García-Nieto, Enrique Alba	2025-01-16	Download	Recent advances in wireless technologies have given rise to the emergence of vehicular ad hoc networks (VANETs). In such networks, the limited coverage of WiFi and the high mobility of the nodes gener...
Authenticated Delegation and Authorized AI Agents	Tobin South, Samuele Marro, Thomas Hardjono, Robert Mahari, Cedric Deslandes Whitney, Dazza Greenwood, Alan Chan, Alex Pentland	2025-01-16	Download	The rapid deployment of autonomous AI agents creates urgent challenges around authorization, accountability, and access control in digital spaces.
HpC: A Calculus for Hybrid and Mobile Systems -- Full Version	Xiong Xu, Jean-Pierre Talpin, Shuling Wang, Hao Wu, Bohua Zhan, Xinxin Liu, Naijun Zhan	2025-01-16	Download	Networked cybernetic and physical systems of the Internet of Things (IoT) immerse civilian and industrial infrastructures into an interconnected and dynamic web of hybrid and mobile devices.
MoE$^2$: Optimizing Collaborative Inference for Edge Large Language Models	Lyudong Jin, Yanning Zhang, Yanhan Li, Shurong Wang, Howard H. Yang, Jian Wu, Meng Zhang	2025-01-16	Download	Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of natural language processing tasks. Exploiting the heterogeneous capabilities of edge LLMs is crucial for d...
Artificial Intelligence, Ambient Backscatter Communication and Non-Terrestrial Networks: A 6G Commixture	Muhammad Ali Jamshed, Bushra Haq, Muhammad Ahmed Mohsin, Ali Nauman, Halim Yanikomeroglu	2025-01-16	Download	The advent of Non-Terrestrial Networks (NTN) represents a compelling response to the International Mobile Telecommunications 2030 (IMT-2030) framework, enabling the delivery of advanced, seamless conn...
Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse	Guangyuan Liu, Hongyang Du, Jiacheng Wang, Dusit Niyato, Dong In Kim	2025-01-16	Download	The rapid advancement of immersive technologies has propelled the development of the Metaverse, where the convergence of virtual and physical realities necessitates the generation of high-quality, pho...
Adaptive Contextual Caching for Mobile Edge Large Language Model Service	Guangyuan Liu, Yinqiu Liu, Jiacheng Wang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong	2025-01-16	Download	Mobile edge Large Language Model (LLM) deployments face inherent constraints, such as limited computational resources and network bandwidth. Although Retrieval-Augmented Generation (RAG) mitigates som...
MagnetDB: A Longitudinal Torrent Discovery Dataset with IMDb-Matched Movies and TV Shows	Scott Seidenberger, Noah Pursell, Anindya Maiti	2025-01-16	Download	BitTorrent remains a prominent channel for illicit distribution of copyrighted material, yet the supply side of such content remains understudied.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Quantum-Enhanced Transformers for Robust Acoustic Scene Classification in IoT Environments	Minh K. Quan, Mayuri Wijayasundara, Sujeeva Setunge, Pubudu N. Pathirana	2025-01-16	Download	The proliferation of Internet of Things (IoT) devices equipped with acoustic sensors necessitates robust acoustic scene classification (ASC) capabilities, even in noisy and data-limited environments.
PixelBrax: Learning Continuous Control from Pixels End-to-End on the GPU	Trevor McInroe, Samuel Garcin	2025-01-16	Download	We present PixelBrax, a set of continuous control tasks with pixel observations. We combine the Brax physics engine with a pure JAX renderer, allowing reinforcement learning (RL) experiments to run en...