2025-08-11

cs.AR - Hardware Architecture

Title	Authors	Published	PDF	Abstract
JSPIM: A Skew-Aware PIM Accelerator for High-Performance Databases Join and Select Operations	Sabiha Tajdari, Anastasia Ailamaki, Sandhya Dwarkadas	2025-08-11	Download	Database applications are increasingly bottlenecked by memory bandwidth and latency due to the memory wall and the limited scalability of DRAM.
Vector-Centric Machine Learning Systems: A Cross-Stack Approach	Wenqi Jiang	2025-08-11	Download	Today, two major trends are shaping the evolution of ML systems. First, modern AI systems are becoming increasingly complex, often integrating components beyond the model itself.
Architecting Long-Context LLM Acceleration with Packing-Prefetch Scheduler and Ultra-Large Capacity On-Chip Memories	Ming-Yen Lee, Faaiq Waqar, Hanchen Yang, Muhammed Ahosan Ul Karim, Harsono Simka, Shimeng Yu	2025-08-11	Download	Long-context Large Language Model (LLM) inference faces increasing compute bottlenecks as attention calculations scale with context length, primarily due to the growing KV-cache transfer overhead that...
Profiling Concurrent Vision Inference Workloads on NVIDIA Jetson -- Extended	Abhinaba Chakraborty, Wouter Tavernier, Akis Kourtis, Mario Pickavet, Andreas Oikonomakis, Didier Colle	2025-08-11	Download	The proliferation of IoT devices and advancements in network technologies have intensified the demand for real-time data processing at the network edge.
XDMA: A Distributed, Extensible DMA Architecture for Layout-Flexible Data Movements in Heterogeneous Multi-Accelerator SoCs	Fanchen Kong, Yunhao Deng, Xiaoling Yi, Ryan Antonio, Marian Verhelst	2025-08-11	Download	As modern AI workloads increasingly rely on heterogeneous accelerators, ensuring high-bandwidth and layout-flexible data movements between accelerator memories has become a pressing challenge.
ELF: Efficient Logic Synthesis by Pruning Redundancy in Refactoring	Dimitris Tsaras, Xing Li, Lei Chen, Zhiyao Xie, Mingxuan Yuan	2025-08-11	Download	In electronic design automation, logic optimization operators play a crucial role in minimizing the gate count of logic circuits. However, their computation demands are high.
TLV-HGNN: Thinking Like a Vertex for Memory-efficient HGNN Inference	Dengke Han, Duo Wang, Mingyu Yan, Xiaochun Ye, Dongrui Fan	2025-08-11	Download	Heterogeneous graph neural networks (HGNNs) excel at processing heterogeneous graph data and are widely applied in critical domains. In HGNN inference, the neighbor aggregation stage is the primary pe...
ARISE: Automating RISC-V Instruction Set Extension	Andreas Hager-Clukas, Philipp van Kempen, Stefan Wallentowitz	2025-08-11	Download	RISC-V is an extendable Instruction Set Architecture, growing in popularity for embedded systems. However, optimizing it to specific requirements imposes a great deal of manual effort.
A Matrix Decomposition Method for Odd-Type Gaussian Normal Basis Multiplication	Kittiphon Phalakarn, Athasit Surarerks	2025-08-11	Download	Normal basis is used in many applications because of the efficiency of the implementation. However, most space complexity reduction techniques for binary field multiplier are applicable for only optim...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
A Reinforcement Learning-Driven Task Scheduling Algorithm for Multi-Tenant Distributed Systems	Xiaopei Zhang, Xingang Wang, Xin Wang	2025-08-11	Download	This paper addresses key challenges in task scheduling for multi-tenant distributed systems, including dynamic resource variation, heterogeneous tenant demands, and fairness assurance.
Benchmarking Federated Learning for Throughput Prediction in 5G Live Streaming Applications	Yuvraj Dutta, Soumyajit Chatterjee, Sandip Chakraborty, Basabdatta Palit	2025-08-11	Download	Accurate and adaptive network throughput prediction is essential for latency-sensitive and bandwidth-intensive applications in 5G and emerging 6G networks.
Vector-Centric Machine Learning Systems: A Cross-Stack Approach	Wenqi Jiang	2025-08-11	Download	Today, two major trends are shaping the evolution of ML systems. First, modern AI systems are becoming increasingly complex, often integrating components beyond the model itself.
Towards Efficient and Practical GPU Multitasking in the Era of LLM	Jiarong Xing, Yifan Qiao, Simon Mo, Xingqi Cui, Gur-Eyal Sela, Yang Zhou, Joseph Gonzalez, Ion Stoica	2025-08-11	Download	GPU singletasking is becoming increasingly inefficient and unsustainable as hardware capabilities grow and workloads diversify. We are now at an inflection point where GPUs must embrace multitasking, ...
Extremely Scalable Distributed Computation of Contour Trees via Pre-Simplification	Mingzhe Li, Hamish Carr, Oliver Rübel, Bei Wang, Gunther H. Weber	2025-08-11	Download	Contour trees offer an abstract representation of the level set topology in scalar fields and are widely used in topological data analysis and visualization.
Profiling Concurrent Vision Inference Workloads on NVIDIA Jetson -- Extended	Abhinaba Chakraborty, Wouter Tavernier, Akis Kourtis, Mario Pickavet, Andreas Oikonomakis, Didier Colle	2025-08-11	Download	The proliferation of IoT devices and advancements in network technologies have intensified the demand for real-time data processing at the network edge.
XDMA: A Distributed, Extensible DMA Architecture for Layout-Flexible Data Movements in Heterogeneous Multi-Accelerator SoCs	Fanchen Kong, Yunhao Deng, Xiaoling Yi, Ryan Antonio, Marian Verhelst	2025-08-11	Download	As modern AI workloads increasingly rely on heterogeneous accelerators, ensuring high-bandwidth and layout-flexible data movements between accelerator memories has become a pressing challenge.
Fully-Fluctuating Participation in Sleepy Consensus	Yuval Efron, Joachim Neu, Toniann Pitassi	2025-08-11	Download	Proof-of-work allows Bitcoin to boast security amidst arbitrary fluctuations in participation of miners throughout time, so long as, at any point in time, a majority of hash power is honest.
On the Operational Resilience of CBDC: Threats and Prospects of Formal Validation for Offline Payments	Marco Bernardo, Federico Calandra, Andrea Esposito, Francesco Fabris	2025-08-11	Download	Information and communication technologies are by now employed in most human activities, including economics and finance. Modern computers have reached an extraordinary power in terms of information p...
Optimizing Federated Learning for Scalable Power-demand Forecasting in Microgrids	Roopkatha Banerjee, Sampath Koti, Gyanendra Singh, Anirban Chakraborty, Gurunath Gurrala, Bhushan Jagyasi, Yogesh Simmhan	2025-08-11	Download	Real-time monitoring of power consumption in cities and micro-grids through the Internet of Things (IoT) can help forecast future demand and optimize grid operations.
Performance Evaluation of Brokerless Messaging Libraries	Lorenzo La Corte, Syed Aftab Rashid, Andrei-Marian Dan	2025-08-11	Download	Messaging systems are essential for efficiently transferring large volumes of data, ensuring rapid response times and high-throughput communication.
GPU-Accelerated Syndrome Decoding for Quantum LDPC Codes below the 63 μs Latency Threshold	Oscar Ferraz, Bruno Coutinho, Gabriel Falcao, Marco Gomes, Francisco A. Monteiro, Vitor Silva	2025-08-11	Download	This paper presents a GPU-accelerated decoder for quantum low-density parity-check (QLDPC) codes that achieves sub-63 μs latency, below the surface code decoder's real-time threshold demonstrated ...
EFU: Enforcing Federated Unlearning via Functional Encryption	Samaneh Mohammadi, Vasileios Tsouvalas, Iraklis Symeonidis, Ali Balador, Tanir Ozcelebi, Francesco Flammini, Nirvana Meratnia	2025-08-11	Download	Federated unlearning (FU) algorithms allow clients in federated settings to exercise their "right to be forgotten" by removing the influence of their data from a collaboratively trained model.
Towards Lock Modularization for Heterogeneous Environments	Hanze Zhang, Rong Chen, Haibo Chen	2025-08-11	Download	Modern hardware environments are becoming increasingly heterogeneous, leading to the emergence of applications specifically designed to exploit this heterogeneity.
Over-the-Top Resource Broker System for Split Computing: An Approach to Distribute Cloud Computing Infrastructure	Ingo Friese, Jochen Klaffer, Mandy Galkow-Schneider, Sergiy Melnyk, Qiuheng Zhou, Hans Dieter Schotten	2025-08-11	Download	6G network architectures will usher in a wave of innovative services and capabilities, introducing concepts like split computing and dynamic processing nodes.
Perpetual exploration in anonymous synchronous networks with a Byzantine black hole	Adri Bhattacharya, Pritam Goswami, Evangelos Bampas, Partha Sarathi Mandal	2025-08-11	Download	In this paper, we investigate: "How can a group of initially co-located mobile agents perpetually explore an unknown graph, when one stationary node occasionally behaves maliciously, under an adversa...
Multi-Hop Privacy Propagation for Differentially Private Federated Learning in Social Networks	Chenchen Lin, Xuehe Wang	2025-08-11	Download	Federated learning (FL) enables collaborative model training across decentralized clients without sharing local data, thereby enhancing privacy and facilitating collaboration among clients connected v...
Taming Cold Starts: Proactive Serverless Scheduling with Model Predictive Control	Chanh Nguyen, Monowar Bhuyan, Erik Elmroth	2025-08-11	Download	Serverless computing has transformed cloud application deployment by introducing a fine-grained, event-driven execution model that abstracts away infrastructure management.
Coordinated Power Management on Heterogeneous Systems	Zhong Zheng, Zhiling Lan, Xingfu Wu, Valerie E. Taylor, Michael E. Papka	2025-08-11	Download	Performance prediction is essential for energy-efficient computing in heterogeneous computing systems that integrate CPUs and GPUs. However, traditional performance modeling methods often rely on exha...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Experimental Validation of Provably Covert Communication Using Software-Defined Radio	Rohan Bali, Trevor E. Bailey, Michael S. Bullock, Boulat A. Bash	2025-08-11	Download	The fundamental information-theoretic limits of covert, or low probability of detection/intercept (LPD/LPI), communication have been extensively studied for over a decade, resulting in the square root...
Industrial Viewpoints on RAN Technologies for 6G	Mansoor Shafi, Erik G. Larsson, Xingqin Lin, Dorin Panaitopol, Stefan Parkvall, Flavien Ronteix-Jacquet, Antti Toskala	2025-08-11	Download	6G standardization is to start imminently, with commercial deployments expected before 2030. Its technical components and performance requirements are the focus of this article.
VeriPHY: Physical Layer Signal Authentication for Wireless Communication in 5G Environments	Clifton Paul Robinson, Salvatore D'Oro, Tommaso Melodia	2025-08-11	Download	Physical layer authentication (PLA) uses inherent characteristics of the communication medium to provide secure and efficient authentication in wireless networks, bypassing the need for traditional cr...
Adaptive Multiple Access and Service Placement for Generative Diffusion Models	Hamidreza Mazandarani, Mohammad Farhoudi, Masoud Shokrnezhad, Tarik Taleb	2025-08-11	Download	Generative Diffusion Models (GDMs) have emerged as key components of Generative Artificial Intelligence (GenAI), offering unparalleled expressiveness and controllability for complex data generation ta...
Performance Evaluation of Brokerless Messaging Libraries	Lorenzo La Corte, Syed Aftab Rashid, Andrei-Marian Dan	2025-08-11	Download	Messaging systems are essential for efficiently transferring large volumes of data, ensuring rapid response times and high-throughput communication.
Scalable and Energy-Efficient Predictive Data Collection in Wireless Sensor Networks with Constructive Interference	Conor Muldoon	2025-08-11	Download	A new class of Wireless Sensor Network has emerged whereby multiple nodes transmit data simultaneously, exploiting constructive interference to enable data collection frameworks with low energy usage ...
An Experimental Reservoir-Augmented Foundation Model: 6G O-RAN Case Study	Farhad Rezazadeh, Raymond Zhao, Jiongyu Dai, Amir Ashtari Gargari, Hatim Chergui, Lingjia Liu	2025-08-11	Download	Next-generation open radio access networks (O-RAN) continuously stream tens of key performance indicators (KPIs) together with raw in-phase/quadrature (IQ) samples, yielding ultra-high-dimensional, no...
Over-the-Top Resource Broker System for Split Computing: An Approach to Distribute Cloud Computing Infrastructure	Ingo Friese, Jochen Klaffer, Mandy Galkow-Schneider, Sergiy Melnyk, Qiuheng Zhou, Hans Dieter Schotten	2025-08-11	Download	6G network architectures will usher in a wave of innovative services and capabilities, introducing concepts like split computing and dynamic processing nodes.
Joint link scheduling and power allocation in imperfect and energy-constrained underwater wireless sensor networks	Tong Zhang, Yu Gou, Jun Liu, Shanshan Song, Tingting Yang, Jun-Hong Cui	2025-08-11	Download	Underwater wireless sensor networks (UWSNs) stand as promising technologies facilitating diverse underwater applications. However, the major design issues of the considered system are the severely lim...
Joint Scheduling and Resource Allocation in mmWave IAB Networks Using Deep RL	Maryam Abbasalizadeh, Sashank Narain	2025-08-11	Download	Integrated Access and Backhaul (IAB) is critical for dense 5G and beyond deployments, especially in mmWave bands where fiber backhaul is infeasible.
Optimization of Private Semantic Communication Performance: An Uncooperative Covert Communication Method	Wenjing Zhang, Ye Hu, Tao Luo, Zhilong Zhang, Mingzhe Chen	2025-08-11	Download	In this paper, a novel covert semantic communication framework is investigated. Within this framework, a server extracts and transmits the semantic information, i.e.
Achieving Fair-Effective Communications and Robustness in Underwater Acoustic Sensor Networks: A Semi-Cooperative Approach	Yu Gou, Tong Zhang, Jun Liu, Tingting Yang, Shanshan Song, Jun-Hong Cui	2025-08-11	Download	This paper investigates the fair-effective communication and robustness in imperfect and energy-constrained underwater acoustic sensor networks (IC-UASNs).
Enhancing Mega-Satellite Networks with Generative Semantic Communication: A Networking Perspective	Binquan Guo, Wanting Yang, Zehui Xiong, Zhou Zhang, Baosheng Li, Zhu Han, Rahim Tafazolli, Tony Q. S. Quek	2025-08-11	Download	The advance of direct satellite-to-device communication has positioned mega-satellite constellations as a cornerstone of 6G wireless communication, enabling seamless global connectivity even in remote...
Multimodal Remote Inference	Keyuan Zhang, Yin Sun, Bo Ji	2025-08-11	Download	We consider a remote inference system with multiple modalities, where a multimodal machine learning (ML) model performs real-time inference using features collected from remote sensors.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Towards Efficient and Practical GPU Multitasking in the Era of LLM	Jiarong Xing, Yifan Qiao, Simon Mo, Xingqi Cui, Gur-Eyal Sela, Yang Zhou, Joseph Gonzalez, Ion Stoica	2025-08-11	Download	GPU singletasking is becoming increasingly inefficient and unsustainable as hardware capabilities grow and workloads diversify. We are now at an inflection point where GPUs must embrace multitasking, ...
Selective KV-Cache Sharing to Mitigate Timing Side-Channels in LLM Inference	Kexin Chu, Zecheng Lin, Dawei Xiang, Zixu Shen, Jianchang Su, Cheng Chu, Yiwei Yang, Wenhui Zhang, Wenfei Wu, Wei Zhang	2025-08-11	Download	Global KV-cache sharing is an effective optimization for accelerating large language model (LLM) inference, yet it introduces an API-visible timing side channel that lets adversaries infer sensitive u...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
JSPIM: A Skew-Aware PIM Accelerator for High-Performance Databases Join and Select Operations	Sabiha Tajdari, Anastasia Ailamaki, Sandhya Dwarkadas	2025-08-11	Download	Database applications are increasingly bottlenecked by memory bandwidth and latency due to the memory wall and the limited scalability of DRAM.
Vector-Centric Machine Learning Systems: A Cross-Stack Approach	Wenqi Jiang	2025-08-11	Download	Today, two major trends are shaping the evolution of ML systems. First, modern AI systems are becoming increasingly complex, often integrating components beyond the model itself.
Profiling Concurrent Vision Inference Workloads on NVIDIA Jetson -- Extended	Abhinaba Chakraborty, Wouter Tavernier, Akis Kourtis, Mario Pickavet, Andreas Oikonomakis, Didier Colle	2025-08-11	Download	The proliferation of IoT devices and advancements in network technologies have intensified the demand for real-time data processing at the network edge.
A Data-driven ML Approach for Maximizing Performance in LLM-Adapter Serving	Ferran Agullo, Joan Oliveras, Chen Wang, Alberto Gutierrez-Torre, Olivier Tardieu, Alaa Youssef, Jordi Torres, Josep Ll. Berral	2025-08-11	Download	With the rapid adoption of Large Language Models (LLMs), LLM-adapters have become increasingly common, providing lightweight specialization of large-scale models.
Taming Cold Starts: Proactive Serverless Scheduling with Model Predictive Control	Chanh Nguyen, Monowar Bhuyan, Erik Elmroth	2025-08-11	Download	Serverless computing has transformed cloud application deployment by introducing a fine-grained, event-driven execution model that abstracts away infrastructure management.