2025-08-11
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| JSPIM: A Skew-Aware PIM Accelerator for High-Performance Databases Join and Select Operations | Sabiha Tajdari, Anastasia Ailamaki, Sandhya Dwarkadas | 2025-08-11 | Download | Database applications are increasingly bottlenecked by memory bandwidth and latency due to the memory wall and the limited scalability of DRAM. |
| Vector-Centric Machine Learning Systems: A Cross-Stack Approach | Wenqi Jiang | 2025-08-11 | Download | Today, two major trends are shaping the evolution of ML systems. First, modern AI systems are becoming increasingly complex, often integrating components beyond the model itself. |
| Architecting Long-Context LLM Acceleration with Packing-Prefetch Scheduler and Ultra-Large Capacity On-Chip Memories | Ming-Yen Lee, Faaiq Waqar, Hanchen Yang, Muhammed Ahosan Ul Karim, Harsono Simka, Shimeng Yu | 2025-08-11 | Download | Long-context Large Language Model (LLM) inference faces increasing compute bottlenecks as attention calculations scale with context length, primarily due to the growing KV-cache transfer overhead that... |
| Profiling Concurrent Vision Inference Workloads on NVIDIA Jetson -- Extended | Abhinaba Chakraborty, Wouter Tavernier, Akis Kourtis, Mario Pickavet, Andreas Oikonomakis, Didier Colle | 2025-08-11 | Download | The proliferation of IoT devices and advancements in network technologies have intensified the demand for real-time data processing at the network edge. |
| XDMA: A Distributed, Extensible DMA Architecture for Layout-Flexible Data Movements in Heterogeneous Multi-Accelerator SoCs | Fanchen Kong, Yunhao Deng, Xiaoling Yi, Ryan Antonio, Marian Verhelst | 2025-08-11 | Download | As modern AI workloads increasingly rely on heterogeneous accelerators, ensuring high-bandwidth and layout-flexible data movements between accelerator memories has become a pressing challenge. |
| ELF: Efficient Logic Synthesis by Pruning Redundancy in Refactoring | Dimitris Tsaras, Xing Li, Lei Chen, Zhiyao Xie, Mingxuan Yuan | 2025-08-11 | Download | In electronic design automation, logic optimization operators play a crucial role in minimizing the gate count of logic circuits. However, their computation demands are high. |
| TLV-HGNN: Thinking Like a Vertex for Memory-efficient HGNN Inference | Dengke Han, Duo Wang, Mingyu Yan, Xiaochun Ye, Dongrui Fan | 2025-08-11 | Download | Heterogeneous graph neural networks (HGNNs) excel at processing heterogeneous graph data and are widely applied in critical domains. In HGNN inference, the neighbor aggregation stage is the primary pe... |
| ARISE: Automating RISC-V Instruction Set Extension | Andreas Hager-Clukas, Philipp van Kempen, Stefan Wallentowitz | 2025-08-11 | Download | RISC-V is an extendable Instruction Set Architecture, growing in popularity for embedded systems. However, optimizing it to specific requirements, imposes a great deal of manual effort. |
| A Matrix Decomposition Method for Odd-Type Gaussian Normal Basis Multiplication | Kittiphon Phalakarn, Athasit Surarerks | 2025-08-11 | Download | Normal basis is used in many applications because of the efficiency of the implementation. However, most space complexity reduction techniques for binary field multiplier are applicable for only optim... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A Reinforcement Learning-Driven Task Scheduling Algorithm for Multi-Tenant Distributed Systems | Xiaopei Zhang, Xingang Wang, Xin Wang | 2025-08-11 | Download | This paper addresses key challenges in task scheduling for multi-tenant distributed systems, including dynamic resource variation, heterogeneous tenant demands, and fairness assurance. |
| Benchmarking Federated Learning for Throughput Prediction in 5G Live Streaming Applications | Yuvraj Dutta, Soumyajit Chatterjee, Sandip Chakraborty, Basabdatta Palit | 2025-08-11 | Download | Accurate and adaptive network throughput prediction is essential for latency-sensitive and bandwidth-intensive applications in 5G and emerging 6G networks. |
| Vector-Centric Machine Learning Systems: A Cross-Stack Approach | Wenqi Jiang | 2025-08-11 | Download | Today, two major trends are shaping the evolution of ML systems. First, modern AI systems are becoming increasingly complex, often integrating components beyond the model itself. |
| Towards Efficient and Practical GPU Multitasking in the Era of LLM | Jiarong Xing, Yifan Qiao, Simon Mo, Xingqi Cui, Gur-Eyal Sela, Yang Zhou, Joseph Gonzalez, Ion Stoica | 2025-08-11 | Download | GPU singletasking is becoming increasingly inefficient and unsustainable as hardware capabilities grow and workloads diversify. We are now at an inflection point where GPUs must embrace multitasking, ... |
| Extremely Scalable Distributed Computation of Contour Trees via Pre-Simplification | Mingzhe Li, Hamish Carr, Oliver Rübel, Bei Wang, Gunther H. Weber | 2025-08-11 | Download | Contour trees offer an abstract representation of the level set topology in scalar fields and are widely used in topological data analysis and visualization. |
| Profiling Concurrent Vision Inference Workloads on NVIDIA Jetson -- Extended | Abhinaba Chakraborty, Wouter Tavernier, Akis Kourtis, Mario Pickavet, Andreas Oikonomakis, Didier Colle | 2025-08-11 | Download | The proliferation of IoT devices and advancements in network technologies have intensified the demand for real-time data processing at the network edge. |
| XDMA: A Distributed, Extensible DMA Architecture for Layout-Flexible Data Movements in Heterogeneous Multi-Accelerator SoCs | Fanchen Kong, Yunhao Deng, Xiaoling Yi, Ryan Antonio, Marian Verhelst | 2025-08-11 | Download | As modern AI workloads increasingly rely on heterogeneous accelerators, ensuring high-bandwidth and layout-flexible data movements between accelerator memories has become a pressing challenge. |
| Fully-Fluctuating Participation in Sleepy Consensus | Yuval Efron, Joachim Neu, Toniann Pitassi | 2025-08-11 | Download | Proof-of-work allows Bitcoin to boast security amidst arbitrary fluctuations in participation of miners throughout time, so long as, at any point in time, a majority of hash power is honest. |
| On the Operational Resilience of CBDC: Threats and Prospects of Formal Validation for Offline Payments | Marco Bernardo, Federico Calandra, Andrea Esposito, Francesco Fabris | 2025-08-11 | Download | Information and communication technologies are by now employed in most human activities, including economics and finance. Modern computers have reached an extraordinary power in terms of information p... |
| Optimizing Federated Learning for Scalable Power-demand Forecasting in Microgrids | Roopkatha Banerjee, Sampath Koti, Gyanendra Singh, Anirban Chakraborty, Gurunath Gurrala, Bhushan Jagyasi, Yogesh Simmhan | 2025-08-11 | Download | Real-time monitoring of power consumption in cities and micro-grids through the Internet of Things (IoT) can help forecast future demand and optimize grid operations. |
| Performance Evaluation of Brokerless Messaging Libraries | Lorenzo La Corte, Syed Aftab Rashid, Andrei-Marian Dan | 2025-08-11 | Download | Messaging systems are essential for efficiently transferring large volumes of data, ensuring rapid response times and high-throughput communication. |
| GPU-Accelerated Syndrome Decoding for Quantum LDPC Codes below the 63 μs Latency Threshold | Oscar Ferraz, Bruno Coutinho, Gabriel Falcao, Marco Gomes, Francisco A. Monteiro, Vitor Silva | 2025-08-11 | Download | This paper presents a GPU-accelerated decoder for quantum low-density parity-check (QLDPC) codes that achieves sub-63 μs latency, below the surface code decoder's real-time threshold demonstrated ... |
| EFU: Enforcing Federated Unlearning via Functional Encryption | Samaneh Mohammadi, Vasileios Tsouvalas, Iraklis Symeonidis, Ali Balador, Tanir Ozcelebi, Francesco Flammini, Nirvana Meratnia | 2025-08-11 | Download | Federated unlearning (FU) algorithms allow clients in federated settings to exercise their "right to be forgotten" by removing the influence of their data from a collaboratively trained model. |
| Towards Lock Modularization for Heterogeneous Environments | Hanze Zhang, Rong Chen, Haibo Chen | 2025-08-11 | Download | Modern hardware environments are becoming increasingly heterogeneous, leading to the emergence of applications specifically designed to exploit this heterogeneity. |
| Over-the-Top Resource Broker System for Split Computing: An Approach to Distribute Cloud Computing Infrastructure | Ingo Friese, Jochen Klaffer, Mandy Galkow-Schneider, Sergiy Melnyk, Qiuheng Zhou, Hans Dieter Schotten | 2025-08-11 | Download | 6G network architectures will usher in a wave of innovative services and capabilities, introducing concepts like split computing and dynamic processing nodes. |
| Perpetual exploration in anonymous synchronous networks with a Byzantine black hole | Adri Bhattacharya, Pritam Goswami, Evangelos Bampas, Partha Sarathi Mandal | 2025-08-11 | Download | In this paper, we investigate: "How can a group of initially co-located mobile agents perpetually explore an unknown graph, when one stationary node occasionally behaves maliciously, under an adversa... |
| Multi-Hop Privacy Propagation for Differentially Private Federated Learning in Social Networks | Chenchen Lin, Xuehe Wang | 2025-08-11 | Download | Federated learning (FL) enables collaborative model training across decentralized clients without sharing local data, thereby enhancing privacy and facilitating collaboration among clients connected v... |
| Taming Cold Starts: Proactive Serverless Scheduling with Model Predictive Control | Chanh Nguyen, Monowar Bhuyan, Erik Elmroth | 2025-08-11 | Download | Serverless computing has transformed cloud application deployment by introducing a fine-grained, event-driven execution model that abstracts away infrastructure management. |
| Coordinated Power Management on Heterogeneous Systems | Zhong Zheng, Zhiling Lan, Xingfu Wu, Valerie E. Taylor, Michael E. Papka | 2025-08-11 | Download | Performance prediction is essential for energy-efficient computing in heterogeneous computing systems that integrate CPUs and GPUs. However, traditional performance modeling methods often rely on exha... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Experimental Validation of Provably Covert Communication Using Software-Defined Radio | Rohan Bali, Trevor E. Bailey, Michael S. Bullock, Boulat A. Bash | 2025-08-11 | Download | The fundamental information-theoretic limits of covert, or low probability of detection/intercept (LPD/LPI), communication have been extensively studied for over a decade, resulting in the square root... |
| Industrial Viewpoints on RAN Technologies for 6G | Mansoor Shafi, Erik G. Larsson, Xingqin Lin, Dorin Panaitopol, Stefan Parkvall, Flavien Ronteix-Jacquet, Antti Toskala | 2025-08-11 | Download | 6G standardization is to start imminently, with commercial deployments expected before 2030. Its technical components and performance requirements are the focus of this article. |
| VeriPHY: Physical Layer Signal Authentication for Wireless Communication in 5G Environments | Clifton Paul Robinson, Salvatore D'Oro, Tommaso Melodia | 2025-08-11 | Download | Physical layer authentication (PLA) uses inherent characteristics of the communication medium to provide secure and efficient authentication in wireless networks, bypassing the need for traditional cr... |
| Adaptive Multiple Access and Service Placement for Generative Diffusion Models | Hamidreza Mazandarani, Mohammad Farhoudi, Masoud Shokrnezhad, Tarik Taleb | 2025-08-11 | Download | Generative Diffusion Models (GDMs) have emerged as key components of Generative Artificial Intelligence (GenAI), offering unparalleled expressiveness and controllability for complex data generation ta... |
| Performance Evaluation of Brokerless Messaging Libraries | Lorenzo La Corte, Syed Aftab Rashid, Andrei-Marian Dan | 2025-08-11 | Download | Messaging systems are essential for efficiently transferring large volumes of data, ensuring rapid response times and high-throughput communication. |
| Scalable and Energy-Efficient Predictive Data Collection in Wireless Sensor Networks with Constructive Interference | Conor Muldoon | 2025-08-11 | Download | A new class of Wireless Sensor Network has emerged whereby multiple nodes transmit data simultaneously, exploiting constructive interference to enable data collection frameworks with low energy usage ... |
| An Experimental Reservoir-Augmented Foundation Model: 6G O-RAN Case Study | Farhad Rezazadeh, Raymond Zhao, Jiongyu Dai, Amir Ashtari Gargari, Hatim Chergui, Lingjia Liu | 2025-08-11 | Download | Next-generation open radio access networks (O-RAN) continuously stream tens of key performance indicators (KPIs) together with raw in-phase/quadrature (IQ) samples, yielding ultra-high-dimensional, no... |
| Over-the-Top Resource Broker System for Split Computing: An Approach to Distribute Cloud Computing Infrastructure | Ingo Friese, Jochen Klaffer, Mandy Galkow-Schneider, Sergiy Melnyk, Qiuheng Zhou, Hans Dieter Schotten | 2025-08-11 | Download | 6G network architectures will usher in a wave of innovative services and capabilities, introducing concepts like split computing and dynamic processing nodes. |
| Joint link scheduling and power allocation in imperfect and energy-constrained underwater wireless sensor networks | Tong Zhang, Yu Gou, Jun Liu, Shanshan Song, Tingting Yang, Jun-Hong Cui | 2025-08-11 | Download | Underwater wireless sensor networks (UWSNs) stand as promising technologies facilitating diverse underwater applications. However, the major design issues of the considered system are the severely lim... |
| Joint Scheduling and Resource Allocation in mmWave IAB Networks Using Deep RL | Maryam Abbasalizadeh, Sashank Narain | 2025-08-11 | Download | Integrated Access and Backhaul (IAB) is critical for dense 5G and beyond deployments, especially in mmWave bands where fiber backhaul is infeasible. |
| Optimization of Private Semantic Communication Performance: An Uncooperative Covert Communication Method | Wenjing Zhang, Ye Hu, Tao Luo, Zhilong Zhang, Mingzhe Chen | 2025-08-11 | Download | In this paper, a novel covert semantic communication framework is investigated. Within this framework, a server extracts and transmits the semantic information, i.e. |
| Achieving Fair-Effective Communications and Robustness in Underwater Acoustic Sensor Networks: A Semi-Cooperative Approach | Yu Gou, Tong Zhang, Jun Liu, Tingting Yang, Shanshan Song, Jun-Hong Cui | 2025-08-11 | Download | This paper investigates the fair-effective communication and robustness in imperfect and energy-constrained underwater acoustic sensor networks (IC-UASNs). |
| Enhancing Mega-Satellite Networks with Generative Semantic Communication: A Networking Perspective | Binquan Guo, Wanting Yang, Zehui Xiong, Zhou Zhang, Baosheng Li, Zhu Han, Rahim Tafazolli, Tony Q. S. Quek | 2025-08-11 | Download | The advance of direct satellite-to-device communication has positioned mega-satellite constellations as a cornerstone of 6G wireless communication, enabling seamless global connectivity even in remote... |
| Multimodal Remote Inference | Keyuan Zhang, Yin Sun, Bo Ji | 2025-08-11 | Download | We consider a remote inference system with multiple modalities, where a multimodal machine learning (ML) model performs real-time inference using features collected from remote sensors. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Towards Efficient and Practical GPU Multitasking in the Era of LLM | Jiarong Xing, Yifan Qiao, Simon Mo, Xingqi Cui, Gur-Eyal Sela, Yang Zhou, Joseph Gonzalez, Ion Stoica | 2025-08-11 | Download | GPU singletasking is becoming increasingly inefficient and unsustainable as hardware capabilities grow and workloads diversify. We are now at an inflection point where GPUs must embrace multitasking, ... |
| Selective KV-Cache Sharing to Mitigate Timing Side-Channels in LLM Inference | Kexin Chu, Zecheng Lin, Dawei Xiang, Zixu Shen, Jianchang Su, Cheng Chu, Yiwei Yang, Wenhui Zhang, Wenfei Wu, Wei Zhang | 2025-08-11 | Download | Global KV-cache sharing is an effective optimization for accelerating large language model (LLM) inference, yet it introduces an API-visible timing side channel that lets adversaries infer sensitive u... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| JSPIM: A Skew-Aware PIM Accelerator for High-Performance Databases Join and Select Operations | Sabiha Tajdari, Anastasia Ailamaki, Sandhya Dwarkadas | 2025-08-11 | Download | Database applications are increasingly bottlenecked by memory bandwidth and latency due to the memory wall and the limited scalability of DRAM. |
| Vector-Centric Machine Learning Systems: A Cross-Stack Approach | Wenqi Jiang | 2025-08-11 | Download | Today, two major trends are shaping the evolution of ML systems. First, modern AI systems are becoming increasingly complex, often integrating components beyond the model itself. |
| Profiling Concurrent Vision Inference Workloads on NVIDIA Jetson -- Extended | Abhinaba Chakraborty, Wouter Tavernier, Akis Kourtis, Mario Pickavet, Andreas Oikonomakis, Didier Colle | 2025-08-11 | Download | The proliferation of IoT devices and advancements in network technologies have intensified the demand for real-time data processing at the network edge. |
| A Data-driven ML Approach for Maximizing Performance in LLM-Adapter Serving | Ferran Agullo, Joan Oliveras, Chen Wang, Alberto Gutierrez-Torre, Olivier Tardieu, Alaa Youssef, Jordi Torres, Josep Ll. Berral | 2025-08-11 | Download | With the rapid adoption of Large Language Models (LLMs), LLM-adapters have become increasingly common, providing lightweight specialization of large-scale models. |
| Taming Cold Starts: Proactive Serverless Scheduling with Model Predictive Control | Chanh Nguyen, Monowar Bhuyan, Erik Elmroth | 2025-08-11 | Download | Serverless computing has transformed cloud application deployment by introducing a fine-grained, event-driven execution model that abstracts away infrastructure management. |