2025-06-06

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
InstantFT: An FPGA-Based Runtime Subsecond Fine-tuning of CNN Models	Keisuke Sugiura, Hiroki Matsutani	2025-06-06	Download	Training deep neural networks (DNNs) requires significantly more computation and memory than inference, making runtime adaptation of DNNs challenging on resource-limited IoT platforms.
RETENTION: Resource-Efficient Tree-Based Ensemble Model Acceleration with Content-Addressable Memory	Yi-Chun Liao, Chieh-Lin Tsai, Yuan-Hao Chang, Camélia Slimani, Jalil Boukhobza, Tei-Wei Kuo	2025-06-06	Download	Although deep learning has demonstrated remarkable capability in learning from unstructured data, modern tree-based ensemble models remain superior in extracting relevant information and learning from...
Lumina: Real-Time Mobile Neural Rendering by Exploiting Computational Redundancy	Yu Feng, Weikai Lin, Yuge Cheng, Zihan Liu, Jingwen Leng, Minyi Guo, Chen Chen, Shixuan Sun, Yuhao Zhu	2025-06-06	Download	3D Gaussian Splatting (3DGS) has vastly advanced the pace of neural rendering, but it remains computationally demanding on today's mobile SoCs.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Robust predicate and function computation in continuous chemical reaction networks	Kim Calabrese, David Doty, Mina Latifi	2025-06-06	Download	We initiate the study of rate-constant-independent computation of Boolean predicates and numerical functions in the continuous model of chemical reaction networks (CRNs), which model the amount of a c...
Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques	Adarsh Prasad Behera, Jaya Prakash Champati, Roberto Morabito, Sasu Tarkoma, James Gross	2025-06-06	Download	Recent progress in Language Models (LMs) has dramatically advanced the field of natural language processing (NLP), excelling at tasks like text generation, summarization, and question answering.
Cost-Efficient LLM Training with Lifetime-Aware Tensor Offloading via GPUDirect Storage	Ziqi Yuan, Haoyang Zhang, Yirui Eric Zhou, Apoorve Mohan, I-Hsin Chung, Seetharami Seelam, Jian Huang	2025-06-06	Download	We present the design and implementation of a new lifetime-aware tensor offloading framework for GPU memory expansion using low-cost PCIe-based solid-state drives (SSDs).
Performance Impact of Containerized METADOCK 2 on Heterogeneous Platforms	Antonio Jesús Banegas-Luna, Baldomero Imbernón Tudela, Carlos Martínez-Cortés, José María Cecilia, Horacio Pérez-Sánchez	2025-06-06	Download	Virtual screening (VS) is a computationally intensive process crucial for drug discovery, often requiring significant resources to analyze large chemical libraries and predict ligand-protein interacti...
Generating representative macrobenchmark microservice systems from distributed traces with Palette	Vaastav Anand, Matheus Stolet, Jonathan Mace, Antoine Kaufmann	2025-06-06	Download	Microservices are the dominant design for developing cloud systems today. Advancements for microservice need to be evaluated in representative systems, e.g.
An Active Learning-Based Streaming Pipeline for Reduced Data Training of Structure Finding Models in Neutron Diffractometry	Tianle Wang, Jorge Ramirez, Cristina Garcia-Cardona, Thomas Proffen, Shantenu Jha, Sudip K. Seal	2025-06-06	Download	Structure determination workloads in neutron diffractometry are computationally expensive and routinely require several hours to many days to determine the structure of a material from its neutron dif...
Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library	Weixun Wang, Shaopan Xiong, Gengru Chen, Wei Gao, Sheng Guo, Yancheng He, Ju Huang, Jiaheng Liu, Zhendong Li, Xiaoyang Li, Zichen Liu, Haizhou Zhao, Dakai An, Lunxi Cao, Qiyang Cao, Wanxi Deng, Feilei Du, Yiliang Gu, Jiahe Li, Xiang Li, Mingjie Liu, Yijia Luo, Zihe Liu, Yadao Wang, Pei Wang, Tianyuan Wu, Yanan Wu, Yuheng Zhao, Shuaibing Zhao, Jin Yang, Siran Yang, Yingshui Tan, Huimin Yi, Yuchi Xu, Yujin Yuan, Xingyao Zhang, Lin Qu, Wenbo Su, Wei Wang, Jiamang Wang, Bo Zheng	2025-06-06	Download	We introduce ROLL, an efficient, scalable, and user-friendly library designed for Reinforcement Learning Optimization for Large-scale Learning.
Perfect Matching with Few Link Activations	Hugo Mirault, Peter Robinson, Ming Ming Tan, Xianbin Zhu	2025-06-06	Download	We consider the problem of computing a perfect matching in a synchronous distributed network, where the network topology corresponds to a complete bipartite graph.
Mitigating Catastrophic Forgetting with Adaptive Transformer Block Expansion in Federated Fine-Tuning	Yujia Huo, Jianchun Liu, Hongli Xu, Zhenguo Ma, Shilong Wang, Liusheng Huang	2025-06-06	Download	Federated fine-tuning (FedFT) of large language models (LLMs) has emerged as a promising solution for adapting models to distributed data environments while ensuring data privacy.
BestServe: Serving Strategies with Optimal Goodput in Collocation and Disaggregation Architectures	Xiannan Hu, Tianyou Zeng, Xiaoming Yuan, Liwei Song, Guangyuan Zhang, Bangzheng He	2025-06-06	Download	Serving large language models (LLMs) to millions of users requires efficient resource allocation and parallelism strategies. It is a labor-intensive trial-and-error process to find such a strategy.
Malicious node aware wireless multi hop networks: a systematic review of the literature and recommendations for future research	Shahram Pourdehghan, Nahideh Derakhshanfard	2025-06-06	Download	Wireless communication provides great advantages that are not available through its wired counterparts such as flexibility, ease of deployment and use, cost reductions, and convenience.
Resilient Auto-Scaling of Microservice Architectures with Efficient Resource Management	Hussain Ahmad, Christoph Treude, Markus Wagner, Claudia Szabo	2025-06-06	Download	Horizontal Pod Auto-scalers (HPAs) are crucial for managing resource allocation in microservice architectures to handle fluctuating workloads.
EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model	Alyssa Pinnock, Shakya Jayakody, Kawsher A Roxy, Md Rubel Ahmed	2025-06-06	Download	This paper introduces EdgeProfiler, a fast profiling framework designed for evaluating lightweight Large Language Models (LLMs) on edge systems.
FedShield-LLM: A Secure and Scalable Federated Fine-Tuned Large Language Model	Md Jueal Mia, M. Hadi Amini	2025-06-06	Download	Federated Learning (FL) offers a decentralized framework for training and fine-tuning Large Language Models (LLMs) by leveraging computational resources across organizations while keeping sensitive da...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Hierarchical and Collaborative LLM-Based Control for Multi-UAV Motion and Communication in Integrated Terrestrial and Non-Terrestrial Networks	Zijiang Yan, Hao Zhou, Jianhua Pei, Hina Tabassum	2025-06-06	Download	Unmanned aerial vehicles (UAVs) have been widely adopted in various real-world applications. However, the control and optimization of multi-UAV systems remain a significant challenge, particularly in ...
Edge-Enabled Collaborative Object Detection for Real-Time Multi-Vehicle Perception	Everett Richards, Bipul Thapa, Lena Mashayekhy	2025-06-06	Download	Accurate and reliable object detection is critical for ensuring the safety and efficiency of Connected Autonomous Vehicles (CAVs). Traditional on-board perception systems have limited accuracy due to ...
Steps towards an Ecology for the Internet	Anil Madhavapeddy, Sam Reynolds, Alec P. Christie, David A. Coomes, Michael W. Dales, Patrick Ferris, Ryan Gibb, Hamed Haddadi, Sadiq Jaffer, Josh Millar, Cyrus Omar, William J. Sutherland, Jon Crowcroft	2025-06-06	Download	The Internet has grown from a humble set of protocols for end-to-end connectivity into a critical global system with no built-in "immune system".
On the Suitability of Wi-Fi for Interconnecting Moving Equipment in Industrial Environments	Pietro Chiavassa, Stefano Scanzio, Gianluca Cena	2025-06-06	Download	To ensure an unprecedented degree of flexibility, next-generation Industry 4.0/5.0 production plants increasingly rely on mobile devices, e.g., autonomous mobile robots and wearables.
Nonlinear symbols combining for Power Amplifier-distorted OFDM signal reception	Pawel Kryszkiewicz, Hanna Bogucka	2025-06-06	Download	Nonlinear distortion of a multicarrier signal by a transmitter Power Amplifier (PA) can be a serious problem when designing new highly energy-efficient wireless systems.
Pegasus: A Universal Framework for Scalable Deep Learning Inference on the Dataplane	Yinchao Zhang, Su Yao, Yong Feng, Kang Chen, Tong Li, Zhuotao Liu, Yi Zhao, Lexuan Zhang, Xiangyu Gao, Feng Xiong, Qi Li, Ke Xu	2025-06-06	Download	The paradigm of Intelligent DataPlane (IDP) embeds deep learning (DL) models on the network dataplane to enable intelligent traffic analysis at line-speed.

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Efficient Memory Tiering in a Virtual Machine	Chandra Prakash, Aravinda Prasad, Sandeep Kumar, Sreenivas Subramoney	2025-06-06	Download	Memory tiering is the norm to effectively tackle the increasing server memory total cost of ownership (TCO) and the growing data demands of modern data center workloads.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Cost-Efficient LLM Training with Lifetime-Aware Tensor Offloading via GPUDirect Storage	Ziqi Yuan, Haoyang Zhang, Yirui Eric Zhou, Apoorve Mohan, I-Hsin Chung, Seetharami Seelam, Jian Huang	2025-06-06	Download	We present the design and implementation of a new lifetime-aware tensor offloading framework for GPU memory expansion using low-cost PCIe-based solid-state drives (SSDs).
Performance Impact of Containerized METADOCK 2 on Heterogeneous Platforms	Antonio Jesús Banegas-Luna, Baldomero Imbernón Tudela, Carlos Martínez-Cortés, José María Cecilia, Horacio Pérez-Sánchez	2025-06-06	Download	Virtual screening (VS) is a computationally intensive process crucial for drug discovery, often requiring significant resources to analyze large chemical libraries and predict ligand-protein interacti...
BestServe: Serving Strategies with Optimal Goodput in Collocation and Disaggregation Architectures	Xiannan Hu, Tianyou Zeng, Xiaoming Yuan, Liwei Song, Guangyuan Zhang, Bangzheng He	2025-06-06	Download	Serving large language models (LLMs) to millions of users requires efficient resource allocation and parallelism strategies. It is a labor-intensive trial-and-error process to find such a strategy.
EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model	Alyssa Pinnock, Shakya Jayakody, Kawsher A Roxy, Md Rubel Ahmed	2025-06-06	Download	This paper introduces EdgeProfiler, a fast profiling framework designed for evaluating lightweight Large Language Models (LLMs) on edge systems.