2025-06-06

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| InstantFT: An FPGA-Based Runtime Subsecond Fine-tuning of CNN Models | Keisuke Sugiura, Hiroki Matsutani | 2025-06-06 | Download | Training deep neural networks (DNNs) requires significantly more computation and memory than inference, making runtime adaptation of DNNs challenging on resource-limited IoT platforms. |
| RETENTION: Resource-Efficient Tree-Based Ensemble Model Acceleration with Content-Addressable Memory | Yi-Chun Liao, Chieh-Lin Tsai, Yuan-Hao Chang, Camélia Slimani, Jalil Boukhobza, Tei-Wei Kuo | 2025-06-06 | Download | Although deep learning has demonstrated remarkable capability in learning from unstructured data, modern tree-based ensemble models remain superior in extracting relevant information and learning from... |
| Lumina: Real-Time Mobile Neural Rendering by Exploiting Computational Redundancy | Yu Feng, Weikai Lin, Yuge Cheng, Zihan Liu, Jingwen Leng, Minyi Guo, Chen Chen, Shixuan Sun, Yuhao Zhu | 2025-06-06 | Download | 3D Gaussian Splatting (3DGS) has vastly advanced the pace of neural rendering, but it remains computationally demanding on today's mobile SoCs. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Robust predicate and function computation in continuous chemical reaction networks | Kim Calabrese, David Doty, Mina Latifi | 2025-06-06 | Download | We initiate the study of rate-constant-independent computation of Boolean predicates and numerical functions in the continuous model of chemical reaction networks (CRNs), which model the amount of a c... |
| Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques | Adarsh Prasad Behera, Jaya Prakash Champati, Roberto Morabito, Sasu Tarkoma, James Gross | 2025-06-06 | Download | Recent progress in Language Models (LMs) has dramatically advanced the field of natural language processing (NLP), excelling at tasks like text generation, summarization, and question answering. |
| Cost-Efficient LLM Training with Lifetime-Aware Tensor Offloading via GPUDirect Storage | Ziqi Yuan, Haoyang Zhang, Yirui Eric Zhou, Apoorve Mohan, I-Hsin Chung, Seetharami Seelam, Jian Huang | 2025-06-06 | Download | We present the design and implementation of a new lifetime-aware tensor offloading framework for GPU memory expansion using low-cost PCIe-based solid-state drives (SSDs). |
| Performance Impact of Containerized METADOCK 2 on Heterogeneous Platforms | Antonio Jesús Banegas-Luna, Baldomero Imbernón Tudela, Carlos Martínez-Cortés, José María Cecilia, Horacio Pérez-Sánchez | 2025-06-06 | Download | Virtual screening (VS) is a computationally intensive process crucial for drug discovery, often requiring significant resources to analyze large chemical libraries and predict ligand-protein interacti... |
| Generating representative macrobenchmark microservice systems from distributed traces with Palette | Vaastav Anand, Matheus Stolet, Jonathan Mace, Antoine Kaufmann | 2025-06-06 | Download | Microservices are the dominant design for developing cloud systems today. Advancements for microservice need to be evaluated in representative systems, e.g. |
| An Active Learning-Based Streaming Pipeline for Reduced Data Training of Structure Finding Models in Neutron Diffractometry | Tianle Wang, Jorge Ramirez, Cristina Garcia-Cardona, Thomas Proffen, Shantenu Jha, Sudip K. Seal | 2025-06-06 | Download | Structure determination workloads in neutron diffractometry are computationally expensive and routinely require several hours to many days to determine the structure of a material from its neutron dif... |
| Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library | Weixun Wang, Shaopan Xiong, Gengru Chen, Wei Gao, Sheng Guo, Yancheng He, Ju Huang, Jiaheng Liu, Zhendong Li, Xiaoyang Li, Zichen Liu, Haizhou Zhao, Dakai An, Lunxi Cao, Qiyang Cao, Wanxi Deng, Feilei Du, Yiliang Gu, Jiahe Li, Xiang Li, Mingjie Liu, Yijia Luo, Zihe Liu, Yadao Wang, Pei Wang, Tianyuan Wu, Yanan Wu, Yuheng Zhao, Shuaibing Zhao, Jin Yang, Siran Yang, Yingshui Tan, Huimin Yi, Yuchi Xu, Yujin Yuan, Xingyao Zhang, Lin Qu, Wenbo Su, Wei Wang, Jiamang Wang, Bo Zheng | 2025-06-06 | Download | We introduce ROLL, an efficient, scalable, and user-friendly library designed for Reinforcement Learning Optimization for Large-scale Learning. |
| Perfect Matching with Few Link Activations | Hugo Mirault, Peter Robinson, Ming Ming Tan, Xianbin Zhu | 2025-06-06 | Download | We consider the problem of computing a perfect matching problem in a synchronous distributed network, where the network topology corresponds to a complete bipartite graph. |
| Mitigating Catastrophic Forgetting with Adaptive Transformer Block Expansion in Federated Fine-Tuning | Yujia Huo, Jianchun Liu, Hongli Xu, Zhenguo Ma, Shilong Wang, Liusheng Huang | 2025-06-06 | Download | Federated fine-tuning (FedFT) of large language models (LLMs) has emerged as a promising solution for adapting models to distributed data environments while ensuring data privacy. |
| BestServe: Serving Strategies with Optimal Goodput in Collocation and Disaggregation Architectures | Xiannan Hu, Tianyou Zeng, Xiaoming Yuan, Liwei Song, Guangyuan Zhang, Bangzheng He | 2025-06-06 | Download | Serving large language models (LLMs) to millions of users requires efficient resource allocation and parallelism strategies. It is a labor intensive trial-and-error process to find such a strategy. |
| Malicious node aware wireless multi hop networks: a systematic review of the literature and recommendations for future research | Shahram Pourdehghan, Nahideh Derakhshanfard | 2025-06-06 | Download | Wireless communication provides great advantages that are not available through their wired counterparts such as flexibility, ease of deployment and use, cost reductions, and convenience. |
| Resilient Auto-Scaling of Microservice Architectures with Efficient Resource Management | Hussain Ahmad, Christoph Treude, Markus Wagner, Claudia Szabo | 2025-06-06 | Download | Horizontal Pod Auto-scalers (HPAs) are crucial for managing resource allocation in microservice architectures to handle fluctuating workloads. |
| EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model | Alyssa Pinnock, Shakya Jayakody, Kawsher A Roxy, Md Rubel Ahmed | 2025-06-06 | Download | This paper introduces EdgeProfiler, a fast profiling framework designed for evaluating lightweight Large Language Models (LLMs) on edge systems. |
| FedShield-LLM: A Secure and Scalable Federated Fine-Tuned Large Language Model | Md Jueal Mia, M. Hadi Amini | 2025-06-06 | Download | Federated Learning (FL) offers a decentralized framework for training and fine-tuning Large Language Models (LLMs) by leveraging computational resources across organizations while keeping sensitive da... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Hierarchical and Collaborative LLM-Based Control for Multi-UAV Motion and Communication in Integrated Terrestrial and Non-Terrestrial Networks | Zijiang Yan, Hao Zhou, Jianhua Pei, Hina Tabassum | 2025-06-06 | Download | Unmanned aerial vehicles (UAVs) have been widely adopted in various real-world applications. However, the control and optimization of multi-UAV systems remain a significant challenge, particularly in ... |
| Edge-Enabled Collaborative Object Detection for Real-Time Multi-Vehicle Perception | Everett Richards, Bipul Thapa, Lena Mashayekhy | 2025-06-06 | Download | Accurate and reliable object detection is critical for ensuring the safety and efficiency of Connected Autonomous Vehicles (CAVs). Traditional on-board perception systems have limited accuracy due to ... |
| Steps towards an Ecology for the Internet | Anil Madhavapeddy, Sam Reynolds, Alec P. Christie, David A. Coomes, Michael W. Dales, Patrick Ferris, Ryan Gibb, Hamed Haddadi, Sadiq Jaffer, Josh Millar, Cyrus Omar, William J. Sutherland, Jon Crowcroft | 2025-06-06 | Download | The Internet has grown from a humble set of protocols for end-to-end connectivity into a critical global system with no builtin "immune system". |
| On the Suitability of Wi-Fi for Interconnecting Moving Equipment in Industrial Environments | Pietro Chiavassa, Stefano Scanzio, Gianluca Cena | 2025-06-06 | Download | To ensure an unprecedented degree of flexibility, next-generation Industry 4.0/5.0 production plants increasingly rely on mobile devices, e.g., autonomous mobile robots and wearables. |
| Nonlinear symbols combining for Power Amplifier-distorted OFDM signal reception | Pawel Kryszkiewicz, Hanna Bogucka | 2025-06-06 | Download | Nonlinear distortion of a multicarrier signal by a transmitter Power Amplifier (PA) can be a serious problem when designing new highly energy-efficient wireless systems. |
| Pegasus: A Universal Framework for Scalable Deep Learning Inference on the Dataplane | Yinchao Zhang, Su Yao, Yong Feng, Kang Chen, Tong Li, Zhuotao Liu, Yi Zhao, Lexuan Zhang, Xiangyu Gao, Feng Xiong, Qi Li, Ke Xu | 2025-06-06 | Download | The paradigm of Intelligent DataPlane (IDP) embeds deep learning (DL) models on the network dataplane to enable intelligent traffic analysis at line-speed. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Efficient Memory Tiering in a Virtual Machine | Chandra Prakash, Aravinda Prasad, Sandeep Kumar, Sreenivas Subramoney | 2025-06-06 | Download | Memory tiering is the norm to effectively tackle the increasing server memory total cost of ownership (TCO) and the growing data demands of modern data center workloads. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Cost-Efficient LLM Training with Lifetime-Aware Tensor Offloading via GPUDirect Storage | Ziqi Yuan, Haoyang Zhang, Yirui Eric Zhou, Apoorve Mohan, I-Hsin Chung, Seetharami Seelam, Jian Huang | 2025-06-06 | Download | We present the design and implementation of a new lifetime-aware tensor offloading framework for GPU memory expansion using low-cost PCIe-based solid-state drives (SSDs). |
| Performance Impact of Containerized METADOCK 2 on Heterogeneous Platforms | Antonio Jesús Banegas-Luna, Baldomero Imbernón Tudela, Carlos Martínez-Cortés, José María Cecilia, Horacio Pérez-Sánchez | 2025-06-06 | Download | Virtual screening (VS) is a computationally intensive process crucial for drug discovery, often requiring significant resources to analyze large chemical libraries and predict ligand-protein interacti... |
| BestServe: Serving Strategies with Optimal Goodput in Collocation and Disaggregation Architectures | Xiannan Hu, Tianyou Zeng, Xiaoming Yuan, Liwei Song, Guangyuan Zhang, Bangzheng He | 2025-06-06 | Download | Serving large language models (LLMs) to millions of users requires efficient resource allocation and parallelism strategies. It is a labor intensive trial-and-error process to find such a strategy. |
| EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model | Alyssa Pinnock, Shakya Jayakody, Kawsher A Roxy, Md Rubel Ahmed | 2025-06-06 | Download | This paper introduces EdgeProfiler, a fast profiling framework designed for evaluating lightweight Large Language Models (LLMs) on edge systems. |
