2025-10-15

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
ArtNet: Hierarchical Clustering-Based Artificial Netlist Generator for ML and DTCO Application	Andrew B. Kahng, Seokhyeong Kang, Seonghyeon Park, Dooseok Yoon	2025-10-15	Download	In advanced nodes, optimization of power, performance and area (PPA) has become highly complex and challenging. Machine learning (ML) and design-technology co-optimization (DTCO) provide promising mit...
F-BFQ: Flexible Block Floating-Point Quantization Accelerator for LLMs	Jude Haris, José Cano	2025-10-15	Download	Large Language Models (LLMs) have become increasingly prominent for daily tasks, from improving sound-to-text translation to generating additional frames for the latest video games.
Energy-Efficient FPGA Framework for Non-Quantized Convolutional Neural Networks	Angelos Athanasiadis, Nikolaos Tampouratzis, Ioannis Papaefstathiou	2025-10-15	Download	The growing demand for real-time processing in artificial intelligence applications, particularly those involving Convolutional Neural Networks (CNNs), has highlighted the need for efficient computati...
D-com: Accelerating Iterative Processing to Enable Low-rank Decomposition of Activations	Faraz Tahmasebi, Michael Pelluer, Hyoukjun Kwon	2025-10-15	Download	The computation and memory costs of large language models kept increasing over last decade, which reached over the scale of 1T parameters. To address the challenges from the large scale models, model ...
ShuffleV: A Microarchitectural Defense Strategy against Electromagnetic Side-Channel Attacks in Microprocessors	Nuntipat Narkthong, Yukui Luo, Xiaolin Xu	2025-10-15	Download	The run-time electromagnetic (EM) emanation of microprocessors presents a side-channel that leaks the confidentiality of the applications running on them.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Privacy-Preserving and Incentive-Driven Relay-Based Framework for Cross-Domain Blockchain Interoperability	Saeed Moradi, Koosha Esmaeilzadeh Khorasani, Sara Rouhani	2025-10-15	Download	Interoperability is essential for transforming blockchains from isolated networks into collaborative ecosystems, unlocking their full potential.
Distributed-Memory Parallel Algorithms for Fixed-Radius Near Neighbor Graph Construction	Gabriel Raulet, Dmitriy Morozov, Aydin Buluc, Katherine Yelick	2025-10-15	Download	Computing fixed-radius near-neighbor graphs is an important first step for many data analysis algorithms. Near-neighbor graphs connect points that are close under some metric, endowing point clouds wi...
Cortex: Workflow-Aware Resource Pooling and Scheduling for Agentic Serving	Nikos Pagonas, Yeounoh Chung, Kostis Kaffes, Arvind Krishnamurthy	2025-10-15	Download	We introduce Cortex, a prototype workflow-aware serving platform designed for agentic workloads. The core principle of Cortex is stage isolation: it provisions dedicated resource pools for each distin...
FedHFT: Efficient Federated Finetuning with Heterogeneous Edge Clients	Fatih Ilhan, Selim Furkan Tekin, Tiansheng Huang, Gaowen Liu, Ramana Kompella, Greg Eisenhauer, Yingyan Celine Lin, Calton Pu, Ling Liu	2025-10-15	Download	Fine-tuning pre-trained large language models (LLMs) has become a common practice for personalized natural language understanding (NLU) applications on downstream tasks and domain-specific datasets.
Anonymized Network Sensing using C++26 std::execution on GPUs	Michael Mandulak, Sayan Ghosh, S M Ferdous, Mahantesh Halappanavar, George Slota	2025-10-15	Download	Large-scale network sensing plays a vital role in network traffic analysis and characterization. As network packet data grows increasingly large, parallel methods have become mainstream for network an...
Efficiently Executing High-throughput Lightweight LLM Inference Applications on Heterogeneous Opportunistic GPU Clusters with Pervasive Context Management	Thanh Son Phung, Douglas Thain	2025-10-15	Download	The rise of Generative AI introduces a new class of HPC workloads that integrates lightweight LLMs with traditional high-throughput applications to accelerate scientific discovery.
On-Chain Decentralized Learning and Cost-Effective Inference for DeFi Attack Mitigation	Abdulrahman Alhaidari, Balaji Palanisamy, Prashant Krishnamurthy	2025-10-15	Download	Billions of dollars are lost every year in DeFi platforms by transactions exploiting business logic or accounting vulnerabilities. Existing defenses focus on static code analysis, public mempool scree...
Tight Conditions for Binary-Output Tasks under Crashes	Timothé Albouy, Antonio Fernández Anta, Chryssis Georgiou, Nicolas Nicolaou, Junlang Wang	2025-10-15	Download	This paper explores necessary and sufficient system conditions to solve distributed tasks with binary outputs (i.e., tasks with output values in $\{0,1\}$).
FIRST: Federated Inference Resource Scheduling Toolkit for Scientific AI Model Access	Aditya Tanikanti, Benoit Côté, Yanfei Guo, Le Chen, Nickolaus Saint, Ryan Chard, Ken Raffenetti, Rajeev Thakur, Thomas Uram, Ian Foster, Michael E. Papka, Venkatram Vishwanath	2025-10-15	Download	We present the Federated Inference Resource Scheduling Toolkit (FIRST), a framework enabling Inference-as-a-Service across distributed High-Performance Computing (HPC) clusters.
Adaptive Rescheduling in Prefill-Decode Disaggregated LLM Inference	Zhibin Wang, Zetao Hong, Xue Li, Zibo Wang, Shipeng Li, Qingkai Meng, Qing Wang, Chengying Huan, Rong Gu, Sheng Zhong, Chen Tian	2025-10-15	Download	Large Language Model (LLM) inference has emerged as a fundamental paradigm. In real-world scenarios, variations in output length cause severe workload imbalance in the decode phase, particularly for l...
Service-Level Energy Modeling and Experimentation for Cloud-Native Microservices	Julian Legler, Sebastian Werner, Maria C. Borges, Stefan Tai	2025-10-15	Download	Microservice architectures have become the dominant paradigm for cloud-native systems, offering flexibility and scalability. However, this shift has also led to increased demand for cloud resources, c...
Verification Challenges in Sparse Matrix Vector Multiplication in High Performance Computing: Part I	Junchao Zhang	2025-10-15	Download	Sparse matrix vector multiplication (SpMV) is a fundamental kernel in scientific codes that rely on iterative solvers. In this first part of our work, we present both a sequential and a basic MPI para...
VSS Challenge Problem: Verifying the Correctness of AllReduce Algorithms in the MPICH Implementation of MPI	Paul D. Hovland	2025-10-15	Download	We describe a challenge problem for verification based on the MPICH implementation of MPI. The MPICH implementation includes several algorithms for allreduce, all of which should be functionally equiv...
F-BFQ: Flexible Block Floating-Point Quantization Accelerator for LLMs	Jude Haris, José Cano	2025-10-15	Download	Large Language Models (LLMs) have become increasingly prominent for daily tasks, from improving sound-to-text translation to generating additional frames for the latest video games.
ACE-GNN: Adaptive GNN Co-Inference with System-Aware Scheduling in Dynamic Edge Environments	Ao Zhou, Jianlei Yang, Tong Qiao, Yingjie Qi, Xinming Wei, Cenlin Duan, Weisheng Zhao, Chunming Hu	2025-10-15	Download	The device-edge co-inference paradigm effectively bridges the gap between the high resource demands of Graph Neural Networks (GNNs) and limited device resources, making it a promising solution for adv...
Distributed Reductions for the Maximum Weight Independent Set Problem	Jannick Borowitz, Ernestine Großmann, Matthias Schimek	2025-10-15	Download	Finding maximum-weight independent sets in graphs is an important NP-hard optimization problem. Given a vertex-weighted graph $G$, the task is to find a subset of pairwise non-adjacent vertices of $G$ ...
BanaServe: Unified KV Cache and Dynamic Module Migration for Balancing Disaggregated LLM Serving in AI Infrastructure	Yiyuan He, Minxian Xu, Jingfeng Wu, Jianmin Hu, Chong Ma, Min Shen, Le Chen, Chengzhong Xu, Lin Qu, Kejiang Ye	2025-10-15	Download	Large language models (LLMs) are increasingly deployed in AI infrastructure, driving the need for high throughput, resource efficient serving systems.
Scrutiny new framework in integrated distributed reliable systems	Mehdi Zekriyapanah Gashti	2025-10-15	Download	In this paper we represent a new framework for integrated distributed systems. In the proposed framework we have used three parts to increase Satisfaction and Performance of this framework.
Towards Affordable, Adaptive and Automatic GNN Training on CPU-GPU Heterogeneous Platforms	Tong Qiao, Ao Zhou, Yingjie Qi, Yiou Wang, Han Wan, Jianlei Yang, Chunming Hu	2025-10-15	Download	Graph Neural Networks (GNNs) have been widely adopted due to their strong performance. However, GNN training often relies on expensive, high-performance computing platforms, limiting accessibility for...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
DiffLoc: Diffusion Model-Based High-Precision Positioning for 6G Networks	Taekyun Lee, Tommaso Balercia, Heasung Kim, Hyeji Kim, Jeffrey G. Andrews	2025-10-15	Download	This paper introduces a novel framework for high-accuracy outdoor user equipment (UE) positioning that applies a conditional generative diffusion model directly to high-dimensional massive MIMO channe...
Pilot Assignment for Distributed Massive MIMO Based on Channel Estimation Error Minimization	Mohd Saif Ali Khan, Karthik RM, Samar Agnihotri	2025-10-15	Download	Pilot contamination remains a major bottleneck in realizing the full potential of distributed massive MIMO systems. We propose two dynamic and scalable pilot assignment schemes designed for practical ...
Investigating Web Content Delivery Performance over Starlink	Rohan Bose, Jinwei Zhao, Tanya Shreedhar, Jianping Pan, Nitinder Mohan	2025-10-15	Download	Low Earth Orbit (LEO) satellite ISPs promise universal Internet connectivity, yet their interaction with content delivery remains poorly understood.
Optimize Replica Server Placement in a Satellite Network	Zhiyuan He, Yi Xu, Cheng Luo, Lili Qiu, Yuqing Yang	2025-10-15	Download	Satellite communication offers Internet connectivity to remote locations, such as villages, deserts, mountains, and at sea. However, transmitting content over satellite networks is significantly more ...
Beyond Lamport, Towards Probabilistic Fair Ordering	Muhammad Haseeb, Jinkun Geng, Radhika Mittal, Aurojit Panda, Srinivas Narayana, Anirudh Sivaraman	2025-10-15	Download	A growing class of applications demands fair ordering of events, which ensures that events generated earlier are processed before later events.
An LLM-Powered AI Agent Framework for Holistic IoT Traffic Interpretation	Daniel Adu Worae, Spyridon Mastorakis	2025-10-15	Download	Internet of Things (IoT) networks generate diverse and high-volume traffic that reflects both normal activity and potential threats. Deriving meaningful insight from such telemetry requires cross-laye...
NetMCP: Network-Aware Model Context Protocol Platform for LLM Capability Extension	Enhan Li, Hongyang Du, Kaibin Huang	2025-10-15	Download	Large Language Models (LLMs) remain static in functionality after training, and extending their capabilities requires integration with external data, computation, and services.
Mobile Coverage Analysis using Crowdsourced Data	Timothy Wong, Tom Freeman, Joseph Feehily	2025-10-15	Download	Effective assessment of mobile network coverage and the precise identification of service weak spots are paramount for network operators striving to enhance user Quality of Experience (QoE).
Optimizing Storage Overhead of User Behavior Log for ML-embedded Mobile Apps	Chen Gong, Yan Zhuang, Zhenzhe Zheng, Yiliu Chen, Sheng Wang, Fan Wu, Guihai Chen	2025-10-15	Download	Machine learning (ML) models are increasingly integrated into modern mobile apps to enable personalized and intelligent services. These models typically rely on rich input features derived from histor...
Towards Trusted Service Monitoring: Verifiable Service Level Agreements	Fernando Castillo, Eduardo Brito, Sebastian Werner, Pille Pullonen-Raudvere, Jonathan Heiss	2025-10-15	Download	Service Level Agreement (SLA) monitoring in service-oriented environments suffers from inherent trust conflicts when providers self-report metrics, creating incentives to underreport violations.
Automated Network Protocol Testing with LLM Agents	Yunze Wei, Kaiwen Wei, Shibo Du, Jianyu Wang, Zhangzhong Liu, Yawen Wang, Zhanyou Li, Congcong Miao, Xiaohui Xie, Yong Cui	2025-10-15	Download	Network protocol testing is fundamental for modern network infrastructure. However, traditional network protocol testing methods are labor-intensive and error-prone, requiring manual interpretation of...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Accelerated Feature Detectors for Visual SLAM: A Comparative Study of FPGA vs GPU	Ruiqi Ye, Mikel Luján	2025-10-15	Download	Feature detection is a common yet time-consuming module in Simultaneous Localization and Mapping (SLAM) implementations, which are increasingly deployed on power-constrained platforms, such as drones.
D-com: Accelerating Iterative Processing to Enable Low-rank Decomposition of Activations	Faraz Tahmasebi, Michael Pelluer, Hyoukjun Kwon	2025-10-15	Download	The computation and memory costs of large language models kept increasing over last decade, which reached over the scale of 1T parameters. To address the challenges from the large scale models, model ...