2025-10-31

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Scalable Processing-Near-Memory for 1M-Token LLM Inference: CXL-Enabled KV-Cache Management Beyond GPU Limits	Dowon Kim, MinJae Lee, Janghyeon Kim, HyuckSung Kwon, Hyeonggyu Jeong, Sang-Soo Park, Minyong Yoon, Si-Dong Roh, Yongsuk Kwon, Jinin So, Jungwook Choi	2025-10-31	Download	The expansion of context windows in large language models (LLMs) to multi-million tokens introduces severe memory and compute bottlenecks, particularly in managing the growing Key-Value (KV) cache.
PEARL: Power- and Energy-Aware Multicore Intermittent Computing	Khakim Akhunov, Eren Yildiz, Kasim Sinan Yildirim	2025-10-31	Download	Low-power multicore platforms are suitable for running data-intensive tasks in parallel, but they are highly inefficient for computing on intermittent power.
H-FA: A Hybrid Floating-Point and Logarithmic Approach to Hardware Accelerated FlashAttention	Kosmas Alexandridis, Giorgos Dimitrakopoulos	2025-10-31	Download	Transformers have significantly advanced AI and machine learning through their powerful attention mechanism. However, computing attention on long sequences can become a computational bottleneck.
A Memory-Efficient Retrieval Architecture for RAG-Enabled Wearable Medical LLMs-Agents	Zhipeng Liao, Kunming Shao, Jiangnan Yu, Liang Zhao, Tim Kwang-Ting Cheng, Chi-Ying Tsui, Jie Yang, Mohamad Sawan	2025-10-31	Download	With powerful and integrative large language models (LLMs), medical AI agents have demonstrated unique advantages in providing personalized medical consultations, continuous health monitoring, and pre...
Descriptor-Based Object-Aware Memory Systems: A Comprehensive Review	Dong Tong	2025-10-31	Download	The security and efficiency of modern computing systems are fundamentally undermined by the absence of a native architectural mechanism to propagate high-level program semantics, such as object identi...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Tetris: An SLA-aware Application Placement Strategy in the Edge-Cloud Continuum	Lucas Almeida, Maycon Peixoto	2025-10-31	Download	An Edge-Cloud Continuum integrates edge and cloud resources to provide a flexible and scalable infrastructure. This paradigm can minimize latency by processing data closer to the source at the edge wh...
LongCat-Flash-Omni Technical Report	Meituan LongCat Team, Bairui Wang, Bayan, Bin Xiao, Bo Zhang, Bolin Rong, Borun Chen, Chang Wan, Chao Zhang, Chen Huang, Chen Chen, Chen Chen, Chengxu Yang, Chengzuo Yang, Cong Han, Dandan Peng, Delian Ruan, Detai Xin, Disong Wang, Dongchao Yang, Fanfan Liu, Fengjiao Chen, Fengyu Yang, Gan Dong, Gang Huang, Gang Xu, Guanglu Wan, Guoqiang Tan, Guoqiao Yu, Haibo Qiu, Hao Lu, Hongbo Liu, Hongyu Xiang, Jiaheng Wu, Jian Yang, Jiaxing Liu, Jing Huang, Jingang Wang, Jinrui Ding, Juchao Jiang, Jun Kuang, Jun Wang, Junhui Mei, Ke Ding, Kefeng Zhang, Lei Chen, Liang Shi, Limeng Qiao, Liming Zheng, Lin Ma, Liuyang Guo, Liya Ma, Luying Sun, Man Gao, Mengshen Zhu, Miao Cao, Minliang Lin, Nuo Xu, Peng Shi, Qi Zhang, Qian Fang, Qian Wang, Qian Yang, Quanxiu Wang, Rongxiang Weng, Rongxin Guo, Ruoxuan Liang, Senbin Yang, Shanbo Xu, Shanglin Lei, Shengze Ye, Shimin Chen, Shuaiqi Chen, Shujie Hu, Shuo Li, Siqi Yang, Siyu Xu, Siyu Ren, Song Li, Songxiang Liu, Tianhao Bai, Tianye Dai, Wei Hong, Wei Wang, Weixiao Zhao, Wengang Cao, Wenlong Zhu, Wenlong He, Xi Su, Xi Nan, Xiaohan Zhao, Xiaohao Wang, Xiaoyu Zhao, Xiaoyu Wang, Xiaoyu Li, Xin Pan, Xin Chen, Xiusong Sun, Xu Xiang, Xudong Xing, Xuezhi Cao, Xunliang Cai, Yang Yang, Yanli Tan, Yao Yao, Yerui Sun, Yi Chen, Yifan Lu, Yin Gong, Yining Zhang, Yitian Chen, Yiyang Gan, Yuchen Tang, Yuchen Xie, Yueqian Wang, Yuewen Zheng, Yufei Zhang, Yufeng Zhong, Yulei Qian, Yuqi Peng, Yuqian Li, Yuwei Jiang, Zeyang Hu, Zheng Zhang, Zhengkun Tian, Zhiqing Hong, Zhixiong Zeng, Zhuqi Mi, Ziran Li, Ziwen Wang, Ziyi Zhao, Ziyuan Zhuang, Zizhe Zhao	2025-10-31	Download	We introduce LongCat-Flash-Omni, a state-of-the-art open-source omni-modal model with 560 billion parameters, excelling at real-time audio-visual interaction.
COOL Is Optimal in Error-Free Asynchronous Byzantine Agreement	Jinyuan Chen	2025-10-31	Download	COOL (Chen'21) is an error-free, information-theoretically secure Byzantine agreement (BA) protocol proven to achieve BA consensus in the synchronous setting for an $\ell$-bit message, with a total co...
Machine learning-based cloud resource allocation algorithms: a comprehensive comparative review	Deep Bodra, Sushil Khairnar	2025-10-31	Download	Cloud resource allocation has emerged as a major challenge in modern computing environments, with organizations struggling to manage complex, dynamic workloads while optimizing performance and cost ef...
Fix: externalizing network I/O in serverless computing	Yuhan Deng, Akshay Srivatsan, Sebastian Ingino, Francis Chua, Yasmine Mitchell, Matthew Vilaysack, Keith Winstein	2025-10-31	Download	We describe a system for serverless computing where users, programs, and the underlying platform share a common representation of a computation: a deterministic procedure, run in an environment ...
RDMA Point-to-Point Communication for LLM Systems	Nandor Licker, Kevin Hu, Vladimir Zaytsev, Lequn Chen	2025-10-31	Download	Emerging Large Language Model (LLM) system patterns, such as disaggregated inference, Mixture-of-Experts (MoE) routing, and asynchronous reinforcement fine-tuning, require flexible point-to-point comm...
ML-Based Optimum Sub-system Size Heuristic for the GPU Implementation of the Tridiagonal Partition Method	Milena Veneva	2025-10-31	Download	This paper presents a machine learning (ML)-based heuristic for finding the optimum sub-system size for the CUDA implementation of the parallel partition algorithm.
Dynamic Service Scheduling and Resource Management in Energy-Harvesting Multi-access Edge Computing	Shuyi Chen, Panagiotis Oikonomou, Zhengchang Hua, Nikos Tziritas, Karim Djemame, Nan Zhang, Georgios Theodoropoulos	2025-10-31	Download	Multi-access Edge Computing (MEC) delivers low-latency services by hosting applications near end-users. To promote sustainability, these systems are increasingly integrated with renewable Energy Harve...
A Digital Twin-based Multi-Agent Reinforcement Learning Framework for Vehicle-to-Grid Coordination	Zhengchang Hua, Panagiotis Oikonomou, Karim Djemame, Nikos Tziritas, Georgios Theodoropoulos	2025-10-31	Download	The coordination of large-scale, decentralised systems, such as a fleet of Electric Vehicles (EVs) in a Vehicle-to-Grid (V2G) network, presents a significant challenge for modern control systems.
Synergistic Tensor and Pipeline Parallelism	Mengshi Qi, Jiaxuan Peng, Jie Zhang, Juan Zhu, Yong Li, Huadong Ma	2025-10-31	Download	In the machine learning system, the hybrid model parallelism combining tensor parallelism (TP) and pipeline parallelism (PP) has become the dominant solution for distributed training of Large Language...
SERFLOW: A Cross-Service Cost Optimization Framework for SLO-Aware Dynamic ML Inference	Zongshun Zhang, Ibrahim Matta	2025-10-31	Download	Dynamic offloading of Machine Learning (ML) model partitions across different resource orchestration services, such as Function-as-a-Service (FaaS) and Infrastructure-as-a-Service (IaaS), can balance ...
Glia: A Human-Inspired AI for Automated Systems Design and Optimization	Pouya Hamadanian, Pantea Karimi, Arash Nasr-Esfahany, Kimia Noorbakhsh, Joseph Chandler, Ali ParandehGheibi, Mohammad Alizadeh, Hari Balakrishnan	2025-10-31	Download	Can AI autonomously design mechanisms for computer systems on par with the creativity and reasoning of human experts? We present Glia, an AI architecture for networked systems design that uses large l...
Byzantine Attacks in RIS-Enhanced Cooperative Spectrum Sensing: A Decision Fusion Perspective	Gaoyuan Zhang, Gaolei Song, Boyuan Li, Zijian Li, Baofeng Ji, Ruijuan Zheng, Guoqiang Zheng, Tony Q. S. Quek	2025-10-31	Download	From the perspective of hard decision fusion, we investigate Byzantine attacks in Reconfigurable Intelligent Surface (RIS)-enhanced and decode-and-forward relay-assisted Cooperative Spectrum Sensing (...
Secure Communication in the Presence of an RIS-Enhanced Eavesdropper in MIMO Networks	Gaoyuan Zhang, Ruisong Si, Boyuan Li, Zijian Li, Baofeng Ji, Chenqi Zhu, Tony Q. S. Quek	2025-10-31	Download	We pay our attention towards secure and robust communication in the presence of a Reconfigurable Intelligent Surface (RIS)-enhanced mobile eavesdropping attacker in Multiple-Input Multiple-Output (MIM...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Tetris: An SLA-aware Application Placement Strategy in the Edge-Cloud Continuum	Lucas Almeida, Maycon Peixoto	2025-10-31	Download	An Edge-Cloud Continuum integrates edge and cloud resources to provide a flexible and scalable infrastructure. This paradigm can minimize latency by processing data closer to the source at the edge wh...
Learning a Network Digital Twin as a Hybrid System	Christos Mavridis, Fernando S. Barbosa, Hamed Farhadi, Karl H. Johansson	2025-10-31	Download	Network digital twin (NDT) models are virtual models that replicate the behavior of physical communication networks and are considered a key technology component to enable novel features and capabilit...
Reinforcement Learning for Resource Allocation in Vehicular Multi-Fog Computing	Mohammad Hadi Akbarzadeh, Mahmood Ahmadi, Mohammad Saeed Jahangiry, Jae Young Hur	2025-10-31	Download	The exponential growth of Internet of Things (IoT) devices, smart vehicles, and latency-sensitive applications has created an urgent demand for efficient distributed computing paradigms.
Mist-Assisted Federated Learning for Intrusion Detection in Heterogeneous IoT Networks	Saadat Izadi, Shakib Komasi, Ali Salimi, Alireza Rezaei, Mahmood Ahmadi	2025-10-31	Download	The rapid growth of the Internet of Things (IoT) offers new opportunities but also expands the attack surface of distributed, resource-limited devices.
Toward Hybrid COTS-based LiFi/WiFi Networks with QoS Requirements in Mobile Environments	Emilio Ancillotti, Loreto Pescosolido, Andrea Passarella	2025-10-31	Download	We consider a hybrid LiFi/WiFi network consisting of commercially available equipment, for mobile scenarios, where WiFi backs up communications, through vertical handovers, in case of insufficient LiF...
Towards Sub-millisecond Latency and Guaranteed Bit Rates in 5G User Plane	Leonardo Alberro, Noura Limam, Raouf Boutaba	2025-10-31	Download	Next-generation services demand stringent Quality of Service (QoS) guarantees, such as per-flow bandwidth assurance, ultra-low latency, and traffic prioritization, posing significant challenges to 5G ...
Rethinking Telemetry Design for Fine-Grained Anomaly Detection in 5G User Planes	Niloy Saha, Noura Limam, Yang Xiao, Raouf Boutaba	2025-10-31	Download	Detecting QoS anomalies in 5G user planes requires fine-grained per-flow visibility, but existing telemetry approaches face a fundamental trade-off.
Asynchronous Risk-Aware Multi-Agent Packet Routing for Ultra-Dense LEO Satellite Networks	Ke He, Thang X. Vu, Le He, Lisheng Fan, Symeon Chatzinotas, Bjorn Ottersten	2025-10-31	Download	The rise of ultra-dense LEO constellations creates a complex and asynchronous network environment, driven by their massive scale, dynamic topologies, and significant delays.
Challenging Tribal Knowledge -- Large Scale Measurement Campaign on Decentralized NAT Traversal	Dennis Trautwein, Cornelius Ihle, Moritz Schubotz, Bela Gipp	2025-10-31	Download	The promise of decentralized peer-to-peer (P2P) systems is fundamentally gated by the challenge of Network Address Translation (NAT) traversal, with existing solutions often reintroducing the very cen...
Selected Results from the REDMARS2 Project: Recursive Delay-Tolerant Networking using Bundle-in-Bundle Encapsulation	Marius Feldmann, Tobias Nöthlich, Felix Walter, Maximilian Nitsch, Juan A. Fraire, Georg A. Murzik, Fiona Fuchs	2025-10-31	Download	This whitepaper presents parts of the results of the REDMARS2 project conducted in 2021-2022, exploring the integration of Recursive Internetwork Architecture (RINA) concepts into Delay- and Disruptio...
Effective Delayed Patching for Transient Malware Control on Networks	Minh Phu Vuong, Chul-Ho Lee, Do Young Eun	2025-10-31	Download	Patching nodes is an effective network defense strategy for malware control at early stages, and its performance is primarily dependent on how accurately the infection propagation is characterized.
Study of Cluster-Based Routing Based on Machine Learning for UAV Networks in 6G	Luis Antonio L. F. da Costa, Rodrigo C. de Lamare, Rafael Kunst, Edison Pignaton de Freitas	2025-10-31	Download	The sixth generation (6G) wireless networks are envisioned to deliver ultra-low latency, massive connectivity, and high data rates, enabling advanced applications such as autonomous unmanned aerial ve...
Stochastic Geometry of Cylinders: Characterizing Inter-Nodal Distances for 3D UAV Networks	Yunfeng Jiang, Zhiming Huang, Jianping Pan	2025-10-31	Download	The analytical characterization of coverage probability in finite three-dimensional wireless networks has long remained an open problem, hindered by the loss of spatial independence in finite-node set...
Analytical Model of NR-V2X Mode 2 with Re-Evaluation Mechanism	Shuo Zhu, Siyu Lin	2025-10-31	Download	Massive message transmissions, unpredictable aperiodic messages, and high-speed moving vehicles contribute to the complex wireless environment, resulting in inefficient resource collisions in Vehicle ...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Fix: externalizing network I/O in serverless computing	Yuhan Deng, Akshay Srivatsan, Sebastian Ingino, Francis Chua, Yasmine Mitchell, Matthew Vilaysack, Keith Winstein	2025-10-31	Download	We describe a system for serverless computing where users, programs, and the underlying platform share a common representation of a computation: a deterministic procedure, run in an environment ...
Supply Chain Exploitation of Secure ROS 2 Systems: A Proof-of-Concept on Autonomous Platform Compromise via Keystore Exfiltration	Tahmid Hasan Sakib, Yago Romano Martinez, Carter Brady, Syed Rafay Hasan, Terry N. Guo	2025-10-31	Download	This paper presents a proof-of-concept supply chain attack against the Secure ROS 2 (SROS 2) framework, demonstrated on a Quanser QCar2 autonomous vehicle platform.
Sockeye: a language for analyzing hardware documentation	Ben Fiedler, Samuel Gruetter, Timothy Roscoe	2025-10-31	Download	Systems programmers have to consolidate the ever growing hardware mess present on modern System-on-Chips (SoCs). Correctly programming a multitude of components, providing functionality but also secur...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
AMD MI300X GPU Performance Analysis	Chandrish Ambati, Trung Diep	2025-10-31	Download	The rapid growth of large language models (LLMs) has driven the need for high-performance, scalable GPU hardware capable of efficiently serving models with hundreds of billions of parameters.
Dependence-Driven, Scalable Quantum Circuit Mapping with Affine Abstractions	Marouane Benbetka, Merwan Bekkar, Riyadh Baghdadi, Martin Kong	2025-10-31	Download	Qubit Mapping is a critical task in Quantum Compilation, as modern Quantum Processing Units (QPUs) are constrained to nearest-neighbor interactions defined by a qubit coupling graph.
MLPerf Automotive	Radoyeh Shojaei, Predrag Djurdjevic, Mostafa El-Khamy, James Goel, Kasper Mecklenburg, John Owens, Pınar Muyan-Özçelik, Tom St. John, Jinho Suh, Arjun Suresh	2025-10-31	Download	We present MLPerf Automotive, the first standardized public benchmark for evaluating Machine Learning systems that are deployed for AI acceleration in automotive systems.