2025-07-18

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
LIMINAL: Exploring The Frontiers of LLM Decode Performance	Michael Davies, Neal Crago, Karthikeyan Sankaralingam, Christos Kozyrakis	2025-07-18	Download	The rapid advancement of Large Language Models (LLMs) necessitates a deep understanding of their fundamental performance limits. This paper investigates the limits of LLM inference, focusing on hardwa...
An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC	Matthias Jobst, Tim Langer, Chen Liu, Mehmet Alici, Hector A. Gonzalez, Christian Mayr	2025-07-18	Download	This work presents a multi-layer DNN scheduling framework as an extension of OctopuScheduler, providing an end-to-end flow from PyTorch models to inference on a single SpiNNaker2 chip.
4T2R X-ReRAM CiM Array for Variation-tolerant, Low-power, Massively Parallel MAC Operation	Fuyuki Kihara, Seiji Uenohara, Satoshi Awamura, Naoko Misawa, Chihiro Matsui, Ken Takeuchi	2025-07-18	Download	Computation-in-Memory (CiM) is attracting attention as a technology that can perform MAC calculations required for AI accelerators, at high speed with low power consumption.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Characterizing Communication Patterns in Distributed Large Language Model Inference	Lang Xu, Kaushik Kandadi Suresh, Quentin Anthony, Nawras Alnaasan, Dhabaleswar K. Panda	2025-07-18	Download	Large Language Models (LLMs) built on transformer architectures have transformed natural language processing, achieving remarkable performance across diverse applications.
FedStrategist: A Meta-Learning Framework for Adaptive and Robust Aggregation in Federated Learning	Md Rafid Haque, Abu Raihan Mostofa Kamal, Md. Azam Hossain	2025-07-18	Download	Federated Learning (FL) offers a paradigm for privacy-preserving collaborative AI, but its decentralized nature creates significant vulnerabilities to model poisoning attacks.
Weighted Matching in a Poly-Streaming Model	Ahammed Ullah, S. M. Ferdous, Alex Pothen	2025-07-18	Download	We introduce the poly-streaming model, a generalization of streaming models of computation in which $k$ processors process $k$ data streams containing a total of $N$ items.
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning	Xiaoya Li, Xiaofei Sun, Albert Wang, Jiwei Li, Chris Shum	2025-07-18	Download	The exponential growth in demand for GPU computing resources has created an urgent need for automated CUDA optimization strategies. While recent advances in LLMs show promise for code generation, curr...
Shipwright: Proving liveness of distributed systems with Byzantine participants	Derek Leung, Nickolai Zeldovich, Frans Kaashoek	2025-07-18	Download	Ensuring liveness in a decentralized system, such as PBFT, is critical, because there may not be any single administrator that can restart the system if it encounters a liveness bug.
Edge Intelligence with Spiking Neural Networks	Shuiguang Deng, Di Yu, Changze Lv, Xin Du, Linshan Jiang, Xiaofan Zhao, Wentao Tong, Xiaoqing Zheng, Weijia Fang, Peng Zhao, Gang Pan, Schahram Dustdar, Albert Y. Zomaya	2025-07-18	Download	The convergence of artificial intelligence and edge computing has spurred growing interest in enabling intelligent services directly on resource-constrained devices.
Application Placement with Constraint Relaxation	Damiano Azzolini, Marco Duca, Stefano Forti, Francesco Gallo, Antonio Ielo	2025-07-18	Download	Novel utility computing paradigms rely upon the deployment of multi-service applications to pervasive and highly distributed cloud-edge infrastructure resources.
DistFlow: A Fully Distributed RL Framework for Scalable and Efficient LLM Post-Training	Zhixin Wang, Tianyi Zhou, Liming Liu, Ao Li, Jiarui Hu, Dian Yang, Yinhui Lu, Jinlong Hou, Siyuan Feng, Yuan Cheng, Yuan Qi	2025-07-18	Download	Reinforcement learning (RL) has become the pivotal post-training technique for large language model (LLM). Effectively scaling reinforcement learning is now the key to unlocking advanced reasoning cap...
An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC	Matthias Jobst, Tim Langer, Chen Liu, Mehmet Alici, Hector A. Gonzalez, Christian Mayr	2025-07-18	Download	This work presents a multi-layer DNN scheduling framework as an extension of OctopuScheduler, providing an end-to-end flow from PyTorch models to inference on a single SpiNNaker2 chip.
Quantum Blockchain Survey: Foundations, Trends, and Gaps	Saurav Ghosh, Niloy Deb Roy Mishu	2025-07-18	Download	Quantum computing poses fundamental risks to classical blockchain systems by undermining widely used cryptographic primitives. In response, two major research directions have emerged: post-quantum blo...
FedSkipTwin: Digital-Twin-Guided Client Skipping for Communication-Efficient Federated Learning	Daniel Commey, Kamel Abbad, Garth V. Crosby, Lyes Khoukhi	2025-07-18	Download	Communication overhead remains a primary bottleneck in federated learning (FL), particularly for applications involving mobile and IoT devices with constrained bandwidth.
Leveraging Multi-Instance GPUs through moldable task scheduling	Jorge Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz	2025-07-18	Download	NVIDIA MIG (Multi-Instance GPU) allows partitioning a physical GPU into multiple logical instances with fully-isolated resources, which can be dynamically reconfigured.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
NetIntent: Leveraging Large Language Models for End-to-End Intent-Based SDN Automation	Md. Kamrul Hossain, Walid Aljoby	2025-07-18	Download	Intent-Based Networking (IBN) often leverages the programmability of Software-Defined Networking (SDN) to simplify network management. However, significant challenges remain in automating the entire p...
The Proportional Fair Scheduler in Wavelength-Multiplexed Quantum Networks	Sanidhay Bhambay, Siddarth Koduru Joshi, Thirupathaiah Vasantam, Neil Walton	2025-07-18	Download	We address the problem of optimal pumping strategies in quantum networks. These networks enable secure communication by distributing entangled photon pairs to user (or node) pairs.
Preprint: Poster: Did I Just Browse A Website Written by LLMs?	Sichang Steven He, Ramesh Govindan, Harsha V. Madhyastha	2025-07-18	Download	Increasingly, web content is automatically generated by large language models (LLMs) with little human input. We call this "LLM-dominant" content.
Beyond DNS: Unlocking the Internet of AI Agents via the NANDA Index and Verified AgentFacts	Ramesh Raskar, Pradyumna Chari, John Zinky, Mahesh Lambe, Jared James Grogan, Sichao Wang, Rajesh Ranjan, Rekha Singhal, Shailja Gupta, Robert Lincourt, Raghu Bala, Aditi Joshi, Abhishek Singh, Ayush Chopra, Dimitris Stripelis, Bhuwan B, Sumit Kumar, Maria Gorskikh	2025-07-18	Download	The Internet is poised to host billions to trillions of autonomous AI agents that negotiate, delegate, and migrate in milliseconds and workloads that will strain DNS-centred identity and discovery.
On the Trade-Off Between Sum-Rate and Energy Efficiency through the Convergence of HAPS and Active RIS Technologies	Bilal Karaman, Ilhan Basturk, Ferdi Kara, Metin Ozturk, Sezai Taskin, Halil Yanikomeroglu	2025-07-18	Download	This paper investigates the integration of active reconfigurable intelligent surfaces (RIS) relay with high-altitude platform stations (HAPS) to enhance non-terrestrial network (NTN) performance in ne...
Quantum Blockchain Survey: Foundations, Trends, and Gaps	Saurav Ghosh, Niloy Deb Roy Mishu	2025-07-18	Download	Quantum computing poses fundamental risks to classical blockchain systems by undermining widely used cryptographic primitives. In response, two major research directions have emerged: post-quantum blo...
ATRO: A Fast Algorithm for Topology Engineering of Reconfigurable Datacenter Networks	Yingming Mao, Qiaozhu Zhai, Ximeng Liu, Xinchi Han, Fafan Li, Shizhen Zhao, Yuzhou Zhou, Zhen Yao, Xia Zhu	2025-07-18	Download	Reconfigurable data center networks (DCNs) enhance traditional architectures with optical circuit switches (OCSs), enabling dynamic reconfiguration of inter-pod links, i.e., the logical topology.
CARTS: Cooperative and Adaptive Resource Triggering and Stitching for 5G ISAC	Cheng Jiang, Yihe Yan, Yanxiang Wang, Jiawei Hu, Chun Tung Chou, Wen Hu	2025-07-18	Download	This paper presents CARTS, an adaptive 5G uplink sensing scheme designed to provide Integrated Sensing and Communication (ISAC) services. The performance of both communication and sensing fundamentall...
Agent Network Protocol Technical White Paper	Gaowei Chang, Eidan Lin, Chengxuan Yuan, Rizhao Cai, Binbin Chen, Xuan Xie, Yin Zhang	2025-07-18	Download	With the development of large models and autonomous decision-making AI, agents are rapidly becoming the new entities of the internet, following mobile apps.
FedSkipTwin: Digital-Twin-Guided Client Skipping for Communication-Efficient Federated Learning	Daniel Commey, Kamel Abbad, Garth V. Crosby, Lyes Khoukhi	2025-07-18	Download	Communication overhead remains a primary bottleneck in federated learning (FL), particularly for applications involving mobile and IoT devices with constrained bandwidth.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Photonic Fabric Platform for AI Accelerators	Jing Ding, Trung Diep	2025-07-18	Download	This paper presents the Photonic Fabric™ and the Photonic Fabric Appliance™ (PFA), a photonic-enabled switch and memory subsystem that delivers low latency, high bandwidth, and low per-bit energy.
The Proportional Fair Scheduler in Wavelength-Multiplexed Quantum Networks	Sanidhay Bhambay, Siddarth Koduru Joshi, Thirupathaiah Vasantam, Neil Walton	2025-07-18	Download	We address the problem of optimal pumping strategies in quantum networks. These networks enable secure communication by distributing entangled photon pairs to user (or node) pairs.
Leveraging Multi-Instance GPUs through moldable task scheduling	Jorge Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz	2025-07-18	Download	NVIDIA MIG (Multi-Instance GPU) allows partitioning a physical GPU into multiple logical instances with fully-isolated resources, which can be dynamically reconfigured.