2025-07-18
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| LIMINAL: Exploring The Frontiers of LLM Decode Performance | Michael Davies, Neal Crago, Karthikeyan Sankaralingam, Christos Kozyrakis | 2025-07-18 | Download | The rapid advancement of Large Language Models (LLMs) necessitates a deep understanding of their fundamental performance limits. This paper investigates the limits of LLM inference, focusing on hardwa... |
| An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC | Matthias Jobst, Tim Langer, Chen Liu, Mehmet Alici, Hector A. Gonzalez, Christian Mayr | 2025-07-18 | Download | This work presents a multi-layer DNN scheduling framework as an extension of OctopuScheduler, providing an end-to-end flow from PyTorch models to inference on a single SpiNNaker2 chip. |
| 4T2R X-ReRAM CiM Array for Variation-tolerant, Low-power, Massively Parallel MAC Operation | Fuyuki Kihara, Seiji Uenohara, Satoshi Awamura, Naoko Misawa, Chihiro Matsui, Ken Takeuchi | 2025-07-18 | Download | Computation-in-Memory (CiM) is attracting attention as a technology that can perform MAC calculations required for AI accelerators, at high speed with low power consumption. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Characterizing Communication Patterns in Distributed Large Language Model Inference | Lang Xu, Kaushik Kandadi Suresh, Quentin Anthony, Nawras Alnaasan, Dhabaleswar K. Panda | 2025-07-18 | Download | Large Language Models (LLMs) built on transformer architectures have transformed natural language processing, achieving remarkable performance across diverse applications. |
| FedStrategist: A Meta-Learning Framework for Adaptive and Robust Aggregation in Federated Learning | Md Rafid Haque, Abu Raihan Mostofa Kamal, Md. Azam Hossain | 2025-07-18 | Download | Federated Learning (FL) offers a paradigm for privacy-preserving collaborative AI, but its decentralized nature creates significant vulnerabilities to model poisoning attacks. |
| Weighted Matching in a Poly-Streaming Model | Ahammed Ullah, S. M. Ferdous, Alex Pothen | 2025-07-18 | Download | We introduce the poly-streaming model, a generalization of streaming models of computation in which processors process data streams containing a total of items. |
| CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning | Xiaoya Li, Xiaofei Sun, Albert Wang, Jiwei Li, Chris Shum | 2025-07-18 | Download | The exponential growth in demand for GPU computing resources has created an urgent need for automated CUDA optimization strategies. While recent advances in LLMs show promise for code generation, curr... |
| Shipwright: Proving liveness of distributed systems with Byzantine participants | Derek Leung, Nickolai Zeldovich, Frans Kaashoek | 2025-07-18 | Download | Ensuring liveness in a decentralized system, such as PBFT, is critical, because there may not be any single administrator that can restart the system if it encounters a liveness bug. |
| Edge Intelligence with Spiking Neural Networks | Shuiguang Deng, Di Yu, Changze Lv, Xin Du, Linshan Jiang, Xiaofan Zhao, Wentao Tong, Xiaoqing Zheng, Weijia Fang, Peng Zhao, Gang Pan, Schahram Dustdar, Albert Y. Zomaya | 2025-07-18 | Download | The convergence of artificial intelligence and edge computing has spurred growing interest in enabling intelligent services directly on resource-constrained devices. |
| Application Placement with Constraint Relaxation | Damiano Azzolini, Marco Duca, Stefano Forti, Francesco Gallo, Antonio Ielo | 2025-07-18 | Download | Novel utility computing paradigms rely upon the deployment of multi-service applications to pervasive and highly distributed cloud-edge infrastructure resources. |
| DistFlow: A Fully Distributed RL Framework for Scalable and Efficient LLM Post-Training | Zhixin Wang, Tianyi Zhou, Liming Liu, Ao Li, Jiarui Hu, Dian Yang, Yinhui Lu, Jinlong Hou, Siyuan Feng, Yuan Cheng, Yuan Qi | 2025-07-18 | Download | Reinforcement learning (RL) has become the pivotal post-training technique for large language model (LLM). Effectively scaling reinforcement learning is now the key to unlocking advanced reasoning cap... |
| An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC | Matthias Jobst, Tim Langer, Chen Liu, Mehmet Alici, Hector A. Gonzalez, Christian Mayr | 2025-07-18 | Download | This work presents a multi-layer DNN scheduling framework as an extension of OctopuScheduler, providing an end-to-end flow from PyTorch models to inference on a single SpiNNaker2 chip. |
| Quantum Blockchain Survey: Foundations, Trends, and Gaps | Saurav Ghosh, Niloy Deb Roy Mishu | 2025-07-18 | Download | Quantum computing poses fundamental risks to classical blockchain systems by undermining widely used cryptographic primitives. In response, two major research directions have emerged: post-quantum blo... |
| FedSkipTwin: Digital-Twin-Guided Client Skipping for Communication-Efficient Federated Learning | Daniel Commey, Kamel Abbad, Garth V. Crosby, Lyes Khoukhi | 2025-07-18 | Download | Communication overhead remains a primary bottleneck in federated learning (FL), particularly for applications involving mobile and IoT devices with constrained bandwidth. |
| Leveraging Multi-Instance GPUs through moldable task scheduling | Jorge Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz | 2025-07-18 | Download | NVIDIA MIG (Multi-Instance GPU) allows partitioning a physical GPU into multiple logical instances with fully-isolated resources, which can be dynamically reconfigured. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| NetIntent: Leveraging Large Language Models for End-to-End Intent-Based SDN Automation | Md. Kamrul Hossain, Walid Aljoby | 2025-07-18 | Download | Intent-Based Networking (IBN) often leverages the programmability of Software-Defined Networking (SDN) to simplify network management. However, significant challenges remain in automating the entire p... |
| The Proportional Fair Scheduler in Wavelength-Multiplexed Quantum Networks | Sanidhay Bhambay, Siddarth Koduru Joshi, Thirupathaiah Vasantam, Neil Walton | 2025-07-18 | Download | We address the problem of optimal pumping strategies in quantum networks. These networks enable secure communication by distributing entangled photon pairs to user (or node) pairs. |
| Preprint: Poster: Did I Just Browse A Website Written by LLMs? | Sichang Steven He, Ramesh Govindan, Harsha V. Madhyastha | 2025-07-18 | Download | Increasingly, web content is automatically generated by large language models (LLMs) with little human input. We call this "LLM-dominant" content. |
| Beyond DNS: Unlocking the Internet of AI Agents via the NANDA Index and Verified AgentFacts | Ramesh Raskar, Pradyumna Chari, John Zinky, Mahesh Lambe, Jared James Grogan, Sichao Wang, Rajesh Ranjan, Rekha Singhal, Shailja Gupta, Robert Lincourt, Raghu Bala, Aditi Joshi, Abhishek Singh, Ayush Chopra, Dimitris Stripelis, Bhuwan B, Sumit Kumar, Maria Gorskikh | 2025-07-18 | Download | The Internet is poised to host billions to trillions of autonomous AI agents that negotiate, delegate, and migrate in milliseconds and workloads that will strain DNS-centred identity and discovery. |
| On the Trade-Off Between Sum-Rate and Energy Efficiency through the Convergence of HAPS and Active RIS Technologies | Bilal Karaman, Ilhan Basturk, Ferdi Kara, Metin Ozturk, Sezai Taskin, Halil Yanikomeroglu | 2025-07-18 | Download | This paper investigates the integration of active reconfigurable intelligent surfaces (RIS) relay with high-altitude platform stations (HAPS) to enhance non-terrestrial network (NTN) performance in ne... |
| Quantum Blockchain Survey: Foundations, Trends, and Gaps | Saurav Ghosh, Niloy Deb Roy Mishu | 2025-07-18 | Download | Quantum computing poses fundamental risks to classical blockchain systems by undermining widely used cryptographic primitives. In response, two major research directions have emerged: post-quantum blo... |
| ATRO: A Fast Algorithm for Topology Engineering of Reconfigurable Datacenter Networks | Yingming Mao, Qiaozhu Zhai, Ximeng Liu, Xinchi Han, Fafan li, Shizhen Zhao, Yuzhou Zhou, Zhen Yao, Xia Zhu | 2025-07-18 | Download | Reconfigurable data center networks (DCNs) enhance traditional architectures with optical circuit switches (OCSs), enabling dynamic reconfiguration of inter-pod links, i.e., the logical topology. |
| CARTS: Cooperative and Adaptive Resource Triggering and Stitching for 5G ISAC | Cheng Jiang, Yihe Yan, Yanxiang Wang, Jiawei Hu, Chun Tung Chou, Wen Hu | 2025-07-18 | Download | This paper presents CARTS, an adaptive 5G uplink sensing scheme designed to provide Integrated Sensing and Communication (ISAC) services. The performance of both communication and sensing fundamentall... |
| Agent Network Protocol Technical White Paper | Gaowei Chang, Eidan Lin, Chengxuan Yuan, Rizhao Cai, Binbin Chen, Xuan Xie, Yin Zhang | 2025-07-18 | Download | With the development of large models and autonomous decision-making AI, agents are rapidly becoming the new entities of the internet, following mobile apps. |
| FedSkipTwin: Digital-Twin-Guided Client Skipping for Communication-Efficient Federated Learning | Daniel Commey, Kamel Abbad, Garth V. Crosby, Lyes Khoukhi | 2025-07-18 | Download | Communication overhead remains a primary bottleneck in federated learning (FL), particularly for applications involving mobile and IoT devices with constrained bandwidth. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Photonic Fabric Platform for AI Accelerators | Jing Ding, Trung Diep | 2025-07-18 | Download | This paper presents the Photonic Fabric™ and the Photonic Fabric Appliance™ (PFA), a photonic-enabled switch and memory subsystem that delivers low latency, high bandwidth, and low per-bit energy. |
| The Proportional Fair Scheduler in Wavelength-Multiplexed Quantum Networks | Sanidhay Bhambay, Siddarth Koduru Joshi, Thirupathaiah Vasantam, Neil Walton | 2025-07-18 | Download | We address the problem of optimal pumping strategies in quantum networks. These networks enable secure communication by distributing entangled photon pairs to user (or node) pairs. |
| Leveraging Multi-Instance GPUs through moldable task scheduling | Jorge Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz | 2025-07-18 | Download | NVIDIA MIG (Multi-Instance GPU) allows partitioning a physical GPU into multiple logical instances with fully-isolated resources, which can be dynamically reconfigured. |