2025-07-16

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
CRAFT: Latency and Cost-Aware Genetic-Based Framework for Node Placement in Edge-Fog Environments	Soheil Mahdizadeh, Amir Mahdi Rasouli, Mohammad Pourashory, Sadra Galavani, Mohsen Ansari	2025-07-16	Download	Reducing latency in the Internet of Things (IoT) is a critical concern. While cloud computing facilitates communication, it falls short of meeting real-time requirements reliably.
Characterizing State Space Model and Hybrid Language Model Performance with Long Context	Saptarshi Mitra, Rachid Karami, Haocheng Xu, Sitao Huang, Hyoukjun Kwon	2025-07-16	Download	Emerging applications such as AR are driving demands for machine intelligence capable of processing continuous and/or long-context inputs on local devices.
High-Performance Pipelined NTT Accelerators with Homogeneous Digit-Serial Modulo Arithmetic	George Alexakis, Dimitrios Schoinianakis, Giorgos Dimitrakopoulos	2025-07-16	Download	The Number Theoretic Transform (NTT) is a fundamental operation in privacy-preserving technologies, particularly within fully homomorphic encryption (FHE).
Accelerating Transistor-Level Simulation of Integrated Circuits via Equivalence of RC Long-Chain Structures	Ruibai Tang, Wenlai Zhao	2025-07-16	Download	Transistor-level simulation plays a vital role in validating the physical correctness of integrated circuits. However, such simulations are computationally expensive.
Chain-of-Descriptions: Improving Code LLMs for VHDL Code Generation and Summarization	Prashanth Vijayaraghavan, Apoorva Nitsure, Charles Mackin, Luyao Shi, Stefano Ambrogio, Arvind Haran, Viresh Paruthi, Ali Elzein, Dan Coops, David Beymer, Tyler Baldwin, Ehsan Degan	2025-07-16	Download	Large Language Models (LLMs) have become widely used across diverse NLP tasks and domains, demonstrating their adaptability and effectiveness.
MOFCO: Mobility- and Migration-Aware Task Offloading in Three-Layer Fog Computing Environments	Soheil Mahdizadeh, Elyas Oustad, Mohsen Ansari	2025-07-16	Download	Task offloading in three-layer fog computing environments presents a critical challenge due to user equipment (UE) mobility, which frequently triggers costly service migrations and degrades overall sy...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
BootSeer: Analyzing and Mitigating Initialization Bottlenecks in Large-Scale LLM Training	Rui Li, Xiaoyun Zhi, Jinxin Chi, Menghan Yu, Lixin Huang, Jia Zhu, Weilun Zhang, Xing Ma, Wenjia Liu, Zhicheng Zhu, Daowen Luo, Zuquan Song, Xin Yin, Chao Xiang, Shuguang Wang, Wencong Xiao, Gene Cooperman	2025-07-16	Download	Large Language Models (LLMs) have become a cornerstone of modern AI, driving breakthroughs in natural language processing and expanding into multimodal jobs involving images, audio, and video.
Rel-HNN: Split Parallel Hypergraph Neural Network for Learning on Relational Databases	Md. Tanvir Alam, Md. Ahasanul Alam, Md Mahmudur Rahman, Md. Mosaddek Khan	2025-07-16	Download	Relational databases (RDBs) are ubiquitous in enterprise and real-world applications. Flattening the database poses challenges for deep learning models that rely on fixed-size input representations to...
CRAFT: Latency and Cost-Aware Genetic-Based Framework for Node Placement in Edge-Fog Environments	Soheil Mahdizadeh, Amir Mahdi Rasouli, Mohammad Pourashory, Sadra Galavani, Mohsen Ansari	2025-07-16	Download	Reducing latency in the Internet of Things (IoT) is a critical concern. While cloud computing facilitates communication, it falls short of meeting real-time requirements reliably.
Incentivised Orchestrated Training Architecture (IOTA): A Technical Primer for Release	Felix Quinque, Alan Aboudib, Szymon Fonau, Rodrigo Lopez Portillo Alcocer, Brian McCrindle, Steffen Cruz	2025-07-16	Download	In August 2024, Bittensor's Subnet 9 (SN9) demonstrated that a distributed network of incentivized, permissionless actors could each pretrain large language models (LLMs) ranging from 700 million to 1...
Toward Efficient SpMV in Sparse LLMs via Block Extraction and Compressed Storage	Junqing Lin, Jingwei Sun, Mingge Lu, Guangzhong Sun	2025-07-16	Download	Sparse Matrix-Vector Multiplication (SpMV) has become a critical performance bottleneck in the local deployment of sparse Large Language Models (LLMs), where inference predominantly operates on worklo...
Urban Green Governance: IoT-Driven Management and Enhancement of Urban Green Spaces in Campobasso	Antonio Salis, Gabriele Troina, Gianluca Boanelli, Marco Ottaviano, Paola Fortini, Soraya Versace	2025-07-16	Download	The efficient design and management of public green spaces is a key factor in promoting the health and well-being of urban population, as emphasized by the WHO, UNEP, and EEA.
Distributed Algorithms for Potential Problems	Alkida Balliu, Thomas Boudier, Francesco d'Amore, Fabian Kuhn, Dennis Olivetti, Gustav Schmid, Jukka Suomela	2025-07-16	Download	In this work, we present a fast distributed algorithm for local potential problems: these are graph problems where the task is to find a locally optimal solution where no node can unilaterally improve...
ARRC: Explainable, Workflow-Integrated Recommender for Sustainable Resource Optimization Across the Edge-Cloud Continuum	Brian-Frederik Jahnke, René Brinkhege, Jan Peter Meyer, Daniel Tebernum, Falk Howar	2025-07-16	Download	Achieving sustainable, explainable, and maintainable automation for resource optimization is a core challenge across the edge-cloud continuum.
MOFCO: Mobility- and Migration-Aware Task Offloading in Three-Layer Fog Computing Environments	Soheil Mahdizadeh, Elyas Oustad, Mohsen Ansari	2025-07-16	Download	Task offloading in three-layer fog computing environments presents a critical challenge due to user equipment (UE) mobility, which frequently triggers costly service migrations and degrades overall sy...
NineToothed: A Triton-Based High-Level Domain-Specific Language for Machine Learning	Jiacheng Huang, Zimin Li, Yinghui Li, Haojie Wang	2025-07-16	Download	The emergence of deep learning domain-specific languages (DSLs) has substantially reduced the obstacles in developing high-performance, cross-platform compute kernels.
BlockBPE: Parallel BPE Tokenization	Amos You	2025-07-16	Download	Tokenization is a critical preprocessing step in large language model pipelines, yet widely-used implementations remain CPU-bound and suboptimal for batch inference workflows on GPU.
Making Serverless Computing Extensible: A Case Study of Serverless Data Analytics	Minchen Yu, Yinghao Ren, Jiamu Zhao, Jiaqi Li	2025-07-16	Download	Serverless computing has attracted a broad range of applications due to its ease of use and resource elasticity. However, developing serverless applications often poses a dilemma -- relying on general...
A Parallel CPU-GPU Framework for Batching Heuristic Operations in Depth-First Heuristic Search	Ehsan Futuhi, Nathan R. Sturtevant	2025-07-16	Download	The rapid advancement of GPU technology has unlocked powerful parallel processing capabilities, creating new opportunities to enhance classic search algorithms.
Performance Cost Tradeoffs in Intelligent Load Balancing for Multi Data Center Cloud Systems: From Static Policies to Adaptive Resource Distribution	Saeid Aghasoleymani Najafabadi, Elaheh Nabavi Nia	2025-07-16	Download	Cloud computing infrastructures increasingly rely on geographically distributed data centers to meet the growing demand for low latency, high availability, and cost-efficient service delivery.
Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI	Samyam Rajbhandari, Mert Hidayetoglu, Aurick Qiao, Ye Wang, Juncheng Yang, Jeff Rasley, Michael Wyatt, Yuxiong He	2025-07-16	Download	Inference is now the dominant AI workload, yet existing systems force trade-offs between latency, throughput, and cost. Arctic Inference, an open-source vLLM plugin from Snowflake AI Research, introdu...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
CRAFT: Latency and Cost-Aware Genetic-Based Framework for Node Placement in Edge-Fog Environments	Soheil Mahdizadeh, Amir Mahdi Rasouli, Mohammad Pourashory, Sadra Galavani, Mohsen Ansari	2025-07-16	Download	Reducing latency in the Internet of Things (IoT) is a critical concern. While cloud computing facilitates communication, it falls short of meeting real-time requirements reliably.
LLM-Based Config Synthesis requires Disambiguation	Rajdeep Mondal, Nikolaj Bjorner, Todd Millstein, Alan Tang, George Varghese	2025-07-16	Download	Beyond hallucinations, another problem in program synthesis using LLMs is ambiguity in user intent. We illustrate the ambiguity problem in a networking context for LLM-based incremental configuration ...
FastReChain: A Novel Bidirectional Model-Based Algorithm for Topology Engineering of OCS-Based Clusters	Zihan Zhu, Xinchi Han, Dongchao Wu, Zhanbang Zhang, Jian Yang, Shizhen Zhao, Xinbing Wang	2025-07-16	Download	Optical Circuit Switching (OCS) technology is increasingly being adopted in data centers due to its advantages of low power consumption and low technology refresh costs.
MOFCO: Mobility- and Migration-Aware Task Offloading in Three-Layer Fog Computing Environments	Soheil Mahdizadeh, Elyas Oustad, Mohsen Ansari	2025-07-16	Download	Task offloading in three-layer fog computing environments presents a critical challenge due to user equipment (UE) mobility, which frequently triggers costly service migrations and degrades overall sy...
AI-Native Open RAN for Non-Terrestrial Networks: An Overview	Jikang Deng, S. Fizza Hassan, Hui Zhou, Saad Al-Ahmadi, Mohamed-Slim Alouini, Daniel B. Da Costa	2025-07-16	Download	Non-terrestrial network (NTN) is envisioned as a critical component of Sixth Generation (6G) networks by enabling ubiquitous services and enhancing network resilience.
Extremal Testing for Network Software using LLMs	Rathin Singha, Harry Qian, Srinath Saikrishnan, Tracy Zhao, Ryan Beckett, Siva Kesava Reddy Kakarla, George Varghese	2025-07-16	Download	Physicists often manually consider extreme cases when testing a theory. In this paper, we show how to automate extremal testing of network software using LLMs in two steps: first, ask the LLM to gener...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Rethinking the confidential cloud through a unified low-level abstraction for composable isolation	Adrien Ghosn, Charly Castes, Neelu S. Kalani, Yuchen Qian, Marios Kogias, Edouard Bugnion	2025-07-16	Download	Securing sensitive cloud workloads requires composing confidential virtual machines (CVMs) with nested enclaves or sandboxes. Unfortunately, each new isolation boundary adds ad-hoc access control mech...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Accelerating Transistor-Level Simulation of Integrated Circuits via Equivalence of RC Long-Chain Structures	Ruibai Tang, Wenlai Zhao	2025-07-16	Download	Transistor-level simulation plays a vital role in validating the physical correctness of integrated circuits. However, such simulations are computationally expensive.
BenchRL-QAS: Benchmarking reinforcement learning algorithms for quantum architecture search	Azhar Ikhtiarudin, Aditi Das, Param Thakkar, Akash Kundu	2025-07-16	Download	We present BenchRL-QAS, a unified benchmarking framework for reinforcement learning (RL) in quantum architecture search (QAS) across a spectrum of variational quantum algorithm tasks on 2- to 8-qubit ...
Kevin: Multi-Turn RL for Generating CUDA Kernels	Carlo Baronio, Pietro Marsella, Ben Pan, Simon Guo, Silas Alberti	2025-07-16	Download	Writing GPU kernels is a challenging task and critical for AI systems' efficiency. It is also highly iterative: domain experts write code and improve performance through execution feedback.