2025-10-19

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Architect in the Loop Agentic Hardware Design and Verification | Mubarek Mohammed | 2025-10-19 | Download | The ever increasing complexity of the hardware design process demands improved hardware design and verification methodologies. With the advent of generative AI various attempts have been made to autom... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Justitia: Fair and Efficient Scheduling of Task-parallel LLM Agents with Selective Pampering | Mingyan Yang, Guanjie Wang, Manqi Luo, Yifei Liu, Chen Chen, Han Zhao, Yu Feng, Quan Chen, Minyi Guo | 2025-10-19 | Download | LLM agents, which often comprise parallel inference tasks, are commonly adopted to solve real-world problems. When serving such task-parallel LLM agents in shared GPU servers, the scheduler is expecte... |
| Host-Side Telemetry for Performance Diagnosis in Cloud and HPC GPU Infrastructure | Erfan Darzi, Aldo Pareja, Shreeanant Bharadwaj | 2025-10-19 | Download | Diagnosing GPU tail latency spikes in cloud and HPC infrastructure is critical for maintaining performance predictability and resource utilization, yet existing monitoring tools lack the granularity f... |
| Tutoring LLM into a Better CUDA Optimizer | Matyáš Brabec, Jiří Klepl, Michal Töpfer, Martin Kruliš | 2025-10-19 | Download | Recent leaps in large language models (LLMs) caused a revolution in programming tools (like GitHub Copilot) that can help with code generation, debugging, and even performance optimization. |
| FTI-TMR: A Fault Tolerance and Isolation Algorithm for Interconnected Multicore Systems | Yiming Hu | 2025-10-19 | Download | Two-Phase TMR conserves energy by partitioning redundancy operations into two stages and making the execution of the third task copy optional, yet it remains susceptible to permanent faults. |
| Layout-Agnostic MPI Abstraction for Distributed Computing in Modern C++ | Jiří Klepl, Martin Kruliš, Matyáš Brabec | 2025-10-19 | Download | Message Passing Interface (MPI) has been a well-established technology in the domain of distributed high-performance computing for several decades. |
| DiRAC - Distributed Robot Awareness and Consensus | Uday Gopan, Manjari Kulkarni, Lakshasri S, Kashish Mittal, Sriram Radhakrishna, Aditya Naskar, Rameshwar DL | 2025-10-19 | Download | DiRAC is a scalable, distributed framework designed to enable efficient task assignment and path planning in very large robotic swarms. It introduces a novel zone-partitioned architecture with dynamic... |
| The Sherpa.ai Blind Vertical Federated Learning Paradigm to Minimize the Number of Communications | Alex Acero, Daniel M. Jimenez-Gutierrez, Dario Pighin, Enrique Zuazua, Joaquin Del Rio, Xabi Uribe-Etxebarria | 2025-10-19 | Download | Federated Learning (FL) enables collaborative decentralized training across multiple parties (nodes) while keeping raw data private. There are two main paradigms in FL: Horizontal FL (HFL), where all ... |
| Exact Nearest-Neighbor Search on Energy-Efficient FPGA Devices | Patrizio Dazzi, William Guglielmo, Franco Maria Nardini, Raffaele Perego, Salvatore Trani | 2025-10-19 | Download | This paper investigates the usage of FPGA devices for energy-efficient exact kNN search in high-dimension latent spaces. This work intercepts a relevant trend that tries to support the increasing popu... |
| CLIP: Client-Side Invariant Pruning for Mitigating Stragglers in Secure Federated Learning | Anthony DiMaggio, Raghav Sharma, Gururaj Saileshwar | 2025-10-19 | Download | Secure federated learning (FL) preserves data privacy during distributed model training. However, deploying such frameworks across heterogeneous devices results in performance bottlenecks, due to stra... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Traffic Prioritization Mechanisms for Mission and Time Critical Applications in Industrial Internet of Things | Anwar Ahmed Khan, Shama Siddiqui, Indrakshi Dey | 2025-10-19 | Download | Industrial Internet of Things (IIoT) promises to revolutionize industrial operations and productions through utilizing Machine-to-Machine (M2M) communications. |
