2025-05-06
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| QiMeng-CPU-v2: Automated Superscalar Processor Design by Learning Data Dependencies | Shuyao Cheng, Rui Zhang, Wenkai He, Pengwei Jin, Chongxiao Li, Zidong Du, Xing Hu, Yifan Hao, Guanglin Xu, Yuanbo Wen, Ling Li, Qi Guo, Yunji Chen | 2025-05-06 | Download | Automated processor design, which can significantly reduce human efforts and accelerate design cycles, has received considerable attention. While recent advancements have automatically designed single... |
| Hardware vs. Software Implementation of Warp-Level Features in Vortex RISC-V GPU | Huanzhi Pu, Rishabh Ravi, Shinnung Jeong, Udit Subramanya, Euijun Chung, Jisheng Zhao, Chihyo Ahn, Hyesoon Kim | 2025-05-06 | Download | RISC-V GPUs present a promising path for supporting GPU applications. Traditionally, GPUs achieve high efficiency through the SPMD (Single Program Multiple Data) programming model. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving | Shan Yu, Jiarong Xing, Yifan Qiao, Mingyuan Ma, Yangmin Li, Yang Wang, Shuo Yang, Zhiqiang Xie, Shiyi Cao, Ke Bao, Ion Stoica, Harry Xu, Ying Sheng | 2025-05-06 | Download | Serving large language models (LLMs) is expensive, especially for providers hosting many models, making cost reduction essential. The unique workload patterns of serving multiple LLMs (i.e. |
| Rollbaccine : Herd Immunity against Storage Rollback Attacks in TEEs [Technical Report] | David Chu, Aditya Balasubramanian, Dee Bao, Natacha Crooks, Heidi Howard, Lucky E. Katahanas, Soujanya Ponnapalli | 2025-05-06 | Download | Today, users can "lift-and-shift" unmodified applications into modern, VM-based Trusted Execution Environments (TEEs) in order to gain hardware-based security guarantees. |
| Can Large Language Models Predict Parallel Code Performance? | Gregory Bolet, Giorgis Georgakoudis, Harshitha Menon, Konstantinos Parasyris, Niranjan Hasabnis, Hayden Estes, Kirk W. Cameron, Gal Oren | 2025-05-06 | Download | Accurate determination of the performance of parallel GPU code typically requires execution-time profiling on target hardware -- an increasingly prohibitive step due to limited access to high-end GPUs... |
| Decentralized Distributed Proximal Policy Optimization (DD-PPO) for High Performance Computing Scheduling on Multi-User Systems | Matthew Sgambati, Aleksandar Vakanski, Matthew Anderson | 2025-05-06 | Download | Resource allocation in High Performance Computing (HPC) environments presents a complex and multifaceted challenge for job scheduling algorithms. |
| MARCO: Multi-Agent Code Optimization with Real-Time Knowledge Integration for High-Performance Computing | Asif Rahman, Veljko Cvetkovic, Kathleen Reece, Aidan Walters, Yasir Hassan, Aneesh Tummeti, Bryan Torres, Denise Cooney, Margaret Ellis, Dimitrios S. Nikolopoulos | 2025-05-06 | Download | Large language models (LLMs) have transformed software development through code generation capabilities, yet their effectiveness for high-performance computing (HPC) remains limited. |
| Decentralized Nonconvex Optimization under Heavy-Tailed Noise: Normalization and Optimal Convergence | Shuhua Yu, Dusan Jakovetic, Soummya Kar | 2025-05-06 | Download | Heavy-tailed noise in nonconvex stochastic optimization has garnered increasing research interest, as empirical studies, including those on training attention models, suggest it is a more realistic gr... |
| Revisiting Lower Bounds for Two-Step Consensus | Fedor Ryabinin, Alexey Gotsman, Pierre Sutra | 2025-05-06 | Download | A seminal result by Lamport shows that at least $\max\{2e+f+1, 2f+1\}$ processes are required to implement partially synchronous consensus that tolerates $f$ process failures and can furthermore decide... |
| TailBench++: Flexible Multi-Client, Multi-Server Benchmarking for Latency-Critical Workloads | Zhilin Li, Lucia Pons, Salvador Petit, Julio Sahuquillo, Julio Pons | 2025-05-06 | Download | Cloud systems have rapidly expanded worldwide in the last decade, shifting computational tasks to cloud servers where clients submit their requests. |
| A Hashgraph-Inspired Consensus Mechanism for Reliable Multi-Model Reasoning | Kolawole E. Ogunsina, Morayo A. Ogunsina | 2025-05-06 | Download | Inconsistent outputs and hallucinations from large language models (LLMs) are major obstacles to reliable AI systems. When different proprietary reasoning models (RMs), such as those by OpenAI, Google... |
| Elevating Semantic Exploration: A Novel Approach Utilizing Distributed Repositories | Valerio Bellandi | 2025-05-06 | Download | Centralized and distributed systems are two main approaches to organizing ICT infrastructure, each with its pros and cons. Centralized systems concentrate resources in one location, making management ... |
| The Tensor-Core Beamformer: A High-Speed Signal-Processing Library for Multidisciplinary Use | Leon Oostrum, Bram Veenboer, Ronald Rook, Michael Brown, Pieter Kruizinga, John W. Romein | 2025-05-06 | Download | Beamforming is a well-known technique to combine signals from multiple sensors. It has a wide range of application domains. This paper introduces the Tensor-Core Beamformer: a generic, optimized beamf... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Terahertz Spatial Wireless Channel Modeling with Radio Radiance Field | John Song, Lihao Zhang, Feng Ye, Haijian Sun | 2025-05-06 | Download | Terahertz (THz) communication is a key enabler for 6G systems, offering ultra-wide bandwidth and unprecedented data rates. However, THz signal propagation differs significantly from lower-frequency ba... |
| Intelligent Load Balancing Systems using Reinforcement Learning System | Raju Singh | 2025-05-06 | Download | Load Balancing is a fundamental technology for scaling cloud infrastructure. It enables systems to distribute incoming traffic across backend servers using predefined algorithms such as round robin, w... |
| Hybrid Quantum-Classical Maximum-Likelihood Detection via Grover-based Adaptive Search for RIS-assisted Broadband Wireless Systems | Maryam Tariq, Raneem Abdelrahim, Omar Alhussein, Sami Muhaidat | 2025-05-06 | Download | The escalating complexity and stringent performance demands of sixth-generation wireless systems necessitate advanced signal processing methods capable of simultaneously achieving high spectral effici... |
| Minimum Congestion Routing of Unsplittable Flows in Data-Center Networks | Miguel Ferreira, Nirav Atre, Justine Sherry, Michael Dinitz, João Luís Sobrinho | 2025-05-06 | Download | Millions of flows are routed concurrently through a modern data-center. These networks are often built as Clos topologies, and flow demands are constrained only by the link capacities at the ingress a... |
| Task-Oriented Multimodal Token Transmission in Resource-Constrained Multiuser Networks | Junhe Zhang, Wanli Ni, Pengwei Wang, Dongyu Wang | 2025-05-06 | Download | With the emergence of large model-based agents, widely adopted transformer-based architectures inevitably produce excessively long token embeddings for transmission, which may result in high bandwidth... |
| Multi-Agent Reinforcement Learning Scheduling to Support Low Latency in Teleoperated Driving | Giacomo Avanzi, Marco Giordani, Michele Zorzi | 2025-05-06 | Download | The teleoperated driving (TD) scenario comes with stringent Quality of Service (QoS) communication constraints, especially in terms of end-to-end (E2E) latency and reliability. |
| Advancing Remote and Continuous Cardiovascular Patient Monitoring through a Novel and Resource-efficient IoT-Driven Framework | Sanam Nayab, Sohail Raza Chohan, Aqsa Jameel, Syed Rehan Shah, Syed Ahsan Masud Zaidi, Aditya Nath Jha, Kamran Siddique | 2025-05-06 | Download | Cardiovascular diseases are a leading cause of fatalities worldwide, often occurring suddenly with limited time for intervention. Current healthcare monitoring systems for cardiac patients rely heavil... |
| Efficient Wi-Fi Sensing for IoT Forensics with Lossy Compression of CSI Data | Paolo Cerutti, Fabio Palmese, Marco Cominelli, Alessandro E. C. Redondi | 2025-05-06 | Download | Wi-Fi sensing is an emerging technology that uses channel state information (CSI) from ambient Wi-Fi signals to monitor human activity without the need for dedicated sensors. |
| A Trustworthy Multi-LLM Network: Challenges, Solutions, and A Use Case | Haoxiang Luo, Gang Sun, Yinqiu Liu, Dusit Niyato, Hongfang Yu, Mohammed Atiquzzaman, Schahram Dustdar | 2025-05-06 | Download | Large Language Models (LLMs) demonstrate strong potential across a variety of tasks in communications and networking due to their advanced reasoning capabilities. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving | Shan Yu, Jiarong Xing, Yifan Qiao, Mingyuan Ma, Yangmin Li, Yang Wang, Shuo Yang, Zhiqiang Xie, Shiyi Cao, Ke Bao, Ion Stoica, Harry Xu, Ying Sheng | 2025-05-06 | Download | Serving large language models (LLMs) is expensive, especially for providers hosting many models, making cost reduction essential. The unique workload patterns of serving multiple LLMs (i.e. |
| Can Large Language Models Predict Parallel Code Performance? | Gregory Bolet, Giorgis Georgakoudis, Harshitha Menon, Konstantinos Parasyris, Niranjan Hasabnis, Hayden Estes, Kirk W. Cameron, Gal Oren | 2025-05-06 | Download | Accurate determination of the performance of parallel GPU code typically requires execution-time profiling on target hardware -- an increasingly prohibitive step due to limited access to high-end GPUs... |
| Minimum Congestion Routing of Unsplittable Flows in Data-Center Networks | Miguel Ferreira, Nirav Atre, Justine Sherry, Michael Dinitz, João Luís Sobrinho | 2025-05-06 | Download | Millions of flows are routed concurrently through a modern data-center. These networks are often built as Clos topologies, and flow demands are constrained only by the link capacities at the ingress a... |
| Benchmark-based Study of CPU/GPU Power-Related Features through JAX and TensorFlow | Roblex Nana Tchakoute, Claude Tadonki, Petr Dokladal, Youssef Mesri | 2025-05-06 | Download | Power management has become a crucial focus in the modern computing landscape, considering that energy is increasingly recognized as a critical resource. |
| TeleEval-OS: Performance evaluations of large language models for operations scheduling | Yanyan Wang, Yingying Wang, Junli Liang, Yin Xu, Yunlong Liu, Yiming Xu, Zhengwang Jiang, Zhehe Li, Fei Li, Long Zhao, Kuang Xu, Qi Song, Xiangyang Li | 2025-05-06 | Download | The rapid advancement of large language models (LLMs) has significantly propelled progress in artificial intelligence, demonstrating substantial application potential across multiple specialized domai... |