
2025-07-01

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| CarbonClarity: Understanding and Addressing Uncertainty in Embodied Carbon for Sustainable Computing | Xuesi Chen, Leo Han, Anvita Bhagavathula, Udit Gupta | 2025-07-01 | Download | Embodied carbon footprint modeling has become an area of growing interest due to its significant contribution to carbon emissions in computing. |
| How Fast Can Graph Computations Go on Fine-grained Parallel Architectures | Yuqing Wang, Charles Colley, Brian Wheatman, Jiya Su, David F. Gleich, Andrew A. Chien | 2025-07-01 | Download | Large-scale graph problems are of critical and growing importance and historically parallel architectures have provided little support. In the spirit of co-design, we explore the question, How fast ca... |
| RaGNNarok: A Light-Weight Graph Neural Network for Enhancing Radar Point Clouds on Unmanned Ground Vehicles | David Hunt, Shaocheng Luo, Spencer Hallyburton, Shafii Nillongo, Yi Li, Tingjun Chen, Miroslav Pajic | 2025-07-01 | Download | Low-cost indoor mobile robots have gained popularity with the increasing adoption of automation in homes and commercial spaces. However, existing lidar and camera-based solutions have limitations such... |
| A New Family of Thread to Core Allocation Policies for an SMT ARM Processor | Marta Navarro, Josué Feliu, Salvador Petit, María E. Gómez, Julio Sahuquillo | 2025-07-01 | Download | Modern high-performance servers commonly integrate Simultaneous Multithreading (SMT) processors, which efficiently boosts throughput over single-threaded cores. |
| VEDA: Efficient LLM Generation Through Voting-based KV Cache Eviction and Dataflow-flexible Accelerator | Zhican Wang, Hongxiang Fan, Haroon Waris, Gang Wang, Zhenyu Li, Jianfei Jiang, Yanan Sun, Guanghui He | 2025-07-01 | Download | Large Language Models (LLMs) excel in natural language processing tasks but pose significant computational and memory challenges for edge deployment due to their intensive resource demands. |
| ChatHLS: Towards Systematic Design Automation and Optimization for High-Level Synthesis | Runkai Li, Jia Xiong, Xiuyuan He, Jiaqi Lv, Jieru Zhao, Xi Wang | 2025-07-01 | Download | The increasing complexity of computational demands has spurred the adoption of domain-specific accelerators, yet traditional hardware design methodologies remain constrained by prolonged development a... |
| Presto: Hardware Acceleration of Ciphers for Hybrid Homomorphic Encryption | Yeonsoo Jeon, Mattan Erez, Michael Orshansky | 2025-07-01 | Download | Hybrid Homomorphic Encryption (HHE) combines symmetric key and homomorphic encryption to reduce ciphertext expansion crucial in client-server deployments of HE. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Capacity Planning and Scheduling for Jobs with Uncertainty in Resource Usage and Duration | Sunandita Patra, Mehtab Pathan, Mahmoud Mahfouz, Parisa Zehtabi, Wided Ouaja, Daniele Magazzeni, Manuela Veloso | 2025-07-01 | Download | Organizations around the world schedule jobs (programs) regularly to perform various tasks dictated by their end users. With the major movement towards using a cloud computing infrastructure, our orga... |
| FLARE: A Dataflow-Aware and Scalable Hardware Architecture for Neural-Hybrid Scientific Lossy Compression | Wenqi Jia, Ying Huang, Jian Xu, Zhewen Hu, Sian Jin, Jiannan Tian, Yuede Ji, Miao Yin | 2025-07-01 | Download | Scientific simulation leveraging high-performance computing (HPC) systems is crucial for modeling complex systems and phenomena in fields such as astrophysics, climate science, and fluid dynamics, gen... |
| Stannic: Systolic STochAstic ONliNe SchedulIng AcCelerator | Adam H. Ross, Vairavan Palaniappan, Debjit Pal | 2025-07-01 | Download | Efficient workload scheduling is a critical challenge in modern heterogeneous computing environments, particularly in high-performance computing (HPC) systems. |
| Efficient Gate Reordering for Distributed Quantum Compiling in Data Centers | Riccardo Mengoni, Walter Nadalin, Mathys Rennela, Jimmy Rotureau, Tom Darras, Julien Laurat, Eleni Diamanti, Ioannis Lavdas | 2025-07-01 | Download | Just as classical computing relies on distributed systems, the quantum computing era requires new kinds of infrastructure and software tools. Quantum networks will become the backbone of hybrid, quant... |
| How Fast Can Graph Computations Go on Fine-grained Parallel Architectures | Yuqing Wang, Charles Colley, Brian Wheatman, Jiya Su, David F. Gleich, Andrew A. Chien | 2025-07-01 | Download | Large-scale graph problems are of critical and growing importance and historically parallel architectures have provided little support. In the spirit of co-design, we explore the question, How fast ca... |
| Turning AI Data Centers into Grid-Interactive Assets: Results from a Field Demonstration in Phoenix, Arizona | Philip Colangelo, Ayse K. Coskun, Jack Megrue, Ciaran Roberts, Shayan Sengupta, Varun Sivaram, Ethan Tiao, Aroon Vijaykar, Chris Williams, Daniel C. Wilson, Zack MacFarland, Daniel Dreiling, Nathan Morey, Anuja Ratnayake, Baskar Vairamohan | 2025-07-01 | Download | Artificial intelligence (AI) is fueling exponential electricity demand growth, threatening grid reliability, raising prices for communities paying for new energy infrastructure, and stunting AI innova... |
| A New Family of Thread to Core Allocation Policies for an SMT ARM Processor | Marta Navarro, Josué Feliu, Salvador Petit, María E. Gómez, Julio Sahuquillo | 2025-07-01 | Download | Modern high-performance servers commonly integrate Simultaneous Multithreading (SMT) processors, which efficiently boosts throughput over single-threaded cores. |
| yProv4ML: Effortless Provenance Tracking for Machine Learning Systems | Gabriele Padovani, Valentine Anantharaj, Sandro Fiore | 2025-07-01 | Download | The rapid growth of interest in large language models (LLMs) reflects their potential for flexibility and generalization, and attracted the attention of a diverse range of researchers. |
| PANDAS: Peer-to-peer, Adaptive Networking for Data Availability Sampling within Ethereum Consensus Timebounds | Matthieu Pigaglio, Onur Ascigil, Michał Król, Sergi Rene, Felix Lange, Kaleem Peeroo, Ramin Sadre, Vladimir Stankovic, Etienne Rivière | 2025-07-01 | Download | Layer-2 protocols can assist Ethereum's limited throughput, but globally broadcasting layer-2 data limits their scalability. The Danksharding evolution of Ethereum aims to support the selective distri... |
| Provenance Tracking in Large-Scale Machine Learning Systems | Gabriele Padovani, Valentine Anantharaj, Sandro Fiore | 2025-07-01 | Download | As the demand for large scale AI models continues to grow, the optimization of their training to balance computational efficiency, execution time, accuracy and energy consumption represents a critical... |
| Safe Low Bandwidth SPV: A Formal Treatment of Simplified Payment Verification Protocols and Security Bounds | Craig S Wright | 2025-07-01 | Download | This paper presents a complete formal specification, protocol description, and mathematical proof structure for Simplified Payment Verification (SPV) as originally defined in the Bitcoin whitepaper \c... |
| Accelerating Loading WebGraphs in ParaGrapher | Mohsen Koohi Esfahani | 2025-07-01 | Download | ParaGrapher is a graph loading API and library that enables graph processing frameworks to load large-scale compressed graphs with minimal overhead. |
| Toward Edge General Intelligence with Multiple-Large Language Model (Multi-LLM): Architecture, Trust, and Orchestration | Haoxiang Luo, Yinqiu Liu, Ruichen Zhang, Jiacheng Wang, Gang Sun, Dusit Niyato, Hongfang Yu, Zehui Xiong, Xianbin Wang, Xuemin Shen | 2025-07-01 | Download | Edge computing enables real-time data processing closer to its source, thus improving the latency and performance of edge-enabled AI applications. |
| DynoStore: A wide-area distribution system for the management of data over heterogeneous storage | Dante D. Sanchez-Gallegos, J. L. Gonzalez-Compean, Maxime Gonthier, Valerie Hayot-Sasson, J. Gregory Pauloski, Haochen Pan, Kyle Chard, Jesus Carretero, Ian Foster | 2025-07-01 | Download | Data distribution across different facilities offers benefits such as enhanced resource utilization, increased resilience through replication, and improved performance by processing data near its sour... |
| Collaborative Multi-Agent Reinforcement Learning Approach for Elastic Cloud Resource Scaling | Bruce Fang, Danyi Gao | 2025-07-01 | Download | This paper addresses the challenges of rapid resource variation and highly uncertain task loads in cloud computing environments. It proposes an optimization method for elastic cloud resource scaling b... |
| Edge Computing and its Application in Robotics: A Survey | Nazish Tahir, Ramviyas Parasuraman | 2025-07-01 | Download | The Edge computing paradigm has gained prominence in both academic and industry circles in recent years. By implementing edge computing facilities and services in robotics, it becomes a key enabler in... |
| Towards Resource-Efficient Serverless LLM Inference with SLINFER | Chuhao Xu, Zijun Li, Quan Chen, Han Zhao, Xueyan Tang, Minyi Guo | 2025-07-01 | Download | The rise of LLMs has driven demand for private serverless deployments, characterized by moderate-sized models and infrequent requests. While existing serverless solutions follow exclusive GPU allocati... |
| Real-Time In-Network Machine Learning on P4-Programmable FPGA SmartNICs with Fixed-Point Arithmetic and Taylor | Mohammad Firas Sada, John J. Graham, Mahidhar Tatineni, Dmitry Mishin, Thomas A. DeFanti, Frank Würthwein | 2025-07-01 | Download | As machine learning (ML) applications become integral to modern network operations, there is an increasing demand for network programmability that enables low-latency ML inference for tasks such as Qu... |
| Find a Scapegoat: Poisoning Membership Inference Attack and Defense to Federated Learning | Wenjin Mo, Zhiyuan Li, Minghong Fang, Mingwei Fang | 2025-07-01 | Download | Federated learning (FL) allows multiple clients to collaboratively train a global machine learning model with coordination from a central server, without needing to share their raw data. |
| Serving LLMs in HPC Clusters: A Comparative Study of Qualcomm Cloud AI 100 Ultra and NVIDIA Data Center GPUs | Mohammad Firas Sada, John J. Graham, Elham E Khoda, Mahidhar Tatineni, Dmitry Mishin, Rajesh K. Gupta, Rick Wagner, Larry Smarr, Thomas A. DeFanti, Frank Würthwein | 2025-07-01 | Download | This study presents a benchmarking analysis of the Qualcomm Cloud AI 100 Ultra (QAic) accelerator for large language model (LLM) inference, evaluating its energy efficiency (throughput per watt), perf... |
| HelixPipe: Efficient Distributed Training of Long Sequence Transformers with Attention Parallel Pipeline Parallelism | Geng Zhang, Shenggan Cheng, Xuanlei Zhao, Ziming Liu, Yang You | 2025-07-01 | Download | As transformer sequence lengths grow, existing pipeline parallelisms incur suboptimal performance due to the quadratic attention computation and the substantial memory overhead. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| A Full-Stack Platform Architecture for Self-Organised Social Coordination | Matthew Scott, Jeremy Pitt | 2025-07-01 | Download | To mitigate the restrictive centralising and monopolistic tendencies of platformisation, we aim to empower local communities by democratising platforms for self-organised social coordination. |
| QUIC Delay Control: an implementation of congestion and delay control | Saverio Mascolo, Andrea Vittorio Balillo, Gioacchino Manfredi, Davide D'Agostino, Luca De Cicco | 2025-07-01 | Download | A new congestion and delay control algorithm named QUIC Delay Control (QUIC-DC) is proposed for controlling not only congestion but also the queueing delay encountered along the forward communication ... |
| Enhancing Vehicular Platooning with Wireless Federated Learning: A Resource-Aware Control Framework | Beining Wu, Jun Huang, Qiang Duan, Liang Dong, Zhipeng Cai | 2025-07-01 | Download | This paper aims to enhance the performance of Vehicular Platooning (VP) systems integrated with Wireless Federated Learning (WFL). In highly dynamic environments, vehicular platoons experience frequen... |
| Stealtooth: Breaking Bluetooth Security Abusing Silent Automatic Pairing | Keiichiro Kimura, Hiroki Kuzuno, Yoshiaki Shiraishi, Masakatu Morii | 2025-07-01 | Download | Bluetooth is a pervasive wireless communication technology used by billions of devices for short-range connectivity. The security of Bluetooth relies on the pairing process, where devices establish sh... |
| PANDAS: Peer-to-peer, Adaptive Networking for Data Availability Sampling within Ethereum Consensus Timebounds | Matthieu Pigaglio, Onur Ascigil, Michał Król, Sergi Rene, Felix Lange, Kaleem Peeroo, Ramin Sadre, Vladimir Stankovic, Etienne Rivière | 2025-07-01 | Download | Layer-2 protocols can assist Ethereum's limited throughput, but globally broadcasting layer-2 data limits their scalability. The Danksharding evolution of Ethereum aims to support the selective distri... |
| Toward Edge General Intelligence with Multiple-Large Language Model (Multi-LLM): Architecture, Trust, and Orchestration | Haoxiang Luo, Yinqiu Liu, Ruichen Zhang, Jiacheng Wang, Gang Sun, Dusit Niyato, Hongfang Yu, Zehui Xiong, Xianbin Wang, Xuemin Shen | 2025-07-01 | Download | Edge computing enables real-time data processing closer to its source, thus improving the latency and performance of edge-enabled AI applications. |
| Remote Rendering for Virtual Reality: performance comparison of multimedia frameworks and protocols | Daniel Mejías, Inhar Yeregui, Roberto Viola, Miguel Fernández, Mario Montagud | 2025-07-01 | Download | The increasing complexity of Extended Reality (XR) applications demands substantial processing power and high bandwidth communications, often unavailable on lightweight devices. |
| Towards a Playground to Democratize Experimentation and Benchmarking of AI Agents for Network Troubleshooting | Zhihao Wang, Alessandro Cornacchia, Franco Galante, Carlo Centofanti, Alessio Sacco, Dingde Jiang | 2025-07-01 | Download | Recent research has demonstrated the effectiveness of Artificial Intelligence (AI), and more specifically, Large Language Models (LLMs), in supporting network configuration synthesis and automating ne... |
| Edge Computing and its Application in Robotics: A Survey | Nazish Tahir, Ramviyas Parasuraman | 2025-07-01 | Download | The Edge computing paradigm has gained prominence in both academic and industry circles in recent years. By implementing edge computing facilities and services in robotics, it becomes a key enabler in... |
| Real-Time In-Network Machine Learning on P4-Programmable FPGA SmartNICs with Fixed-Point Arithmetic and Taylor | Mohammad Firas Sada, John J. Graham, Mahidhar Tatineni, Dmitry Mishin, Thomas A. DeFanti, Frank Würthwein | 2025-07-01 | Download | As machine learning (ML) applications become integral to modern network operations, there is an increasing demand for network programmability that enables low-latency ML inference for tasks such as Qu... |
| Seeing Through the Fog: Empowering Mobile Devices to Expose and Mitigate RAN Buffer Effects on Delay-Sensitive Protocols | Yuxin Liu, Tianyang Zhang, Kyle Jamieson, Yaxiong Xie | 2025-07-01 | Download | Delay-based protocols rely on end-to-end delay measurements to detect network congestion. However, in cellular networks, Radio Access Network (RAN) buffers introduce significant delays unrelated to co... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Development and Comparative Evaluation of Three Artificial Intelligence Models (NLP, LLM, JEPA) for Predicting Triage in Emergency Departments: A 7-Month Retrospective Proof-of-Concept | Edouard Lansiaux, Ramy Azzouz, Emmanuel Chazard, Amélie Vromant, Eric Wiel | 2025-07-01 | Download | Emergency departments struggle with persistent triage errors, especially undertriage and overtriage, which are aggravated by growing patient volumes and staff shortages. |
| Turning AI Data Centers into Grid-Interactive Assets: Results from a Field Demonstration in Phoenix, Arizona | Philip Colangelo, Ayse K. Coskun, Jack Megrue, Ciaran Roberts, Shayan Sengupta, Varun Sivaram, Ethan Tiao, Aroon Vijaykar, Chris Williams, Daniel C. Wilson, Zack MacFarland, Daniel Dreiling, Nathan Morey, Anuja Ratnayake, Baskar Vairamohan | 2025-07-01 | Download | Artificial intelligence (AI) is fueling exponential electricity demand growth, threatening grid reliability, raising prices for communities paying for new energy infrastructure, and stunting AI innova... |
| PANDAS: Peer-to-peer, Adaptive Networking for Data Availability Sampling within Ethereum Consensus Timebounds | Matthieu Pigaglio, Onur Ascigil, Michał Król, Sergi Rene, Felix Lange, Kaleem Peeroo, Ramin Sadre, Vladimir Stankovic, Etienne Rivière | 2025-07-01 | Download | Layer-2 protocols can assist Ethereum's limited throughput, but globally broadcasting layer-2 data limits their scalability. The Danksharding evolution of Ethereum aims to support the selective distri... |
| Empirical Analysis Of Heuristic and Approximation Algorithms for the The Mutual-Visibility Problem | Vanja Stojanović, Bor Pangeršič | 2025-07-01 | Download | The NP-complete mutual-visibility (MV) problem currently lacks empirical analysis on its practical behaviour despite theoretical studies. This paper addresses this gap by implementing and evaluating t... |
| Twill: Scheduling Compound AI Systems on Heterogeneous Mobile Edge Platforms | Zain Taufique, Aman Vyas, Antonio Miele, Pasi Liljeberg, Anil Kanduri | 2025-07-01 | Download | Compound AI (cAI) systems chain multiple AI models to solve complex problems. cAI systems are typically composed of deep neural networks (DNNs), transformers, and large language models (LLMs), exhibit... |
