2025-05-19

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Genesis: A Compiler Framework for Hamiltonian Simulation on Hybrid CV-DV Quantum Computers | Zihan Chen, Jiakang Li, Minghao Guo, Henry Chen, Zirui Li, Joel Bierman, Yipeng Huang, Huiyang Zhou, Yuan Liu, Eddy Z. Zhang | 2025-05-19 | Download | This paper introduces Genesis, the first compiler designed to support Hamiltonian Simulation on hybrid continuous-variable (CV) and discrete-variable (DV) quantum computing systems. |
| Introducing Instruction-Accurate Simulators for Performance Estimation of Autotuning Workloads | Rebecca Pelke, Nils Bosbach, Lennart M. Reimann, Rainer Leupers | 2025-05-19 | Download | Accelerating Machine Learning (ML) workloads requires efficient methods due to their large optimization space. Autotuning has emerged as an effective approach for systematically evaluating variations ... |
| MXDOTP: A RISC-V ISA Extension for Enabling Microscaling (MX) Floating-Point Dot Products | Gamze İslamoğlu, Luca Bertaccini, Arpan Suravi Prasad, Francesco Conti, Angelo Garofalo, Luca Benini | 2025-05-19 | Download | Fast and energy-efficient low-bitwidth floating-point (FP) arithmetic is essential for Artificial Intelligence (AI) systems. Microscaling (MX) standardized formats have recently emerged as a promising... |
| PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures | Dongjae Lee, Bongjoon Hyun, Youngjin Kwon, Minsoo Rhu | 2025-05-19 | Download | The ability to dynamically allocate memory is fundamental in modern programming languages. However, this feature is not adequately supported in current general-purpose PIM devices. |
| Addressing memory bandwidth scalability in vector processors for streaming applications | Jordi Altayo, Paul Delestrac, David Novo, Simey Yang, Debjyoti Bhattacharjee, Francky Catthoor | 2025-05-19 | Download | As the size of artificial intelligence and machine learning (AI/ML) models and datasets grows, the memory bandwidth becomes a critical bottleneck. |
| 2T1R Regulated Memristor Conductance Control Array Architecture for Neuromorphic Computing using 28nm CMOS Technology | Neethu Kuriakose, Arun Ashok, Christian Grewing, André Zambanini, Stefan van Waasen | 2025-05-19 | Download | Memristors are promising devices for scalable and low power, in-memory computing to improve the energy efficiency of a rising computational demand. |
| FireFly-T: High-Throughput Sparsity Exploitation for Spiking Transformer Acceleration with Dual-Engine Overlay Architecture | Tenglong Li, Jindong Li, Guobin Shen, Dongcheng Zhao, Qian Zhang, Yi Zeng | 2025-05-19 | Download | Spiking transformers are emerging as a promising architecture that combines the energy efficiency of Spiking Neural Networks (SNNs) with the powerful attention mechanisms of transformers. |
| Sandwich: Separating Prefill-Decode Compilation for Efficient CPU LLM Serving | Juntao Zhao, Jiuru Li, Chuan Wu | 2025-05-19 | Download | Utilizing CPUs to serve large language models (LLMs) is a resource-friendly alternative to GPU serving. Existing CPU-based solutions ignore workload differences between the prefill and the decode phas... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Eudoxia: a FaaS scheduling simulator for the composable lakehouse | Tapan Srivastava, Jacopo Tagliabue, Ciro Greco | 2025-05-19 | Download | Due to the variety of its target use cases and the large API surface area to cover, a data lakehouse (DLH) is a natural candidate for a composable data system. |
| Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training and Inference | Shuqing Luo, Pingzhi Li, Jie Peng, Hanrui Wang, Yang Zhao, Yu Cao, Yu Cheng, Tianlong Chen | 2025-05-19 | Download | Mixture-of-experts (MoE) architectures could achieve impressive computational efficiency with expert parallelism, which relies heavily on all-to-all communication across devices. |
| SVAFD: A Secure and Verifiable Co-Aggregation Protocol for Federated Distillation | Tian Wen, Sheng Sun, Yuwei Wang, Peiyan Chen, Zhiyuan Wu, Min Liu, Bo Gao | 2025-05-19 | Download | Secure Aggregation (SA) is an indispensable component of Federated Learning (FL) that concentrates on privacy preservation while allowing for robust aggregation. |
| eBPF-Based Instrumentation for Generalisable Diagnosis of Performance Degradation | Diogo Landau, Jorge Barbosa, Nishant Saurabh | 2025-05-19 | Download | Online Data Intensive applications (e.g. message brokers, ML inference and databases) are core components of the modern internet, providing critical functionalities to connecting services. |
| Prink: $k_s$-Anonymization for Streaming Data in Apache Flink | Philip Groneberg, Saskia Nuñez von Voigt, Thomas Janke, Louis Loechel, Karl Wolf, Elias Grünewald, Frank Pallas | 2025-05-19 | Download | In this paper, we present Prink, a novel and practically applicable concept and fully implemented prototype for $k_s$-anonymizing data streams in real-world application architectures. |
| Computing the Schulze Method for Large-Scale Preference Data Sets | Theresa Csar, Martin Lackner, Reinhard Pichler | 2025-05-19 | Download | The Schulze method is a voting rule widely used in practice and enjoys many positive axiomatic properties. While it is computable in polynomial time, its straight-forward implementation does not scale... |
| Minos: Exploiting Cloud Performance Variation with Function-as-a-Service Instance Selection | Trever Schirmer, Natalie Carl, Nils Höller, Tobias Pfandzelter, David Bermbach | 2025-05-19 | Download | Serverless Function-as-a-Service (FaaS) is a popular cloud paradigm to quickly and cheaply implement complex applications. Because the function instances cloud providers start to execute user code run... |
| Optimization of Hybrid Quantum-Classical Algorithms | Lian Remme, Alexander Weinert, Andre Waschk | 2025-05-19 | Download | Quantum computers do not run in isolation; rather, they are embedded in quantum-classical hybrid architectures. In these setups, a quantum processing unit communicates with a classical device in near-... |
| Performance Characterization of Distributed Deep Learning Strategies: A Quantitative Evaluation of DDP, FSDP, and Parameter Server Architectures on GPU Clusters | Md Sultanul Islam Ovi | 2025-05-19 | Download | Efficiently scaling deep neural networks across GPU clusters requires navigating complex trade-offs between computational throughput, memory utilization, and synchronization overhead. |
| Learning In Chaos: Efficient Autoscaling and Self-Healing for Multi-Party Distributed Training | Wenjiao Feng, Rongxing Xiao, Zonghang Li, Hongfang Yu, Gang Sun, Long Luo, Mohsen Guizani, Qirong Ho, Steve Liu | 2025-05-19 | Download | Node and link churn in multi-party, cross-region clusters over wide-area networks (WANs) often disrupts distributed training. However, checkpoint-based recovery and cloud-centric autoscaling react slo... |
| Sandwich: Separating Prefill-Decode Compilation for Efficient CPU LLM Serving | Juntao Zhao, Jiuru Li, Chuan Wu | 2025-05-19 | Download | Utilizing CPUs to serve large language models (LLMs) is a resource-friendly alternative to GPU serving. Existing CPU-based solutions ignore workload differences between the prefill and the decode phas... |
| MTGenRec: An Efficient Distributed Training System for Generative Recommendation Models in Meituan | Yuxiang Wang, Xiao Yan, Chi Ma, Mincong Huang, Xiaoguang Li, Lei Yu, Chuan Liu, Ruidong Han, He Jiang, Bin Yin, Shangyu Chen, Fei Jiang, Xiang Li, Wei Lin, Haowei Han, Bo Du, Jiawei Jiang | 2025-05-19 | Download | Recommendation is crucial for both user experience and company revenue in Meituan as a leading lifestyle company, and generative recommendation models (GRMs) are shown to produce quality recommendatio... |
| Digital Twins in the Cloud: A Modular, Scalable and Interoperable Framework for Accelerating Verification and Validation of Autonomous Driving Solutions | Tanmay Vilas Samak, Chinmay Vilas Samak, Giovanni Martino, Pranav Nair, Venkat Krovi | 2025-05-19 | Download | Verification and validation (V&V) of autonomous vehicles (AVs) typically requires exhaustive testing across a variety of operating environments and driving scenarios including rare, extreme, or hazard... |
| HydraInfer: Hybrid Disaggregated Scheduling for Multimodal Large Language Model Serving | Xianzhe Dong, Tongxuan Liu, Yuting Zeng, Liangyu Liu, Yang Liu, Siyu Wu, Yu Wu, Hailong Yang, Ke Zhang, Jing Li | 2025-05-19 | Download | Multimodal Large Language Models (MLLMs) have been rapidly advancing, enabling cross-modal understanding and generation, and propelling artificial intelligence towards artificial general intelligence. |
| Quantum Modeling of Spatial Contiguity Constraints | Yunhan Chang, Amr Magdy, Federico M. Spedalieri | 2025-05-19 | Download | Quantum computing has demonstrated potential for solving complex optimization problems; however, its application to spatial regionalization remains underexplored. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Incremental Firmware Update Over-the-Air for Low-Power IoT Devices over LoRaWAN | Andrea De Simone, Giovanna Turvani, Fabrizio Riente | 2025-05-19 | Download | Efficiently supporting remote firmware updates in Internet of Things (IoT) devices remains a significant challenge due to the limitations of many IoT communication protocols, which often make it impra... |
| Sionna Research Kit: A GPU-Accelerated Research Platform for AI-RAN | Sebastian Cammerer, Guillermo Marcus, Tobias Zirr, Fayçal Aït Aoudia, Lorenzo Maggi, Jakob Hoydis, Alexander Keller | 2025-05-19 | Download | We introduce the NVIDIA Sionna Research Kit, a GPU-accelerated research platform for developing and testing AI/ML algorithms in 5G NR cellular networks. |
| Learning Driven Elastic Task Multi-Connectivity Immersive Computing Systems | Babak Badnava, Jacob Chakareski, Morteza Hashemi | 2025-05-19 | Download | In virtual reality (VR) environments, computational tasks exhibit an elastic nature, meaning they can dynamically adjust based on various user and system constraints. |
| Graph Neural Networks Based Anomalous RSSI Detection | Blaž Bertalanič, Matej Vnučec, Carolina Fortuna | 2025-05-19 | Download | In today's world, modern infrastructures are being equipped with information and communication technologies to create large IoT networks. It is essential to monitor these networks to ensure smooth o... |
| Forewarned is Forearmed: A Survey on Large Language Model-based Agents in Autonomous Cyberattacks | Minrui Xu, Jiani Fan, Xinyu Huang, Conghao Zhou, Jiawen Kang, Dusit Niyato, Shiwen Mao, Zhu Han, Xuemin Shen, Kwok-Yan Lam | 2025-05-19 | Download | With the continuous evolution of Large Language Models (LLMs), LLM-based agents have advanced beyond passive chatbots to become autonomous cyber entities capable of performing complex tasks, including... |
| Confidence-Regulated Generative Diffusion Models for Reliable AI Agent Migration in Vehicular Metaverses | Yingkai Kang, Jiawen Kang, Jinbo Wen, Tao Zhang, Zhaohui Yang, Dusit Niyato, Yan Zhang | 2025-05-19 | Download | Vehicular metaverses are an emerging paradigm that merges intelligent transportation systems with virtual spaces, leveraging advanced digital twin and Artificial Intelligence (AI) technologies to seam... |
| An Automated Blackbox Noncompliance Checker for QUIC Server Implementations | Kian Kai Ang, Guy Farrelly, Cheryl Pope, Damith C. Ranasinghe | 2025-05-19 | Download | We develop QUICtester, an automated approach for uncovering non-compliant behaviors in the ratified QUIC protocol implementations (RFC 9000/9001). |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Testing Access-Control Configuration Changes for Web Applications | Chengcheng Xiang, Li Zhong, Eric Mugnier, Nathaniel Nguyen, Yuanyuan Zhou, Tianyin Xu | 2025-05-19 | Download | Access-control misconfigurations are among the main causes of today's data breaches in web applications. However, few techniques are available to support automatic and systematic testing for access-co... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Bayesian Hierarchical Models for Quantitative Estimates for Performance metrics applied to Saddle Search Algorithms | Rohit Goswami | 2025-05-19 | Download | Rigorous performance evaluation is essential for developing robust algorithms for high-throughput computational chemistry. Traditional benchmarking, however, often struggles to account for system-spec... |
| SzCORE as a benchmark: report from the seizure detection challenge at the 2025 AI in Epilepsy and Neurological Disorders Conference | Jonathan Dan, Amirhossein Shahbazinia, Christodoulos Kechris, David Atienza | 2025-05-19 | Download | Reliable automatic seizure detection from long-term EEG remains a challenge, as current machine learning models often fail to generalize across patients or clinical settings. |
| Net-Zero: A Comparative Study on Neural Network Design for Climate-Economic PDEs Under Uncertainty | Carlos Rodriguez-Pardo, Louis Daumas, Leonardo Chiani, Massimo Tavoni | 2025-05-19 | Download | Climate-economic modeling under uncertainty presents significant computational challenges that may limit policymakers' ability to address climate change effectively. |
| eBPF-Based Instrumentation for Generalisable Diagnosis of Performance Degradation | Diogo Landau, Jorge Barbosa, Nishant Saurabh | 2025-05-19 | Download | Online Data Intensive applications (e.g. message brokers, ML inference and databases) are core components of the modern internet, providing critical functionalities to connecting services. |
| Effects of the Auto-Correlation of Delays on the Age of Information: A Gaussian Process Framework | Atsushi Inoie, Yoshiaki Inoue | 2025-05-19 | Download | The age of information (AoI) has been studied actively in recent years as a performance measure for systems that require real-time performance, such as remote monitoring systems via communication netw... |