2025-05-27

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Improved Prefetching Techniques for Linked Data Structures | Nikola Vuk Maruszewski | 2025-05-27 | Download | With ever-increasing main memory stall times, we need novel techniques to reduce effective memory access latencies. Prefetching has been shown to be an effective solution, especially with contiguous d... |
| SageAttention2++: A More Efficient Implementation of SageAttention2 | Jintao Zhang, Xiaoming Xu, Jia Wei, Haofeng Huang, Pengle Zhang, Chendong Xiang, Jun Zhu, Jianfei Chen | 2025-05-27 | Download | The efficiency of attention is critical because its time complexity grows quadratically with sequence length. SageAttention2 addresses this by utilizing quantization to accelerate matrix multiplicatio... |
| Static Communication Analysis for Hardware Design | Mads Rosendahl, Maja H. Kirkeby | 2025-05-27 | Download | Hardware acceleration of algorithms is an effective method for improving performance in high-demand computational tasks. However, developing hardware designs for such acceleration fundamentally differ... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Power-Capping Metric Evaluation for Improving Energy Efficiency in HPC Applications | Maria Patrou, Thomas Wang, Wael Elwasif, Markus Eisenbach, Ross Miller, William Godoy, Oscar Hernandez | 2025-05-27 | Download | With high-performance computing systems now running at exascale, optimizing power-scaling management and resource utilization has become more critical than ever. |
| FedCostAware: Enabling Cost-Aware Federated Learning on the Cloud | Aditya Sinha, Zilinghan Li, Tingkai Liu, Volodymyr Kindratenko, Kibaek Kim, Ravi Madduri | 2025-05-27 | Download | Federated learning (FL) is a distributed machine learning (ML) approach that allows multiple clients to collaboratively train ML models without exchanging original training data, offering a solution t... |
| AMSFL: Adaptive Multi-Step Federated Learning via Gradient Difference-Based Error Modeling | Ganglou Xu | 2025-05-27 | Download | Federated learning faces critical challenges in balancing communication efficiency and model accuracy. One key issue lies in the approximation of update errors without incurring high computational cos... |
| STRATUS: A Multi-agent System for Autonomous Reliability Engineering of Modern Clouds | Yinfang Chen, Jiaqi Pan, Jackson Clark, Yiming Su, Noah Zheutlin, Bhavya Bhavya, Rohan Arora, Yu Deng, Saurabh Jha, Tianyin Xu | 2025-05-27 | Download | In cloud-scale systems, failures are the norm. A distributed computing cluster exhibits hundreds of machine failures and thousands of disk failures; software bugs and misconfigurations are reported to... |
| Incentivizing Permissionless Distributed Learning of LLMs | Joel Lidin, Amir Sarfi, Evangelos Pappas, Samuel Dare, Eugene Belilovsky, Jacob Steeves | 2025-05-27 | Download | We describe an incentive system for distributed deep learning of foundational models where peers are rewarded for contributions. The incentive system, Gauntlet, has been deployed on the bitte... |
| KPerfIR: Towards an Open and Compiler-centric Ecosystem for GPU Kernel Performance Tooling on Modern AI Workloads | Yue Guan, Yuanwei Fang, Keren Zhou, Corbin Robeck, Manman Ren, Zhongkai Yu, Yufei Ding, Adnan Aziz | 2025-05-27 | Download | In this work, we propose KPerfIR, a novel multilevel compiler-centric infrastructure to enable the development of customizable, extendable, and portable profiling tools tailored for modern artificial ... |
| Fast and Cost-effective Speculative Edge-Cloud Decoding with Early Exits | Yeshwanth Venkatesha, Souvik Kundu, Priyadarshini Panda | 2025-05-27 | Download | Large Language Models (LLMs) enable various applications on edge devices such as smartphones, wearables, and embodied robots. However, their deployment often depends on expensive cloud-based APIs, cre... |
| Big Data-Driven Fraud Detection Using Machine Learning and Real-Time Stream Processing | Chen Liu, Hengyu Tang, Zhixiao Yang, Ke Zhou, Sangwhan Cha | 2025-05-27 | Download | In the age of digital finance, detecting fraudulent transactions and money laundering is critical for financial institutions. This paper presents a scalable and efficient solution using Big Data tools... |
| Distributed Discrete Morse Sandwich: Efficient Computation of Persistence Diagrams for Massive Scalar Data | Eve Le Guillou, Pierre Fortin, Julien Tierny | 2025-05-27 | Download | The persistence diagram, which describes the topological features of a dataset, is a key descriptor in Topological Data Analysis. The "Discrete Morse Sandwich" (DMS) method has been reported to be the... |
| Multi-Event Triggers for Serverless Computing | Natalie Carl, Trever Schirmer, Niklas Kowallik, Joshua Adamek, Tobias Pfandzelter, Sergio Lucia, David Bermbach | 2025-05-27 | Download | Function-as-a-Service (FaaS) is an event-driven serverless cloud computing model in which small, stateless functions are invoked in response to events, such as HTTP requests, new database entries, or ... |
| Vectorized Sequence-Based Chunking for Data Deduplication | Sreeharsha Udayashankar, Samer Al-Kiswany | 2025-05-27 | Download | Data deduplication has gained wide acclaim as a mechanism to improve storage efficiency and conserve network bandwidth. Its most critical phase, data chunking, is responsible for the overall space sav... |
| Constructive community race: full-density spiking neural network model drives neuromorphic computing | Johanna Senk, Anno C. Kurth, Steve Furber, Tobias Gemmeke, Bruno Golosio, Arne Heittmann, James C. Knight, Eric Müller, Tobias Noll, Thomas Nowotny, Gorka Peraza Coppola, Luca Peres, Oliver Rhodes, Andrew Rowley, Johannes Schemmel, Tim Stadtmann, Tom Tetzlaff, Gianmarco Tiddia, Sacha J. van Albada, José Villamar, Markus Diesmann | 2025-05-27 | Download | The local circuitry of the mammalian brain is a focus of the search for generic computational principles because it is largely conserved across species and modalities. |
| SHE-LoRA: Selective Homomorphic Encryption for Federated Tuning with Heterogeneous LoRA | Jianmin Liu, Li Yan, Borui Li, Lei Yu, Chao Shen | 2025-05-27 | Download | Federated fine-tuning is critical for improving the performance of large language models (LLMs) in handling domain-specific tasks while keeping training data decentralized and private. |
| A Hitchhiker's Guide to Privacy-Preserving Digital Payment Systems: A Survey on Anonymity, Confidentiality, and Auditability | Matteo Nardelli, Francesco De Sclavis, Michela Iezzi | 2025-05-27 | Download | Crypto-assets and central bank digital currencies (CBDCs) are reshaping how value is exchanged in distributed computing environments. These systems combine cryptographic primitives, protocol design, a... |
| Complexity landscape for local certification | Nicolas Bousquet, Laurent Feuilloley, Sébastien Zeitoun | 2025-05-27 | Download | An impressive recent line of work has charted the complexity landscape of distributed graph algorithms. For many settings, it has been determined which time complexities exist, and which do not (in th... |
| Reduced and mixed precision turbulent flow simulations using explicit finite difference schemes | Bálint Siklósi, Pushpender K. Sharma, David J. Lusher, István Z. Reguly, Neil D. Sandham | 2025-05-27 | Download | The use of reduced and mixed precision computing has gained increasing attention in high-performance computing (HPC) as a means to improve computational efficiency, particularly on modern hardware arc... |
| Load Balancing in Strongly Inhomogeneous Simulations -- a Vlasiator Case Study | Leo Kotipalo, Markus Battarbee, Yann Pfau-Kempf, Vertti Tarvus, Minna Palmroth | 2025-05-27 | Download | Parallelization is a necessity for large-scale simulations due to the amount of data processed. In this article we investigate different load balancing methods using Vlasiator, a global magnetospheric... |
| An Efficient Implementation of Guard-Based Synchronization for an Object-Oriented Programming Language | Shucai Yao, Emil Sekerinski | 2025-05-27 | Download | In the shared variable model of concurrency, guarded atomic actions restrict the possible interference between processes by regions of atomic execution. |
| Choreographies as Macros | Alexander Bohosian, Andrew K. Hirsch | 2025-05-27 | Download | Concurrent programming often entails meticulous pairing of sends and receives between participants to avoid deadlock. Choreographic programming alleviates this burden by specifying the system as a sin... |
| ECC-SNN: Cost-Effective Edge-Cloud Collaboration for Spiking Neural Networks | Di Yu, Changze Lv, Xin Du, Linshan Jiang, Wentao Tong, Zhenyu Liao, Xiaoqing Zheng, Shuiguang Deng | 2025-05-27 | Download | Most edge-cloud collaboration frameworks rely on the substantial computational and storage capabilities of cloud-based artificial neural networks (ANNs). |
| Time-Series Learning for Proactive Fault Prediction in Distributed Systems with Deep Neural Structures | Yang Wang, Wenxuan Zhu, Xuehui Quan, Heyi Wang, Chang Liu, Qiyuan Wu | 2025-05-27 | Download | This paper addresses the challenges of fault prediction and delayed response in distributed systems by proposing an intelligent prediction method based on temporal feature learning. |
| InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling | Xiaoxiao Jiang, Suyi Li, Lingyun Yang, Tianyu Feng, Zhipeng Di, Weiyi Lu, Guoxuan Zhu, Xiu Lin, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang | 2025-05-27 | Download | Generative image editing using diffusion models has become a prevalent application in today's AI cloud services. In production environments, image editing typically involves a mask that specifies the ... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Video Quality Monitoring for Remote Autonomous Vehicle Control | Dimitrios Kafetzis, Nikos Fotiou, Savvas Argyropoulos, Jad Nasreddine, Iordanis Koutsopoulos | 2025-05-27 | Download | The delivery of high-quality, low-latency video streams is critical for remote autonomous vehicle control, where operators must intervene in real time. |
| Video Streaming Over QUIC: A Comprehensive Study | Jashanjot Singh Sidhu, Abdelhak Bentaleb | 2025-05-27 | Download | The QUIC transport protocol represents a significant evolution in web transport technologies, offering improved performance and reduced latency compared to traditional protocols like TCP. |
| Scrapers selectively respect robots.txt directives: evidence from a large-scale empirical study | Taein Kim, Karstan Bock, Claire Luo, Amanda Liswood, Chloe Poroslay, Emily Wenger | 2025-05-27 | Download | Online data scraping has taken on new dimensions in recent years, as traditional scrapers have been joined by new AI-specific bots. To counteract unwanted scraping, many sites use tools like the Robot... |
| Multi-photon QKD for Practical Quantum Networks | Nitin Jha, Abhishek Parakh, Mahadevan Subramaniam | 2025-05-27 | Download | Quantum key distribution (QKD) will most likely be an integral part of any practical quantum network in the future. However, not all QKD protocols can be used in today's networks because of the lack o... |
| A Joint Reconstruction-Triplet Loss Autoencoder Approach Towards Unseen Attack Detection in IoV Networks | Julia Boone, Tolunay Seyfi, Fatemeh Afghah | 2025-05-27 | Download | Internet of Vehicles (IoV) systems, while offering significant advancements in transportation efficiency and safety, introduce substantial security vulnerabilities due to their highly interconnected n... |
| Real-World Deployment of Cloud-based Autonomous Mobility Systems for Outdoor and Indoor Environments | Yufeng Yang, Minghao Ning, Keqi Shu, Aladdin Saleh, Ehsan Hashemi, Amir Khajepour | 2025-05-27 | Download | Autonomous mobility systems increasingly operate in dense and dynamic environments where perception occlusions, limited sensing coverage, and multi-agent interactions pose major challenges. |
| Interference Detection in Spectrum-Blind Multi-User Optical Spectrum as a Service | Agastya Raj, Daniel C. Kilper, Marco Ruffini | 2025-05-27 | Download | With the growing demand for high-bandwidth, low-latency applications, Optical Spectrum as a Service (OSaaS) is of interest for flexible bandwidth allocation within Elastic Optical Networks (EONs) and ... |
| Dynamical ON-OFF Control with Trajectory Prediction for Multi-RIS Wireless Networks | Kaining Wang, Bo Yang, Yusheng Lei, Zhiwen Yu, Xuelin Cao, George C. Alexandropoulos, Marco Di Renzo, Chau Yuen | 2025-05-27 | Download | Reconfigurable intelligent surfaces (RISs) have demonstrated an unparalleled ability to reconfigure wireless environments by dynamically controlling the phase, amplitude, and polarization of impinging... |
| Respond to Change with Constancy: Instruction-tuning with LLM for Non-I.I.D. Network Traffic Classification | Xinjie Lin, Gang Xiong, Gaopeng Gou, Wenqi Dong, Jing Yu, Zhen Li, Wei Xia | 2025-05-27 | Download | Encrypted traffic classification is highly challenging in network security due to the need for extracting robust features from content-agnostic traffic data. |
| Wideband RF Radiance Field Modeling Using Frequency-embedded 3D Gaussian Splatting | Zechen Li, Lanqing Yang, Yiheng Bian, Hao Pan, Yongjian Fu, Yezhou Wang, Zhuxi Chen, Yi-Chao Chen, Guangtao Xue | 2025-05-27 | Download | Indoor environments typically contain diverse RF signals distributed across multiple frequency bands, including NB-IoT, Wi-Fi, and millimeter-wave. |
| Fog Intelligence for Network Anomaly Detection | Kai Yang, Hui Ma, Shaoyu Dou | 2025-05-27 | Download | Anomalies are common in network system monitoring. When manifested as network threats to be mitigated, service outages to be prevented, and security risks to be ameliorated, detecting such anomalous n... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Lazarus Group Targets Crypto-Wallets and Financial Data while employing new Tradecrafts | Alessio Di Santo | 2025-05-27 | Download | This report presents a comprehensive analysis of a malicious software sample, detailing its architecture, behavioral characteristics, and underlying intent. |
| Diagnosing and Resolving Cloud Platform Instability with Multi-modal RAG LLMs | Yifan Wang, Kenneth P. Birman | 2025-05-27 | Download | Today's cloud-hosted applications and services are complex systems, and a performance or functional instability can have dozens or hundreds of potential root causes. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Power-Capping Metric Evaluation for Improving Energy Efficiency in HPC Applications | Maria Patrou, Thomas Wang, Wael Elwasif, Markus Eisenbach, Ross Miller, William Godoy, Oscar Hernandez | 2025-05-27 | Download | With high-performance computing systems now running at exascale, optimizing power-scaling management and resource utilization has become more critical than ever. |
| R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing | Tianyu Fu, Yi Ge, Yichen You, Enshu Liu, Zhihang Yuan, Guohao Dai, Shengen Yan, Huazhong Yang, Yu Wang | 2025-05-27 | Download | Large Language Models (LLMs) achieve impressive reasoning capabilities at the cost of substantial inference overhead, posing substantial deployment challenges. |
| Scheduling with Uncertain Holding Costs and its Application to Content Moderation | Caner Gocmen, Thodoris Lykouris, Deeksha Sinha, Wentao Weng | 2025-05-27 | Download | In content moderation for social media platforms, the cost of delaying the review of a content is proportional to its view trajectory, which fluctuates and is a priori unknown. |
| Constructive community race: full-density spiking neural network model drives neuromorphic computing | Johanna Senk, Anno C. Kurth, Steve Furber, Tobias Gemmeke, Bruno Golosio, Arne Heittmann, James C. Knight, Eric Müller, Tobias Noll, Thomas Nowotny, Gorka Peraza Coppola, Luca Peres, Oliver Rhodes, Andrew Rowley, Johannes Schemmel, Tim Stadtmann, Tom Tetzlaff, Gianmarco Tiddia, Sacha J. van Albada, José Villamar, Markus Diesmann | 2025-05-27 | Download | The local circuitry of the mammalian brain is a focus of the search for generic computational principles because it is largely conserved across species and modalities. |