2025-01-24

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Dynamic Loop Fusion in High-Level Synthesis	Robert Szafarczyk, Syed Waqar Nabi, Wim Vanderbauwhede	2025-01-24	Download	Dynamic High-Level Synthesis (HLS) uses additional hardware to perform memory disambiguation at runtime, increasing loop throughput in irregular codes compared to static HLS.
Towards a Cryogenic CMOS-Memristor Neural Decoder for Quantum Error Correction	Pierre-Antoine Mouny, Maher Benhouria, Victor Yon, Patrick Dufour, Linxiang Huang, Yann Beilliard, Sophie Rochette, Dominique Drouin, Pooya Ronagh	2025-01-24	Download	This paper presents a novel approach utilizing a scalable neural decoder application-specific integrated circuit (ASIC) based on metal oxide memristors in a 180nm CMOS technology.
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference	Van Thien Nguyen, William Guicquero, Gilles Sicard	2025-01-24	Download	Long Short-Term Memory (LSTM) and 3D convolution (Conv3D) show impressive results for many video-based applications but require large memory and intensive computing.
TCDM Burst Access: Breaking the Bandwidth Barrier in Shared-L1 RVV Clusters Beyond 1000 FPUs	Diyou Shen, Yichao Zhang, Marco Bertuletti, Luca Benini	2025-01-24	Download	As computing demand and memory footprint of deep learning applications accelerate, clusters of cores sharing local (L1) multi-banked memory are widely used as key building blocks in large-scale archit...
Securing DRAM at Scale: ARFM-Driven Row Hammer Defense with Unveiling the Threat of Short tRC Patterns	Nogeun Joo, Donghyuk Kim, Hyunjun Cho, Junseok Noh, Dongha Jung, Joo-Young Kim	2025-01-24	Download	To address the issue of powerful row hammer (RH) attacks, our study involved an extensive analysis of the prevalent attack patterns in the field.
Analysis of heralded higher-fidelity two-qubit entangling gates with self-correction	Yuan Sun	2025-01-24	Download	For the quantum error correction (QEC) and noisy intermediate-scale quantum (NISQ) algorithms to function with high efficiency, the raw fidelity of quantum logic gates on physical qubits needs to sati...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Pod: An Optimal-Latency, Censorship-Free, and Accountable Generalized Consensus Layer	Orestis Alpos, Bernardo David, Jakov Mitrovski, Odysseas Sofikitis, Dionysis Zindros	2025-01-24	Download	This work addresses the inherent issues of high latency in blockchains and low scalability in traditional consensus protocols. We present pod, a novel notion of consensus whose first priority is to ac...
Federated Domain Generalization with Data-free On-server Matching Gradient	Trong-Binh Nguyen, Minh-Duong Nguyen, Jinsun Park, Quoc-Viet Pham, Won Joo Hwang	2025-01-24	Download	Domain Generalization (DG) aims to learn from multiple known source domains a model that can generalize well to unknown target domains. One of the key approaches in DG is training an encoder which gen...
Optimizing Privacy-Utility Trade-off in Decentralized Learning with Generalized Correlated Noise	Angelo Rodio, Zheng Chen, Erik G. Larsson	2025-01-24	Download	Decentralized learning enables distributed agents to collaboratively train a shared machine learning model without a central server, through local computation and peer-to-peer communication.
A sandbox study proposal for private and distributed health data analysis	Rickard Brännvall, Hanna Svensson, Kannaki Kaliyaperumal, Håkan Burden, Susanne Stenberg	2025-01-24	Download	This paper presents a sandbox study proposal focused on the distributed processing of personal health data within the Vinnova-funded SARDIN project.
A Survey of Optimization Methods for Training DL Models: Theoretical Perspective on Convergence and Generalization	Jing Wang, Anna Choromanska	2025-01-24	Download	As data sets grow in size and complexity, it is becoming more difficult to pull useful features from them using hand-crafted feature extractors.
Experimentally Evaluating the Resource Efficiency of Big Data Autoscaling	Jonathan Will, Nico Treide, Lauritz Thamsen, Odej Kao	2025-01-24	Download	Distributed dataflow systems like Spark and Flink enable data-parallel processing of large datasets on clusters. Yet, selecting appropriate computational resources for dataflow jobs is often challengi...
Data-efficient Performance Modeling via Pre-training	Chunting Liu, Riyadh Baghdadi	2025-01-24	Download	Performance models are essential for automatic code optimization, enabling compilers to predict the effects of code transformations on performance and guide search for optimal transformations.
Thunderdome: Timelock-Free Rationally-Secure Virtual Channels	Zeta Avarikioti, Yuheng Wang, Yuyi Wang	2025-01-24	Download	Payment channel networks (PCNs) offer a promising solution to address the limited transaction throughput of deployed blockchains. However, several attacks have recently been proposed that stress the v...
DeepServe: Serverless Large Language Model Serving at Scale	Junhao Hu, Jiang Xu, Zhixia Liu, Yulong He, Yuetao Chen, Hao Xu, Jiang Liu, Jie Meng, Baoquan Zhang, Shining Wan, Gengyuan Dan, Zhiyu Dong, Zhihao Ren, Changhong Liu, Tao Xie, Dayun Lin, Qin Zhang, Yue Yu, Hao Feng, Xusheng Chen, Yizhou Shan	2025-01-24	Download	In this paper, we propose DEEPSERVE, a scalable and serverless AI platform designed to efficiently serve large language models (LLMs) at scale in cloud environments.
Adaptive Rank Allocation for Federated Parameter-Efficient Fine-Tuning of Language Models	Fei Wu, Jia Hu, Geyong Min, Shiqiang Wang	2025-01-24	Download	Pre-trained Language Models (PLMs) have demonstrated their superiority and versatility in modern Natural Language Processing (NLP), effectively adapting to various downstream tasks through further fin...
TCDM Burst Access: Breaking the Bandwidth Barrier in Shared-L1 RVV Clusters Beyond 1000 FPUs	Diyou Shen, Yichao Zhang, Marco Bertuletti, Luca Benini	2025-01-24	Download	As computing demand and memory footprint of deep learning applications accelerate, clusters of cores sharing local (L1) multi-banked memory are widely used as key building blocks in large-scale archit...
STRIELAD -- A Scalable Toolkit for Real-time Interactive Exploration of Large Atmospheric Datasets	Simon Schneegans, Lori Neary, Markus Flatken, Andreas Gerndt	2025-01-24	Download	Technological advances in high performance computing and maturing physical models allow scientists to simulate weather and climate evolutions with an increasing accuracy.
RadiK: Scalable and Optimized GPU-Parallel Radix Top-K Selection	Yifei Li, Bole Zhou, Jiejing Zhang, Xuechao Wei, Yinghan Li, Yingda Chen	2025-01-24	Download	Top-k selection, which identifies the largest or smallest k elements from a data set, is a fundamental operation in data-intensive domains such as databases and deep learning, so its scalability and e...
Locality-aware Fair Scheduling in LLM Serving	Shiyi Cao, Yichuan Wang, Ziming Mao, Pin-Lun Hsu, Liangsheng Yin, Tian Xia, Dacheng Li, Shu Liu, Yineng Zhang, Yang Zhou, Ying Sheng, Joseph Gonzalez, Ion Stoica	2025-01-24	Download	Large language model (LLM) inference workload dominates a wide variety of modern AI applications, ranging from multi-turn conversation to document analysis.
Argos: Agentic Time-Series Anomaly Detection with Autonomous Rule Generation via Large Language Models	Yile Gu, Yifan Xiong, Jonathan Mace, Yuting Jiang, Yigong Hu, Baris Kasikci, Peng Cheng	2025-01-24	Download	Observability in cloud infrastructure is critical for service providers, driving the widespread adoption of anomaly detection systems for monitoring metrics.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Enhanced AI as a Service at the Edge via Transformer Network	Vahid Pourakbar, Hamed Shah-Mansouri	2025-01-24	Download	Artificial intelligence (AI) has become a pivotal force in reshaping next generation mobile networks. Edge computing holds promise in enabling AI as a service (AIaaS) for prompt decision-making by off...
Performance Evaluation of Satellite-Based Data Offloading on Starlink Constellations	Alexander Bonora, Alessandro Traspadini, Marco Giordani, Michele Zorzi	2025-01-24	Download	Vehicular Edge Computing (VEC) is a key research area in autonomous driving. As Intelligent Transportation Systems (ITSs) continue to expand, ground vehicles (GVs) face the challenge of handling huge ...
An Attentive Graph Agent for Topology-Adaptive Cyber Defence	Ilya Orson Sandoval, Isaac Symes Thompson, Vasilios Mavroudis, Chris Hicks	2025-01-24	Download	As cyber threats grow increasingly sophisticated, reinforcement learning (RL) is emerging as a promising technique to create intelligent and adaptive cyber defense systems.
COMIX: Generalized Conflict Management in O-RAN xApps -- Architecture, Workflow, and a Power Control case	Anastasios Giannopoulos, Sotirios Spantideas, Levis George, Kalafatelis Alexandros, Panagiotis Trakadas	2025-01-24	Download	Open Radio Access Network (O-RAN) is transforming the telecommunications landscape by enabling flexible, intelligent, and multi-vendor networks.
6G Cellular Networks: Mapping the Landscape for the IMT-2030 Framework	Ekram Hossain, Angelo Vera-Rivera	2025-01-24	Download	The IMT-2030 framework provides the vision and conceptual foundation for the next-generation of mobile broadband systems, colloquially known as Sixth-Generation (6G) cellular networks.
Path Loss Modelling for UAV Communications in Urban Scenarios with Random Obstacles	Abdul Saboor, Zhuangzhuang Cui, Evgenii Vinogradov, Sofie Pollin	2025-01-24	Download	Path Loss (PL) is vital to evaluate the performance of Unmanned Aerial Vehicles (UAVs) as Aerial Base Stations (ABSs), particularly in urban environments with complex propagation due to various obstac...
Adaptive Rank Allocation for Federated Parameter-Efficient Fine-Tuning of Language Models	Fei Wu, Jia Hu, Geyong Min, Shiqiang Wang	2025-01-24	Download	Pre-trained Language Models (PLMs) have demonstrated their superiority and versatility in modern Natural Language Processing (NLP), effectively adapting to various downstream tasks through further fin...
Empirical Line-of-Sight Probability Modeling for UAVs in Random Urban Layouts	Abdul Saboor, Zhuangzhuang Cui, Evgenii Vinogradov, Sofie Pollin	2025-01-24	Download	Accurate Probability of Line-of-Sight (PLoS) modeling is important in evaluating the performance of Unmanned Aerial Vehicle (UAV)-based communication systems in urban environments, where real-time com...
Application-Aware Resource Allocation and Data Management for MEC-assisted IoT Service Providers	Simone Bolettieri, Raffaele Bruno, Enzo Mingozzi	2025-01-24	Download	To support the growing demand for data-intensive and low-latency IoT applications, Multi-Access Edge Computing (MEC) is emerging as an effective edge-computing approach enabling the execution of delay...
Joint System Latency and Data Freshness Optimization for Cache-enabled Mobile Crowdsensing Networks	Kexin Shi, Yaru Fu, Yongna Guo, Fu Lee Wang, Yan Zhang	2025-01-24	Download	Mobile crowdsensing (MCS) networks enable large-scale data collection by leveraging the ubiquity of mobile devices. However, frequent sensing and data transmission can lead to significant resource con...
Serving Long-Context LLMs at the Mobile Edge: Test-Time Reinforcement Learning-based Model Caching and Inference Offloading	Minrui Xu, Dusit Niyato, Christopher G. Brinton	2025-01-24	Download	Large Language Models (LLMs) can perform zero-shot learning on unseen tasks and few-shot learning on complex reasoning tasks. However, resource-limited mobile edge networks struggle to support long-co...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Profiling Apple Silicon Performance for ML Training	Dahua Feng, Zhiming Xu, Rongxiang Wang, Felix Xiaozhu Lin	2025-01-24	Download	Apple Silicon has attracted much attention for its performance and role in machine learning (ML) training. Unlike NVIDIA GPUs, which have traditionally dominated ML training, Apple Silicon has a signi...