2025-01-24
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Dynamic Loop Fusion in High-Level Synthesis | Robert Szafarczyk, Syed Waqar Nabi, Wim Vanderbauwhede | 2025-01-24 | Download | Dynamic High-Level Synthesis (HLS) uses additional hardware to perform memory disambiguation at runtime, increasing loop throughput in irregular codes compared to static HLS. |
| Towards a Cryogenic CMOS-Memristor Neural Decoder for Quantum Error Correction | Pierre-Antoine Mouny, Maher Benhouria, Victor Yon, Patrick Dufour, Linxiang Huang, Yann Beilliard, Sophie Rochette, Dominique Drouin, Pooya Ronagh | 2025-01-24 | Download | This paper presents a novel approach utilizing a scalable neural decoder application-specific integrated circuit (ASIC) based on metal oxide memristors in a 180nm CMOS technology. |
| BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference | Van Thien Nguyen, William Guicquero, Gilles Sicard | 2025-01-24 | Download | Long Short-Term Memory (LSTM) and 3D convolution (Conv3D) show impressive results for many video-based applications but require large memory and intensive computing. |
| TCDM Burst Access: Breaking the Bandwidth Barrier in Shared-L1 RVV Clusters Beyond 1000 FPUs | Diyou Shen, Yichao Zhang, Marco Bertuletti, Luca Benini | 2025-01-24 | Download | As computing demand and memory footprint of deep learning applications accelerate, clusters of cores sharing local (L1) multi-banked memory are widely used as key building blocks in large-scale archit... |
| Securing DRAM at Scale: ARFM-Driven Row Hammer Defense with Unveiling the Threat of Short tRC Patterns | Nogeun Joo, Donghyuk Kim, Hyunjun Cho, Junseok Noh, Dongha Jung, Joo-Young Kim | 2025-01-24 | Download | To address the issue of powerful row hammer (RH) attacks, our study involved an extensive analysis of the prevalent attack patterns in the field. |
| Analysis of heralded higher-fidelity two-qubit entangling gates with self-correction | Yuan Sun | 2025-01-24 | Download | For the quantum error correction (QEC) and noisy intermediate-scale quantum (NISQ) algorithms to function with high efficiency, the raw fidelity of quantum logic gates on physical qubits needs to sati... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Pod: An Optimal-Latency, Censorship-Free, and Accountable Generalized Consensus Layer | Orestis Alpos, Bernardo David, Jakov Mitrovski, Odysseas Sofikitis, Dionysis Zindros | 2025-01-24 | Download | This work addresses the inherent issues of high latency in blockchains and low scalability in traditional consensus protocols. We present pod, a novel notion of consensus whose first priority is to ac... |
| Federated Domain Generalization with Data-free On-server Matching Gradient | Trong-Binh Nguyen, Minh-Duong Nguyen, Jinsun Park, Quoc-Viet Pham, Won Joo Hwang | 2025-01-24 | Download | Domain Generalization (DG) aims to learn from multiple known source domains a model that can generalize well to unknown target domains. One of the key approaches in DG is training an encoder which gen... |
| Optimizing Privacy-Utility Trade-off in Decentralized Learning with Generalized Correlated Noise | Angelo Rodio, Zheng Chen, Erik G. Larsson | 2025-01-24 | Download | Decentralized learning enables distributed agents to collaboratively train a shared machine learning model without a central server, through local computation and peer-to-peer communication. |
| A sandbox study proposal for private and distributed health data analysis | Rickard Brännvall, Hanna Svensson, Kannaki Kaliyaperumal, Håkan Burden, Susanne Stenberg | 2025-01-24 | Download | This paper presents a sandbox study proposal focused on the distributed processing of personal health data within the Vinnova-funded SARDIN project. |
| A Survey of Optimization Methods for Training DL Models: Theoretical Perspective on Convergence and Generalization | Jing Wang, Anna Choromanska | 2025-01-24 | Download | As data sets grow in size and complexity, it is becoming more difficult to pull useful features from them using hand-crafted feature extractors. |
| Experimentally Evaluating the Resource Efficiency of Big Data Autoscaling | Jonathan Will, Nico Treide, Lauritz Thamsen, Odej Kao | 2025-01-24 | Download | Distributed dataflow systems like Spark and Flink enable data-parallel processing of large datasets on clusters. Yet, selecting appropriate computational resources for dataflow jobs is often challengi... |
| Data-efficient Performance Modeling via Pre-training | Chunting Liu, Riyadh Baghdadi | 2025-01-24 | Download | Performance models are essential for automatic code optimization, enabling compilers to predict the effects of code transformations on performance and guide search for optimal transformations. |
| Thunderdome: Timelock-Free Rationally-Secure Virtual Channels | Zeta Avarikioti, Yuheng Wang, Yuyi Wang | 2025-01-24 | Download | Payment channel networks (PCNs) offer a promising solution to address the limited transaction throughput of deployed blockchains. However, several attacks have recently been proposed that stress the v... |
| DeepServe: Serverless Large Language Model Serving at Scale | Junhao Hu, Jiang Xu, Zhixia Liu, Yulong He, Yuetao Chen, Hao Xu, Jiang Liu, Jie Meng, Baoquan Zhang, Shining Wan, Gengyuan Dan, Zhiyu Dong, Zhihao Ren, Changhong Liu, Tao Xie, Dayun Lin, Qin Zhang, Yue Yu, Hao Feng, Xusheng Chen, Yizhou Shan | 2025-01-24 | Download | In this paper, we propose DEEPSERVE, a scalable and serverless AI platform designed to efficiently serve large language models (LLMs) at scale in cloud environments. |
| Adaptive Rank Allocation for Federated Parameter-Efficient Fine-Tuning of Language Models | Fei Wu, Jia Hu, Geyong Min, Shiqiang Wang | 2025-01-24 | Download | Pre-trained Language Models (PLMs) have demonstrated their superiority and versatility in modern Natural Language Processing (NLP), effectively adapting to various downstream tasks through further fin... |
| TCDM Burst Access: Breaking the Bandwidth Barrier in Shared-L1 RVV Clusters Beyond 1000 FPUs | Diyou Shen, Yichao Zhang, Marco Bertuletti, Luca Benini | 2025-01-24 | Download | As computing demand and memory footprint of deep learning applications accelerate, clusters of cores sharing local (L1) multi-banked memory are widely used as key building blocks in large-scale archit... |
| STRIELAD -- A Scalable Toolkit for Real-time Interactive Exploration of Large Atmospheric Datasets | Simon Schneegans, Lori Neary, Markus Flatken, Andreas Gerndt | 2025-01-24 | Download | Technological advances in high performance computing and maturing physical models allow scientists to simulate weather and climate evolutions with an increasing accuracy. |
| RadiK: Scalable and Optimized GPU-Parallel Radix Top-K Selection | Yifei Li, Bole Zhou, Jiejing Zhang, Xuechao Wei, Yinghan Li, Yingda Chen | 2025-01-24 | Download | Top-k selection, which identifies the largest or smallest k elements from a data set, is a fundamental operation in data-intensive domains such as databases and deep learning, so its scalability and e... |
| Locality-aware Fair Scheduling in LLM Serving | Shiyi Cao, Yichuan Wang, Ziming Mao, Pin-Lun Hsu, Liangsheng Yin, Tian Xia, Dacheng Li, Shu Liu, Yineng Zhang, Yang Zhou, Ying Sheng, Joseph Gonzalez, Ion Stoica | 2025-01-24 | Download | Large language model (LLM) inference workload dominates a wide variety of modern AI applications, ranging from multi-turn conversation to document analysis. |
| Argos: Agentic Time-Series Anomaly Detection with Autonomous Rule Generation via Large Language Models | Yile Gu, Yifan Xiong, Jonathan Mace, Yuting Jiang, Yigong Hu, Baris Kasikci, Peng Cheng | 2025-01-24 | Download | Observability in cloud infrastructure is critical for service providers, driving the widespread adoption of anomaly detection systems for monitoring metrics. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Enhanced AI as a Service at the Edge via Transformer Network | Vahid Pourakbar, Hamed Shah-Mansouri | 2025-01-24 | Download | Artificial intelligence (AI) has become a pivotal force in reshaping next generation mobile networks. Edge computing holds promise in enabling AI as a service (AIaaS) for prompt decision-making by off... |
| Performance Evaluation of Satellite-Based Data Offloading on Starlink Constellations | Alexander Bonora, Alessandro Traspadini, Marco Giordani, Michele Zorzi | 2025-01-24 | Download | Vehicular Edge Computing (VEC) is a key research area in autonomous driving. As Intelligent Transportation Systems (ITSs) continue to expand, ground vehicles (GVs) face the challenge of handling huge ... |
| An Attentive Graph Agent for Topology-Adaptive Cyber Defence | Ilya Orson Sandoval, Isaac Symes Thompson, Vasilios Mavroudis, Chris Hicks | 2025-01-24 | Download | As cyber threats grow increasingly sophisticated, reinforcement learning (RL) is emerging as a promising technique to create intelligent and adaptive cyber defense systems. |
| COMIX: Generalized Conflict Management in O-RAN xApps -- Architecture, Workflow, and a Power Control case | Anastasios Giannopoulos, Sotirios Spantideas, Levis George, Kalafatelis Alexandros, Panagiotis Trakadas | 2025-01-24 | Download | Open Radio Access Network (O-RAN) is transforming the telecommunications landscape by enabling flexible, intelligent, and multi-vendor networks. |
| 6G Cellular Networks: Mapping the Landscape for the IMT-2030 Framework | Ekram Hossain, Angelo Vera-Rivera | 2025-01-24 | Download | The IMT-2030 framework provides the vision and conceptual foundation for the next-generation of mobile broadband systems, colloquially known as Sixth-Generation (6G) cellular networks. |
| Path Loss Modelling for UAV Communications in Urban Scenarios with Random Obstacles | Abdul Saboor, Zhuangzhuang Cui, Evgenii Vinogradov, Sofie Pollin | 2025-01-24 | Download | Path Loss (PL) is vital to evaluate the performance of Unmanned Aerial Vehicles (UAVs) as Aerial Base Stations (ABSs), particularly in urban environments with complex propagation due to various obstac... |
| Adaptive Rank Allocation for Federated Parameter-Efficient Fine-Tuning of Language Models | Fei Wu, Jia Hu, Geyong Min, Shiqiang Wang | 2025-01-24 | Download | Pre-trained Language Models (PLMs) have demonstrated their superiority and versatility in modern Natural Language Processing (NLP), effectively adapting to various downstream tasks through further fin... |
| Empirical Line-of-Sight Probability Modeling for UAVs in Random Urban Layouts | Abdul Saboor, Zhuangzhuang Cui, Evgenii Vinogradov, Sofie Pollin | 2025-01-24 | Download | Accurate Probability of Line-of-Sight (PLoS) modeling is important in evaluating the performance of Unmanned Aerial Vehicle (UAV)-based communication systems in urban environments, where real-time com... |
| Application-Aware Resource Allocation and Data Management for MEC-assisted IoT Service Providers | Simone Bolettieri, Raffaele Bruno, Enzo Mingozzi | 2025-01-24 | Download | To support the growing demand for data-intensive and low-latency IoT applications, Multi-Access Edge Computing (MEC) is emerging as an effective edge-computing approach enabling the execution of delay... |
| Joint System Latency and Data Freshness Optimization for Cache-enabled Mobile Crowdsensing Networks | Kexin Shi, Yaru Fu, Yongna Guo, Fu Lee Wang, Yan Zhang | 2025-01-24 | Download | Mobile crowdsensing (MCS) networks enable large-scale data collection by leveraging the ubiquity of mobile devices. However, frequent sensing and data transmission can lead to significant resource con... |
| Serving Long-Context LLMs at the Mobile Edge: Test-Time Reinforcement Learning-based Model Caching and Inference Offloading | Minrui Xu, Dusit Niyato, Christopher G. Brinton | 2025-01-24 | Download | Large Language Models (LLMs) can perform zero-shot learning on unseen tasks and few-shot learning on complex reasoning tasks. However, resource-limited mobile edge networks struggle to support long-co... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Profiling Apple Silicon Performance for ML Training | Dahua Feng, Zhiming Xu, Rongxiang Wang, Felix Xiaozhu Lin | 2025-01-24 | Download | Apple Silicon has attracted much attention for its performance and role in machine learning (ML) training. Unlike NVIDIA GPUs, which have traditionally dominated ML training, Apple Silicon has a signi... |