2025-02-27
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Wildcat: Educational RISC-V Microprocessors | Martin Schoeberl | 2025-02-27 | Download | In computer architecture courses, we usually teach RISC processors using a five-stage pipeline, neglecting alternative organizations. This design choice, rooted in the 1980s technology, may not be opt... |
| HaLoRA: Hardware-aware Low-Rank Adaptation for Large Language Models Based on Hybrid Compute-in-Memory Architecture | Taiqiang Wu, Chenchen Ding, Wenyong Zhou, Yuxin Cheng, Xincheng Feng, Shuqi Wang, Wendong Xu, Chufan Shi, Zhengwu Liu, Ngai Wong | 2025-02-27 | Download | Low-rank adaptation (LoRA) is a predominant parameter-efficient finetuning method for adapting large language models (LLMs) to downstream tasks. |
| HALO: Hardware-aware quantization with low critical-path-delay weights for LLM acceleration | Rohan Juneja, Shivam Aggarwal, Safeen Huda, Tulika Mitra, Li-Shiuan Peh | 2025-02-27 | Download | Quantization is critical for efficiently deploying large language models (LLMs). Yet conventional methods remain hardware-agnostic, limited to bit-width constraints, and do not account for intrinsic c... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Communication-Efficient and Differentially Private Vertical Federated Learning with Zeroth-Order Optimization | Jianing Zhang, Evan Chen, Dong-Jun Han, Chaoyue Liu, Christopher G. Brinton | 2025-02-27 | Download | Vertical Federated Learning (VFL) enables collaborative model training across feature-partitioned devices, yet its reliance on device-server information exchange introduces significant communication o... |
| Building a Theory of Distributed Systems: Work by Nancy Lynch and Collaborators | Nancy Lynch | 2025-02-27 | Download | In this manuscript I overview my work on developing a Theory for Distributed Systems -- work that has involved many students and other collaborators. |
| Improving the Efficiency of a Deep Reinforcement Learning-Based Power Management System for HPC Clusters Using Curriculum Learning | Thomas Budiarjo, Santana Yuda Pradata, Kadek Gemilang Santiyuda, Muhammad Alfian Amrizal, Reza Pulungan, Hiroyuki Takizawa | 2025-02-27 | Download | High energy consumption remains a key challenge in high-performance computing (HPC) systems, which often feature hundreds or thousands of nodes drawing substantial power even in idle or standby modes. |
| Methodology for GPU Frequency Switching Latency Measurement | Daniel Velicka, Ondrej Vysocky, Lubomir Riha | 2025-02-27 | Download | The development of exascale and post-exascale HPC and AI systems integrates thousands of CPUs and specialized accelerators, making energy optimization critical as power costs rival hardware expenses. |
| Large-Scale Simulations of Fully Resolved Complex Moving Geometries with Partially Saturated Cells | P. Suffa, S. Kemmler, H. Koestler, U. Ruede | 2025-02-27 | Download | We employ the Partially Saturated Cells Method (PSM) to model the interaction between the fluid flow and solid moving objects as an extension to the conventional lattice Boltzmann method. |
| SkipPipe: Partial and Reordered Pipelining Framework for Training LLMs in Heterogeneous Networks | Nikolay Blagoev, Lydia Yiyu Chen, Oğuzhan Ersoy | 2025-02-27 | Download | Data and pipeline parallelism are ubiquitous for training of Large Language Models (LLM) on distributed nodes. Driven by the need for cost-effective training, recent work explores efficient communicat... |
| RingAda: Pipelining Large Model Fine-Tuning on Edge Devices with Scheduled Layer Unfreezing | Liang Li, Xiaopei Chen, Wen Wu | 2025-02-27 | Download | To enable large model (LM) based edge intelligent service provisioning, on-device fine-tuning with locally personalized data allows for continuous and privacy-preserving LM customization. |
| Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts | Shulai Zhang, Ningxin Zheng, Haibin Lin, Ziheng Jiang, Wenlei Bao, Chengquan Jiang, Qi Hou, Weihao Cui, Size Zheng, Li-Wen Chang, Quan Chen, Xin Liu | 2025-02-27 | Download | Mixture-of-experts (MoE) has been extensively employed to scale large language models to trillion-plus parameters while maintaining a fixed computational cost. |
| Static task mapping for heterogeneous systems based on series-parallel decompositions | Martin Wilhelm, Thilo Pionteck | 2025-02-27 | Download | Modern heterogeneous systems consist of many different processing units, such as CPUs, GPUs, FPGAs and AI units. A central problem in the design of applications in this environment is to find a benefi... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Scalable Coordinated Learning for H2M/R Applications over Optical Access Networks (Invited) | Sourav Mondal, Elaine Wong | 2025-02-27 | Download | One of the primary research interests adhering to next-generation fiber-wireless access networks is human-to-machine/robot (H2M/R) collaborative communications facilitating Industry 5.0. |
| Robust Multicast Origin Authentication in MACsec and CANsec for Automotive Scenarios | Gianluca Cena, Lucia Seno, Stefano Scanzio | 2025-02-27 | Download | Having everything interconnected through the Internet, including vehicle onboard systems, is making security a primary concern in the automotive domain as well. |
| Data Taxonomy Towards the Applicability of the Digital Twin Conceptual Framework in Disaster Management | Eva Brucherseifer, Marco Marquard, Martin Hellmann, Andrea Tundis | 2025-02-27 | Download | The Digital Twin (DT) offers a novel approach to the management of critical infrastructures, including energy, water, traffic, public health, and communication systems, which are indispensable for the... |
| ACCORD: Application Context-aware Cross-layer Optimization and Resource Design for 5G/NextG Machine-centric Applications | Azuka Chiejina, Subhramoy Mohanti, Vijay K. Shah | 2025-02-27 | Download | Recent advancements in AI and edge computing have accelerated the development of machine-centric applications (MCAs), such as smart surveillance systems. |
| Pricing for Routing and Flow-Control in Payment Channel Networks | Suryanarayana Sankagiri, Bruce Hajek | 2025-02-27 | Download | A payment channel network is a blockchain-based overlay mechanism that allows parties to transact more efficiently than directly using the blockchain. |
| Energy consumption of smartphones and IoT devices when using different versions of the HTTP protocol | Chiara Caiazza, Valerio Luconi, Alessio Vecchio | 2025-02-27 | Download | HTTP is frequently used by smartphones and IoT devices to access information and Web services. Nowadays, HTTP is used in three major versions, each introducing significant changes with respect to the ... |
| Harmonious Coexistence between Aloha and CSMA: Novel Dual-channel Modeling and Throughput Optimization | Wenhai Lin, Xinghua Sun, Anshan Yuan, Yayu Gao | 2025-02-27 | Download | The scarcity of the licensed spectrum is forcing emerging Internet of Things (IoT) networks to operate within the unlicensed spectrum. Yet there has been extensive observation indicating that performa... |
| AutoBS: Autonomous Base Station Deployment with Reinforcement Learning and Digital Network Twins | Ju-Hyung Lee, Andreas F. Molisch | 2025-02-27 | Download | This paper introduces AutoBS, a reinforcement learning (RL)-based framework for optimal base station (BS) deployment in 6G radio access networks (RAN). |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Entanglement buffering with multiple quantum memories | Álvaro G. Iñesta, Bethany Davies, Sounak Kar, Stephanie Wehner | 2025-02-27 | Download | Entanglement buffers are systems that maintain high-quality entanglement, ensuring it is readily available for consumption when needed. In this work, we study the performance of a two-node buffer, whe... |
| A high-performance and portable implementation of the SISSO method for CPUs and GPUs | Sebastian Eibl, Yi Yao, Matthias Scheffler, Markus Rampp, Luca M. Ghiringhelli, Thomas A. R. Purcell | 2025-02-27 | Download | SISSO (sure-independence screening and sparsifying operator) is an artificial intelligence (AI) method based on symbolic regression and compressed sensing widely used in materials science research. |