2025-08-07
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Accelerating Data Chunking in Deduplication Systems using Vector Instructions | Sreeharsha Udayashankar, Abdelrahman Baba, Samer Al-Kiswany | 2025-08-07 | Download | Content-defined Chunking (CDC) algorithms dictate the overall space savings that deduplication systems achieve. However, due to their need to scan each file in its entirety, they are slow and often th... |
| ConiQ: Enabling Concatenated Quantum Error Correction on Neutral Atom Arrays | Pengyu Liu, Mingkuan Xu, Hengyun Zhou, Hanrui Wang, Umut A. Acar, Yunong Shi | 2025-08-07 | Download | Recent progress on concatenated codes, especially many-hypercube codes, achieves unprecedented space efficiency. Yet two critical challenges persist in practice. |
| relOBI: A Reliable Low-latency Interconnect for Tightly-Coupled On-chip Communication | Michael Rogenmoser, Angelo Garofalo, Luca Benini | 2025-08-07 | Download | On-chip communication is a critical element of modern systems-on-chip (SoCs), allowing processor cores to interact with memory and peripherals. |
| Understanding and Mitigating Errors of LLM-Generated RTL Code | Jiazheng Zhang, Cheng Liu, Long Cheng, Xiaowei Li, Huawei Li | 2025-08-07 | Download | Despite limited success in large language model (LLM)-based register-transfer-level (RTL) code generation, the root causes of errors remain poorly understood. |
| TMA-Adaptive FP8 Grouped GEMM: Eliminating Padding Requirements in Low-Precision Training and Inference on Hopper | Zhongling Su, Rong Fu, Weihan Cao, Jianfei Gao, Minxi Jin, Zhilin Pei, Hui Wang | 2025-08-07 | Download | Current FP8 grouped GEMM implementations require padding each group to a fixed alignment (e.g., 128), incurring memory and computational overhead. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Snowpark: Performant, Secure, User-Friendly Data Engineering and AI/ML Next To Your Data | Brandon Baker, Elliott Brossard, Chenwei Xie, Zihao Ye, Deen Liu, Yijun Xie, Arthur Zwiegincew, Nitya Kumar Sharma, Gaurav Jain, Eugene Retunsky, Mike Halcrow, Derek Denny-Brown, Istvan Cseri, Tyler Akidau, Yuxiong He | 2025-08-07 | Download | Snowflake revolutionized data analytics with an elastic architecture that decouples compute and storage, enabling scalable solutions supporting data architectures like data lake, data warehouse, data ... |
| A Dynamic Approach to Load Balancing in Cloud Infrastructure: Enhancing Energy Efficiency and Resource Utilization | Shadman Sakib, Ajay Katangur, Rahul Dubey | 2025-08-07 | Download | Cloud computing has grown rapidly in recent years, mainly due to the sharp increase in data transferred over the internet. This growth makes load balancing a key part of cloud systems, as it helps dis... |
| Accelerating Data Chunking in Deduplication Systems using Vector Instructions | Sreeharsha Udayashankar, Abdelrahman Baba, Samer Al-Kiswany | 2025-08-07 | Download | Content-defined Chunking (CDC) algorithms dictate the overall space savings that deduplication systems achieve. However, due to their need to scan each file in its entirety, they are slow and often th... |
| X-VFL: A New Vertical Federated Learning Framework with Cross Completion and Decision Subspace Alignment | Qinghua Yao, Xiangrui Xu, Zhize Li | 2025-08-07 | Download | Vertical Federated Learning (VFL) enables collaborative learning by integrating disjoint feature subsets from multiple clients/parties. However, VFL typically faces two key challenges: i) the requirem... |
| Modular Architecture for High-Performance and Low Overhead Data Transfers | Rasman Mubtasim Swargo, Engin Arslan, Md Arifuzzaman | 2025-08-07 | Download | High-performance applications necessitate rapid and dependable transfer of massive datasets across geographically dispersed locations. Traditional file transfer tools often suffer from resource underu... |
| Adaptive Parallel Downloader for Large Genomic Datasets | Rasman Mubtasim Swargo, Engin Arslan, Md Arifuzzaman | 2025-08-07 | Download | Modern next-generation sequencing (NGS) projects routinely generate terabytes of data, which researchers commonly download from public repositories such as SRA or ENA. |
| A Feature Engineering Approach for Business Impact-Oriented Failure Detection in Distributed Instant Payment Systems | Lorenzo Porcelli | 2025-08-07 | Download | Instant payment infrastructures have stringent performance requirements, processing millions of transactions daily with zero-downtime expectations. |
| Simulating LLM training workloads for heterogeneous compute and network infrastructure | Sumit Kumar, Arjun Temura, Naman Sharma, Ramanjeet Singh, Meet Dadhania, Praveen Tammana, Satananda Burla, Abed Mohammad Kamaluddin, Rinku Shah | 2025-08-07 | Download | The growing demand for large-scale GPU clusters in distributed model training presents a significant barrier to innovation, particularly in model optimization, performance tuning, and system-level enh... |
| HFedATM: Hierarchical Federated Domain Generalization via Optimal Transport and Regularized Mean Aggregation | Thinh Nguyen, Trung Phan, Binh T. Nguyen, Khoa D Doan, Kok-Seng Wong | 2025-08-07 | Download | Federated Learning (FL) is a decentralized approach where multiple clients collaboratively train a shared global model without sharing their raw data. |
| Theseus: A Distributed and Scalable GPU-Accelerated Query Processing Platform Optimized for Efficient Data Movement | Felipe Aramburú, William Malpica, Kaouther Abrougui, Amin Aramoon, Romulo Auccapuclla, Claude Brisson, Matthijs Brobbel, Colby Farrell, Pradeep Garigipati, Joost Hoozemans, Supun Kamburugamuve, Akhil Nair, Alexander Ocsa, Johan Peltenburg, Rubén Quesada López, Deepak Sihag, Ahmet Uyar, Dhruv Vats, Michael Wendt, Jignesh M. Patel, Rodrigo Aramburú | 2025-08-07 | Download | Online analytical processing of queries on datasets in the many-terabyte range is only possible with costly distributed computing systems. To decrease the cost and increase the throughput, systems can... |
| Task-Based Programming for Adaptive Mesh Refinement in Compressible Flow Simulations | Anjiang Wei, Hang Song, Mert Hidayetoglu, Elliott Slaughter, Sanjiva K. Lele, Alex Aiken | 2025-08-07 | Download | High-order solvers for compressible flows are vital in scientific applications. Adaptive mesh refinement (AMR) is a key technique for reducing computational cost by concentrating resolution in regions... |
| Tesserae: Scalable Placement Policies for Deep Learning Workloads | Song Bian, Saurabh Agarwal, Md. Tareq Mahmood, Shivaram Venkataraman | 2025-08-07 | Download | Training deep learning (DL) models has become a dominant workload in data-centers and improving resource utilization is a key goal of DL cluster schedulers. |
| Managing, Analyzing and Sharing Research Data with Gen3 Data Commons | Craig Barnes, Kyle Burton, Michael S. Fitzsimons, Hara Prasad Juvvala, Brienna Larrick, Christopher Meyer, Pauline Ribeyre, Ao Liu, Clint Malson, Noah Metoki-Shlubsky, Andrii Prokhorenkov, Jawad Qureshi, Radhika Reddy, L. Philip Schumm, Mingfei Shao, Trevar Simmons, Alexander VanTol, Peter Vassilatos, Aarti Venkat, Robert L. Grossman | 2025-08-07 | Download | Gen3 is an open-source data platform for building data commons. A data commons is a cloud-based data platform for managing, analyzing, and sharing data with a research community. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| HiSTM: Hierarchical Spatiotemporal Mamba for Cellular Traffic Forecasting | Zineddine Bettouche, Khalid Ali, Andreas Fischer, Andreas Kassler | 2025-08-07 | Download | Cellular traffic forecasting is essential for network planning, resource allocation, or load-balancing traffic across cells. However, accurate forecasting is difficult due to intricate spatial and tem... |
| Modular Design and Experimental Evaluation of 5G Mobile Cell Architectures Based on Overlay and Integrated Models | José Ruela, Ivan Cojocaru, André Coelho, Rui Campos, Manuel Ricardo | 2025-08-07 | Download | This paper presents the concept, architectural design, and performance evaluation of a 5G Mobile Cell (MC) used to provide 5G wireless connectivity to User Equipment (UE) in areas with limited fixed 5... |
| TeraRIS NOMA-MIMO Communications for 6G and Beyond Industrial Networks | Ali Raza, Muhammad Farhan Khan, Zeeshan Alam, Muhammad Saad, Ilyas Saleem, Muhammad Ahmed Mohsin, Muhammad Ali Jamshed | 2025-08-07 | Download | This paper presents a joint framework that integrates reconfigurable intelligent surfaces (RISs) with Terahertz (THz) communications and non-orthogonal multiple access (NOMA) to enhance smart industri... |
| A Design for an Early Quantum Network | Yuan Li, Chen Zhang, Hao Zhang, Tao Huang, Yunjie Liu | 2025-08-07 | Download | With the rapid advancement of quantum information technology, quantum networks have become essential for supporting diverse applications, which often have stringent demands for key metrics such as fid... |
| Camel: Energy-Aware LLM Inference on Resource-Constrained Devices | Hao Xu, Long Peng, Shezheng Song, Xiaodong Liu, Ma Jun, Shasha Li, Jie Yu, Xiaoguang Mao | 2025-08-07 | Download | Most Large Language Models (LLMs) are currently deployed in the cloud, with users relying on internet connectivity for access. However, this paradigm faces challenges such as network latency, privacy ... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Back to Bits: Extending Shannon's communication performance framework to computing | Max Hawkins, Richard Vuduc | 2025-08-07 | Download | This work proposes a novel computing performance unit grounded in information theory. Modern computing systems are increasingly diverse, supporting low-precision formats, hardware specialization, and ... |
| Dancing with a Robot: An Experimental Study of Child-Robot Interaction in a Performative Art Setting | Victor Ngo, Rachel Ramchurn, Roma Patel, Alan Chamberlain, Ayse Kucukyilmaz | 2025-08-07 | Download | This paper presents an evaluation of 18 children's in-the-wild experiences with the autonomous robot arm performer NED (Never-Ending Dancer) within the Thingamabobas installation, showcased across the... |
| CRAM: Large-scale Video Continual Learning with Bootstrapped Compression | Shivani Mall, Joao F. Henriques | 2025-08-07 | Download | Continual learning (CL) promises to allow neural networks to learn from continuous streams of inputs, instead of IID (independent and identically distributed) sampling, which requires random access to... |