2025-01-21

cs.AR - Hardware Architecture

Title	Authors	Published	PDF	Abstract
Analysis of a Memcapacitor-Based for Neural Network Accelerator Framework	Ankur Singh, Dowon Kim, Byung-Geun Lee	2025-01-21	Download	Data-intensive computing tasks, such as training neural networks, are crucial for artificial intelligence applications but often come with high energy demands.
Pushing the Limits of BFP on Narrow Precision LLM Inference	Hui Wang, Yuan Cheng, Xiaomeng Han, Zhengpeng Zhao, Dawei Yang, Zhe Jiang	2025-01-21	Download	The substantial computational and memory demands of Large Language Models (LLMs) hinder their deployment. Block Floating Point (BFP) has proven effective in accelerating linear operations, a cornersto...
Dissecting the NVIDIA Hopper Architecture through Microbenchmarking and Multiple Level Analysis	Weile Luo, Ruibo Fan, Zeyu Li, Dayou Du, Hongyuan Liu, Qiang Wang, Xiaowen Chu	2025-01-21	Download	This study presents a comprehensive multi-level analysis of the NVIDIA Hopper GPU architecture, focusing on its performance characteristics and novel features.
Accelerating Recommender Model ETL with a Streaming FPGA-GPU Dataflow	Yu Zhu, Wenqi Jiang, Piyumi Jasin Pathiranage, Yongjun He, Gustavo Alonso	2025-01-21	Download	The real-time performance of recommender models depends on the continuous integration of massive volumes of new user interaction data into training pipelines.
A Fully Pipelined FIFO Based Polynomial Multiplication Hardware Architecture Based On Number Theoretic Transform	Moslem Heidarpur, Mitra Mirhassani, Norman Chang	2025-01-21	Download	This paper presents digital hardware for computing polynomial multiplication using Number Theoretic Transform (NTT), specifically designed for implementation on Field Programmable Gate Arrays (FPGAs).
Supervised Learning for Analog and RF Circuit Design: Benchmarks and Comparative Insights	Asal Mehradfar, Xuzhe Zhao, Yue Niu, Sara Babakniya, Mahdi Alesheikh, Hamidreza Aghasi, Salman Avestimehr	2025-01-21	Download	Automating analog and radio-frequency (RF) circuit design using machine learning (ML) significantly reduces the time and effort required for parameter optimization.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
An Empirical Characterization of Outages and Incidents in Public Services for Large Language Models	Xiaoyu Chu, Sacheendra Talluri, Qingxian Lu, Alexandru Iosup	2025-01-21	Download	People and businesses increasingly rely on public LLM services, such as ChatGPT, DALLE, and Claude. Understanding their outages, and particularly measuring their failure-recovery processes, is becomin...
More for Less: Integrating Capability-Predominant and Capacity-Predominant Computing	Zhong Zheng, Michael E. Papka, Zhiling Lan	2025-01-21	Download	Capability jobs (e.g., large, long-running tasks) and capacity jobs (e.g., small, short-running tasks) are two common types of workloads in high-performance computing (HPC).
CYCle: Choosing Your Collaborators Wisely to Enhance Collaborative Fairness in Decentralized Learning	Nurbek Tastan, Samuel Horvath, Karthik Nandakumar	2025-01-21	Download	Collaborative learning (CL) enables multiple participants to jointly train machine learning (ML) models on decentralized data sources without raw data sharing.
Empower Healthcare through a Self-Sovereign Identity Infrastructure for Secure Electronic Health Data Access	Antonio López Martínez, Montassar Naghmouchi, Maryline Laurent, Joaquin Garcia-Alfaro, Manuel Gil Pérez, Antonio Ruiz Martínez, Pantaleone Nespoli	2025-01-21	Download	Health data is one of the most sensitive data for people, which attracts the attention of malicious activities. We propose an open-source health data management framework, that follows a patient-centr...
Quantifying Energy and Cost Benefits of Hybrid Edge Cloud: Analysis of Traditional and Agentic Workloads	Siavash Alamouti	2025-01-21	Download	This paper examines the workload distribution challenges in centralized cloud systems and demonstrates how Hybrid Edge Cloud (HEC) [1] mitigates these inefficiencies.
AdaServe: Accelerating Multi-SLO LLM Serving with SLO-Customized Speculative Decoding	Zikun Li, Zhuofu Chen, Remi Delacourt, Gabriele Oliaro, Zeyu Wang, Qinghan Chen, Shuhuai Lin, April Yang, Zhihao Zhang, Zhuoming Chen, Sean Lai, Xinhao Cheng, Xupeng Miao, Zhihao Jia	2025-01-21	Download	Modern large language model (LLM) applications exhibit diverse service-level objectives (SLOs), from low-latency requirements in interactive coding assistants to more relaxed constraints in data wrang...
BotDetect: A Decentralized Federated Learning Framework for Detecting Financial Bots on the EVM Blockchains	Ahmed Mounsf Rafik Bendada, Abdelaziz Amara Korba, Mouhamed Amine Bouchiha, Yacine Ghamri-Doudane	2025-01-21	Download	The rapid growth of decentralized finance (DeFi) has led to the widespread use of automated agents, or bots, within blockchain ecosystems like Ethereum, Binance Smart Chain, and Solana.
Dissecting the NVIDIA Hopper Architecture through Microbenchmarking and Multiple Level Analysis	Weile Luo, Ruibo Fan, Zeyu Li, Dayou Du, Hongyuan Liu, Qiang Wang, Xiaowen Chu	2025-01-21	Download	This study presents a comprehensive multi-level analysis of the NVIDIA Hopper GPU architecture, focusing on its performance characteristics and novel features.
$O(1)$-Round MPC Algorithms for Multi-dimensional Grid Graph Connectivity, EMST and DBSCAN	Junhao Gan, Anthony Wirth, Zhuo Zhang	2025-01-21	Download	In this paper, we investigate three fundamental problems in the Massively Parallel Computation (MPC) model: (i) grid graph connectivity, (ii) approximate Euclidean Minimum Spanning Tree (EMST), and (i...
Accelerating Recommender Model ETL with a Streaming FPGA-GPU Dataflow	Yu Zhu, Wenqi Jiang, Piyumi Jasin Pathiranage, Yongjun He, Gustavo Alonso	2025-01-21	Download	The real-time performance of recommender models depends on the continuous integration of massive volumes of new user interaction data into training pipelines.
Binary integer programming for optimizing ebit cost in distributed quantum circuits with fixed module allocation	Hyunho Cha, Jungwoo Lee	2025-01-21	Download	Modular and networked quantum architectures can scale beyond the qubit count of a single device, but executing a circuit across modules requires implementing non-local two-qubit gates using shared ent...
SPID-Chain: A Smart Contract-Enabled, Polar-Coded Interoperable DAG Chain	Amirhossein Taherpour, Xiaodong Wang	2025-01-21	Download	As the digital landscape evolves, Web3 has gained prominence, highlighting the critical role of decentralized, interconnected, and verifiable digital ecosystems.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Sequence Spreading-Based Semantic Communication Under High RF Interference	Hazem Barka, Georges Kaddoum, Mehdi Bennis, Md Sahabul Alam, Minh Au	2025-01-21	Download	In the evolving landscape of wireless communications, semantic communication (SemCom) has recently emerged as a 6G enabler that prioritizes the transmission of meaning and contextual relevance over co...
Light commodity devices for building vehicular ad hoc networks: An experimental study	Jamal Toutouh, Enrique Alba	2025-01-21	Download	Vehicular communication networks represent both an opportunity and a challenge for providing smart mobility services by using a hybrid solution that relies on cellular connectivity and short range com...
QoS-Aware Radio Access Technology (RAT) Selection in Hybrid Vehicular Networks	Zeeshan Hameed Mir, Jamal Toutouh, Fethi Filali, Enrique Alba	2025-01-21	Download	The increasing number of wireless communication technologies and standards bring immense opportunities and challenges to provide seamless connectivity in Hybrid Vehicular Networks (HVNs).
A Stochastic Geometry Based Techno-Economic Analysis of RIS-Assisted Cellular Networks	Guodong Sun, Francois Baccelli, Luis Uzeda Garcia, Stefano Paris	2025-01-21	Download	Reconfigurable intelligent surfaces (RISs) are a promising technology for enhancing cellular network performance and yielding additional value to network operators.
Harnessing Generative Pre-Trained Transformer for Datacenter Packet Trace Generation	Chen Griner	2025-01-21	Download	Today, the rapid growth of applications reliant on datacenters calls for new advancements to meet the increasing traffic and computational demands.
Power Amplifier-Aware Transmit Power Optimization for OFDM and SC-FDMA Systems	Pawel Kryszkiewicz	2025-01-21	Download	The Single Carrier-Frequency Division Multiple Access (SC-FDMA) is a transmission technique used in the uplink of Long Term Evolution (LTE) and 5G systems, as it is characterized by reduced transmitte...
Message Replication for Improving Reliability of LR-FHSS Direct-to-Satellite IoT	Sonu Rathi, Siddhartha S. Borkotoky	2025-01-21	Download	Long-range frequency-hopping spread spectrum (LR-FHSS) promises to enhance network capacity by integrating frequency hopping into existing Long Range Wide Area Networks (LoRaWANs).
RayLoc: Wireless Indoor Localization via Fully Differentiable Ray-tracing	Xueqiang Han, Tianyue Zheng, Tony Xiao Han, Jun Luo	2025-01-21	Download	Wireless indoor localization has been a pivotal area of research over the last two decades, becoming a cornerstone for numerous sensing applications.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
An Empirical Characterization of Outages and Incidents in Public Services for Large Language Models	Xiaoyu Chu, Sacheendra Talluri, Qingxian Lu, Alexandru Iosup	2025-01-21	Download	People and businesses increasingly rely on public LLM services, such as ChatGPT, DALLE, and Claude. Understanding their outages, and particularly measuring their failure-recovery processes, is becomin...
Dissecting the NVIDIA Hopper Architecture through Microbenchmarking and Multiple Level Analysis	Weile Luo, Ruibo Fan, Zeyu Li, Dayou Du, Hongyuan Liu, Qiang Wang, Xiaowen Chu	2025-01-21	Download	This study presents a comprehensive multi-level analysis of the NVIDIA Hopper GPU architecture, focusing on its performance characteristics and novel features.
A Stochastic Geometry Based Techno-Economic Analysis of RIS-Assisted Cellular Networks	Guodong Sun, Francois Baccelli, Luis Uzeda Garcia, Stefano Paris	2025-01-21	Download	Reconfigurable intelligent surfaces (RISs) are a promising technology for enhancing cellular network performance and yielding additional value to network operators.