2025-03-10

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
H3PIMAP: A Heterogeneity-Aware Multi-Objective DNN Mapping Framework on Electronic-Photonic Processing-in-Memory Architectures	Ziang Yin, Aashish Poonia, Ashish Reddy Bommana, Xinyu Zhao, Zahra Hojati, Tianlong Chen, Krishnendu Chakrabarty, Farshad Firouzi, Jeff Zhang, Jiaqi Gu	2025-03-10	Download	The future of artificial intelligence (AI) acceleration demands a paradigm shift beyond the limitations of purely electronic or photonic architectures.
Cool-3D: An End-to-End Thermal-Aware Framework for Early-Phase Design Space Exploration of Microfluidic-Cooled 3DICs	Runxi Wang, Ziheng Wang, Ting Lin, Jacob M. Raby, Mircea R. Stan, Xinfei Guo	2025-03-10	Download	The rapid advancement of three-dimensional integrated circuits (3DICs) has heightened the need for early-phase design space exploration (DSE) to minimize design iterations and unexpected challenges.
An Analytical Cost Model for Fast Evaluation of Multiple Compute-Engine CNN Accelerators	Fareed Qararyah, Mohammad Ali Maleki, Pedro Trancoso	2025-03-10	Download	Convolutional Neural Networks (CNNs) serve various applications with diverse performance and resource requirements. Model-aware CNN accelerators best address these diverse requirements.
xPUE: Extending Power Usage Effectiveness Metrics for Cloud Infrastructures	Guillaume Fieni, Romain Rouvoy, Lionel Seinturier	2025-03-10	Download	The energy consumption analysis and optimization of data centers have been an increasingly popular topic over the past few years. It is widely recognized that several effective metrics exist to captur...
FIGLUT: An Energy-Efficient Accelerator Design for FP-INT GEMM Using Look-Up Tables	Gunho Park, Hyeokjun Kwon, Jiwoo Kim, Jeongin Bae, Baeseong Park, Dongsoo Lee, Youngjoo Lee	2025-03-10	Download	Weight-only quantization has emerged as a promising solution to the deployment challenges of large language models (LLMs). However, it necessitates FP-INT operations, which make implementation on gene...
A Novel Thermal Network Model and Electro-Thermal Coupling Study for NSFETs and CFETs Considering Thermal Crosstalk	Tianci Miao, Qihang Zheng, Yangyang Hu, Xiaoyu Cheng, Jie Liang, Liang Chen, Aiying Guo, Jingjing Liu, Kailin Ren, Jianhua Zhang	2025-03-10	Download	As the technology node continues to shrink, nanosheet field effect transistors (NSFETs) and complementary FETs (CFETs) become valid candidates for the 3nm and sub-nanometre nodes.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Disaggregated Design for GPU-Based Volumetric Data Structures	Massimiliano Meneghin, Ahmed H. Mahmoud	2025-03-10	Download	Volumetric data structures typically prioritize data locality, focusing on efficient memory access patterns. This singular focus can neglect other critical performance factors, such as occupancy, comm...
Right Reward Right Time for Federated Learning	Thanh Linh Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Quoc-Viet Pham	2025-03-10	Download	Critical learning periods (CLPs) in federated learning (FL) refer to early stages during which low-quality contributions (e.g., sparse training data availability) can permanently impair the performanc...
Resource Utilization Optimized Federated Learning	Zihan Zhang, Leon Wong, Blesson Varghese	2025-03-10	Download	Federated learning (FL) systems facilitate distributed machine learning across a server and multiple devices. However, FL systems have low resource utilization limiting their practical use in the real...
Efficient Distributed Learning over Decentralized Networks with Convoluted Support Vector Machine	Canyi Chen, Nan Qiao, Liping Zhu	2025-03-10	Download	This paper addresses the problem of efficiently classifying high-dimensional data over decentralized networks. Penalized support vector machines (SVMs) are widely used for high-dimensional classificat...
From Centralized to Decentralized Federated Learning: Theoretical Insights, Privacy Preservation, and Robustness Challenges	Qiongxiu Li, Wenrui Yu, Yufei Xia, Jun Pang	2025-03-10	Download	Federated Learning (FL) enables collaborative learning without directly sharing individual's raw data. FL can be implemented in either a centralized (server-based) or decentralized (peer-to-peer) mann...
Eva: Cost-Efficient Cloud-Based Cluster Scheduling	Tzu-Tao Chang, Shivaram Venkataraman	2025-03-10	Download	Cloud computing offers flexibility in resource provisioning, allowing an organization to host its batch processing workloads cost-efficiently by dynamically scaling the size and composition of a cloud...
Availability Modeling for Blockchain Provisioning in Private Clouds	J Dantas, P Silva, L Fiondella, C Melo, P Maciel	2025-03-10	Download	Blockchain technology has emerged, and many previous studies have assessed its performance issues. However, less attention has been paid to the dependability attributes, which have been a critical top...
Fair Termination of Asynchronous Binary Sessions	Luca Padovani, Gianluigi Zavattaro	2025-03-10	Download	We study a theory of asynchronous session types ensuring that well-typed processes terminate under a suitable fairness assumption. Fair termination entails starvation freedom and orphan message freedo...
GMB-ECC: Guided Measuring and Benchmarking of the Edge Cloud Continuum	Brian-Frederik Jahnke, Rebecca Schmook, Falk Howar	2025-03-10	Download	In the evolving landscape of cloud computing, optimizing energy efficiency across the edge-cloud continuum is crucial for sustainability and cost-effectiveness.
xPUE: Extending Power Usage Effectiveness Metrics for Cloud Infrastructures	Guillaume Fieni, Romain Rouvoy, Lionel Seinturier	2025-03-10	Download	The energy consumption analysis and optimization of data centers have been an increasingly popular topic over the past few years. It is widely recognized that several effective metrics exist to captur...
Encoding Schemes for Parallel In-Place Algorithms	Chase Hutton, Adam Melrod	2025-03-10	Download	Many parallel algorithms which solve basic problems in computer science use auxiliary space linear in the input to facilitate conflict-free computation.
DynTaskMAS: A Dynamic Task Graph-driven Framework for Asynchronous and Parallel LLM-based Multi-Agent Systems	Junwei Yu, Yepeng Ding, Hiroyuki Sato	2025-03-10	Download	The emergence of Large Language Models (LLMs) in Multi-Agent Systems (MAS) has opened new possibilities for artificial intelligence, yet current implementations face significant challenges in resource...
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference	Suraiya Tairin, Shohaib Mahmud, Haiying Shen, Anand Iyer	2025-03-10	Download	In recent years, Mixture-of-Experts (MoE) has emerged as an effective approach for enhancing the capacity of deep neural network (DNN) with sub-linear computational costs.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Learning-Based Traffic Classification for Mixed-Critical Flows in Time-Sensitive Networking	Rubi Debnath, Luxi Zhao, Sebastian Steinhorst	2025-03-10	Download	Time-Sensitive Networking (TSN) supports multiple traffic types with diverse timing requirements, such as hard real-time (HRT), soft real-time (SRT), and Best Effort (BE) within a single network.
Are We There Yet? A Measurement Study of Efficiency for LLM Applications on Mobile Devices	Xiao Yan, Yi Ding	2025-03-10	Download	Recent advancements in large language models (LLMs) have prompted interest in deploying these models on mobile devices to enable new applications without relying on cloud connectivity.
DRESS: Diffusion Reasoning-based Reward Shaping Scheme For Intelligent Networks	Feiran You, Hongyang Du, Xiangwang Hou, Yong Ren, Kaibin Huang	2025-03-10	Download	Network optimization remains fundamental in wireless communications, with Artificial Intelligence (AI)-based solutions gaining widespread adoption.
The Interplay of AI-and-RAN: Dynamic Resource Allocation for Converged 6G Platform	Syed Danial Ali Shah, Zeinab Nezami, Maryam Hafeez, Syed Ali Raza Zaidi	2025-03-10	Download	The concept of AI-RAN as specified by the AI-RAN alliance is geared to explore a converged 6G platform that can support management, orchestration, and deployment of both AI and RAN workloads.
Federated Learning in NTNs: Design, Architecture and Challenges	Amin Farajzadeh, Animesh Yadav, Halim Yanikomeroglu	2025-03-10	Download	Non-terrestrial networks (NTNs) are emerging as a core component of future 6G communication systems, providing global connectivity and supporting data-intensive applications.
Se-HiLo: Noise-Resilient Semantic Communication with High-and-Low Frequency Decomposition	Zhiyuan Xi, Kun Zhu, Yuanyuan Xu	2025-03-10	Download	Semantic communication has emerged as a transformative paradigm in next-generation communication systems, leveraging advanced artificial intelligence (AI) models to extract and transmit semantic repre...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Disaggregated Design for GPU-Based Volumetric Data Structures	Massimiliano Meneghin, Ahmed H. Mahmoud	2025-03-10	Download	Volumetric data structures typically prioritize data locality, focusing on efficient memory access patterns. This singular focus can neglect other critical performance factors, such as occupancy, comm...
Are We There Yet? A Measurement Study of Efficiency for LLM Applications on Mobile Devices	Xiao Yan, Yi Ding	2025-03-10	Download	Recent advancements in large language models (LLMs) have prompted interest in deploying these models on mobile devices to enable new applications without relying on cloud connectivity.
An Analytical Cost Model for Fast Evaluation of Multiple Compute-Engine CNN Accelerators	Fareed Qararyah, Mohammad Ali Maleki, Pedro Trancoso	2025-03-10	Download	Convolutional Neural Networks (CNNs) serve various applications with diverse performance and resource requirements. Model-aware CNN accelerators best address these diverse requirements.