
2025-03-10

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| H3PIMAP: A Heterogeneity-Aware Multi-Objective DNN Mapping Framework on Electronic-Photonic Processing-in-Memory Architectures | Ziang Yin, Aashish Poonia, Ashish Reddy Bommana, Xinyu Zhao, Zahra Hojati, Tianlong Chen, Krishnendu Chakrabarty, Farshad Firouzi, Jeff Zhang, Jiaqi Gu | 2025-03-10 | Download | The future of artificial intelligence (AI) acceleration demands a paradigm shift beyond the limitations of purely electronic or photonic architectures. |
| Cool-3D: An End-to-End Thermal-Aware Framework for Early-Phase Design Space Exploration of Microfluidic-Cooled 3DICs | Runxi Wang, Ziheng Wang, Ting Lin, Jacob M. Raby, Mircea R. Stan, Xinfei Guo | 2025-03-10 | Download | The rapid advancement of three-dimensional integrated circuits (3DICs) has heightened the need for early-phase design space exploration (DSE) to minimize design iterations and unexpected challenges. |
| An Analytical Cost Model for Fast Evaluation of Multiple Compute-Engine CNN Accelerators | Fareed Qararyah, Mohammad Ali Maleki, Pedro Trancoso | 2025-03-10 | Download | Convolutional Neural Networks (CNNs) serve various applications with diverse performance and resource requirements. Model-aware CNN accelerators best address these diverse requirements. |
| xPUE: Extending Power Usage Effectiveness Metrics for Cloud Infrastructures | Guillaume Fieni, Romain Rouvoy, Lionel Seinturier | 2025-03-10 | Download | The energy consumption analysis and optimization of data centers have been an increasingly popular topic over the past few years. It is widely recognized that several effective metrics exist to captur... |
| FIGLUT: An Energy-Efficient Accelerator Design for FP-INT GEMM Using Look-Up Tables | Gunho Park, Hyeokjun Kwon, Jiwoo Kim, Jeongin Bae, Baeseong Park, Dongsoo Lee, Youngjoo Lee | 2025-03-10 | Download | Weight-only quantization has emerged as a promising solution to the deployment challenges of large language models (LLMs). However, it necessitates FP-INT operations, which make implementation on gene... |
| A Novel Thermal Network Model and Electro-Thermal Coupling Study for NSFETs and CFETs Considering Thermal Crosstalk | Tianci Miao, Qihang Zheng, Yangyang Hu, Xiaoyu Cheng, Jie Liang, Liang Chen, Aiying Guo, Jingjing Liu, Kailin Ren, Jianhua Zhang | 2025-03-10 | Download | As the technology node continues to shrink, nanosheet field effect transistors (NSFETs) and complementary FETs (CFETs) become valid candidates for the 3nm and sub-nanometre nodes. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Disaggregated Design for GPU-Based Volumetric Data Structures | Massimiliano Meneghin, Ahmed H. Mahmoud | 2025-03-10 | Download | Volumetric data structures typically prioritize data locality, focusing on efficient memory access patterns. This singular focus can neglect other critical performance factors, such as occupancy, comm... |
| Right Reward Right Time for Federated Learning | Thanh Linh Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Quoc-Viet Pham | 2025-03-10 | Download | Critical learning periods (CLPs) in federated learning (FL) refer to early stages during which low-quality contributions (e.g., sparse training data availability) can permanently impair the performanc... |
| Resource Utilization Optimized Federated Learning | Zihan Zhang, Leon Wong, Blesson Varghese | 2025-03-10 | Download | Federated learning (FL) systems facilitate distributed machine learning across a server and multiple devices. However, FL systems have low resource utilization limiting their practical use in the real... |
| Efficient Distributed Learning over Decentralized Networks with Convoluted Support Vector Machine | Canyi Chen, Nan Qiao, Liping Zhu | 2025-03-10 | Download | This paper addresses the problem of efficiently classifying high-dimensional data over decentralized networks. Penalized support vector machines (SVMs) are widely used for high-dimensional classificat... |
| From Centralized to Decentralized Federated Learning: Theoretical Insights, Privacy Preservation, and Robustness Challenges | Qiongxiu Li, Wenrui Yu, Yufei Xia, Jun Pang | 2025-03-10 | Download | Federated Learning (FL) enables collaborative learning without directly sharing individual's raw data. FL can be implemented in either a centralized (server-based) or decentralized (peer-to-peer) mann... |
| Eva: Cost-Efficient Cloud-Based Cluster Scheduling | Tzu-Tao Chang, Shivaram Venkataraman | 2025-03-10 | Download | Cloud computing offers flexibility in resource provisioning, allowing an organization to host its batch processing workloads cost-efficiently by dynamically scaling the size and composition of a cloud... |
| Availability Modeling for Blockchain Provisioning in Private Clouds | J Dantas, P Silva, L Fiondella, C Melo, P Maciel | 2025-03-10 | Download | Blockchain technology has emerged, and many previous studies have assessed its performance issues. However, less attention has been paid to the dependability attributes, which have been a critical top... |
| Fair Termination of Asynchronous Binary Sessions | Luca Padovani, Gianluigi Zavattaro | 2025-03-10 | Download | We study a theory of asynchronous session types ensuring that well-typed processes terminate under a suitable fairness assumption. Fair termination entails starvation freedom and orphan message freedo... |
| GMB-ECC: Guided Measuring and Benchmarking of the Edge Cloud Continuum | Brian-Frederik Jahnke, Rebecca Schmook, Falk Howar | 2025-03-10 | Download | In the evolving landscape of cloud computing, optimizing energy efficiency across the edge-cloud continuum is crucial for sustainability and cost-effectiveness. |
| xPUE: Extending Power Usage Effectiveness Metrics for Cloud Infrastructures | Guillaume Fieni, Romain Rouvoy, Lionel Seinturier | 2025-03-10 | Download | The energy consumption analysis and optimization of data centers have been an increasingly popular topic over the past few years. It is widely recognized that several effective metrics exist to captur... |
| Encoding Schemes for Parallel In-Place Algorithms | Chase Hutton, Adam Melrod | 2025-03-10 | Download | Many parallel algorithms which solve basic problems in computer science use auxiliary space linear in the input to facilitate conflict-free computation. |
| DynTaskMAS: A Dynamic Task Graph-driven Framework for Asynchronous and Parallel LLM-based Multi-Agent Systems | Junwei Yu, Yepeng Ding, Hiroyuki Sato | 2025-03-10 | Download | The emergence of Large Language Models (LLMs) in Multi-Agent Systems (MAS) has opened new possibilities for artificial intelligence, yet current implementations face significant challenges in resource... |
| eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference | Suraiya Tairin, Shohaib Mahmud, Haiying Shen, Anand Iyer | 2025-03-10 | Download | In recent years, Mixture-of-Experts (MoE) has emerged as an effective approach for enhancing the capacity of deep neural network (DNN) with sub-linear computational costs. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Learning-Based Traffic Classification for Mixed-Critical Flows in Time-Sensitive Networking | Rubi Debnath, Luxi Zhao, Sebastian Steinhorst | 2025-03-10 | Download | Time-Sensitive Networking (TSN) supports multiple traffic types with diverse timing requirements, such as hard real-time (HRT), soft real-time (SRT), and Best Effort (BE) within a single network. |
| Are We There Yet? A Measurement Study of Efficiency for LLM Applications on Mobile Devices | Xiao Yan, Yi Ding | 2025-03-10 | Download | Recent advancements in large language models (LLMs) have prompted interest in deploying these models on mobile devices to enable new applications without relying on cloud connectivity. |
| DRESS: Diffusion Reasoning-based Reward Shaping Scheme For Intelligent Networks | Feiran You, Hongyang Du, Xiangwang Hou, Yong Ren, Kaibin Huang | 2025-03-10 | Download | Network optimization remains fundamental in wireless communications, with Artificial Intelligence (AI)-based solutions gaining widespread adoption. |
| The Interplay of AI-and-RAN: Dynamic Resource Allocation for Converged 6G Platform | Syed Danial Ali Shah, Zeinab Nezami, Maryam Hafeez, Syed Ali Raza Zaidi | 2025-03-10 | Download | The concept of AI-RAN as specified by the AI-RAN alliance is geared to explore a converged 6G platform that can support management, orchestration, and deployment of both AI and RAN workloads. |
| Federated Learning in NTNs: Design, Architecture and Challenges | Amin Farajzadeh, Animesh Yadav, Halim Yanikomeroglu | 2025-03-10 | Download | Non-terrestrial networks (NTNs) are emerging as a core component of future 6G communication systems, providing global connectivity and supporting data-intensive applications. |
| Se-HiLo: Noise-Resilient Semantic Communication with High-and-Low Frequency Decomposition | Zhiyuan Xi, Kun Zhu, Yuanyuan Xu | 2025-03-10 | Download | Semantic communication has emerged as a transformative paradigm in next-generation communication systems, leveraging advanced artificial intelligence (AI) models to extract and transmit semantic repre... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Disaggregated Design for GPU-Based Volumetric Data Structures | Massimiliano Meneghin, Ahmed H. Mahmoud | 2025-03-10 | Download | Volumetric data structures typically prioritize data locality, focusing on efficient memory access patterns. This singular focus can neglect other critical performance factors, such as occupancy, comm... |
| Are We There Yet? A Measurement Study of Efficiency for LLM Applications on Mobile Devices | Xiao Yan, Yi Ding | 2025-03-10 | Download | Recent advancements in large language models (LLMs) have prompted interest in deploying these models on mobile devices to enable new applications without relying on cloud connectivity. |
| An Analytical Cost Model for Fast Evaluation of Multiple Compute-Engine CNN Accelerators | Fareed Qararyah, Mohammad Ali Maleki, Pedro Trancoso | 2025-03-10 | Download | Convolutional Neural Networks (CNNs) serve various applications with diverse performance and resource requirements. Model-aware CNN accelerators best address these diverse requirements. |
