2024-09-08

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
BBS: Bi-directional Bit-level Sparsity for Deep Learning Acceleration	Yuzong Chen, Jian Meng, Jae-sun Seo, Mohamed S. Abdelfattah	2024-09-08	Download	Bit-level sparsity methods skip ineffectual zero-bit operations and are typically applicable within bit-serial deep learning accelerators. This type of sparsity at the bit-level is especially interest...
InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference	Xiurui Pan, Endian Li, Qiao Li, Shengwen Liang, Yizhou Shan, Ke Zhou, Yingwei Luo, Xiaolin Wang, Jie Zhang	2024-09-08	Download	The widespread of Large Language Models (LLMs) marks a significant milestone in generative AI. Nevertheless, the increasing context length and batch size in offline LLM inference escalate the memory r...
HYDRA: Hybrid Data Multiplexing and Run-time Layer Configurable DNN Accelerator	Sonu Kumar, Komal Gupta, Gopal Raut, Mukul Lokhande, Santosh Kumar Vishvakarma	2024-09-08	Download	Deep neural networks (DNNs) offer plenty of challenges in executing efficient computation at edge nodes, primarily due to the huge hardware resource demands.
An Analog and Digital Hybrid Attention Accelerator for Transformers with Charge-based In-memory Computing	Ashkan Moradifirouzabadi, Divya Sri Dodla, Mingu Kang	2024-09-08	Download	The attention mechanism is a key computing kernel of Transformers, calculating pairwise correlations across the entire input sequence. The computing complexity and frequent memory access in computing ...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
FedFT: Improving Communication Performance for Federated Learning with Frequency Space Transformation	Chamath Palihawadana, Nirmalie Wiratunga, Anjana Wijekoon, Harsha Kalutarage	2024-09-08	Download	Communication efficiency is a widely recognised research problem in Federated Learning (FL), with recent work focused on developing techniques for efficient compression, distribution and aggregation o...
CloudNativeSim: a toolkit for modeling and simulation of cloud-native applications	Jingfeng Wu, Minxian Xu, Yiyuan He, Kejiang Ye, Chengzhong Xu	2024-09-08	Download	Cloud-native applications are increasingly becoming popular in modern software design. Employing a microservice-based architecture into these applications is a prevalent strategy that enhances system ...
ARIM-mdx Data System: Towards a Nationwide Data Platform for Materials Science	Masatoshi Hanai, Ryo Ishikawa, Mitsuaki Kawamura, Masato Ohnishi, Norio Takenaka, Kou Nakamura, Daiju Matsumura, Seiji Fujikawa, Hiroki Sakamoto, Yukinori Ochiai, Tetsuo Okane, Shin-Ichiro Kuroki, Atsuo Yamada, Toyotaro Suzumura, Junichiro Shiomi, Kenjiro Taura, Yoshio Mita, Naoya Shibata, Yuichi Ikuhara	2024-09-08	Download	In modern materials science, effective and high-volume data management across leading-edge experimental facilities and world-class supercomputers is indispensable for cutting-edge research.
A Double Tracking Method for Optimization with Decentralized Generalized Orthogonality Constraints	Lei Wang, Nachuan Xiao, Xin Liu	2024-09-08	Download	In this paper, we consider the decentralized optimization problems with generalized orthogonality constraints, where both the objective function and the constraint exhibit a distributed structure.
Elastic On-Device LLM Service	Wangsong Yin, Rongjie Yi, Daliang Xu, Gang Huang, Mengwei Xu, Xuanzhe Liu	2024-09-08	Download	On-device Large Language Models (LLMs) are transforming mobile AI, catalyzing applications like UI automation without privacy concerns. Nowadays the common practice is to deploy a single yet powerful ...
A Comprehensive Analysis of Process Energy Consumption on Multi-Socket Systems with GPUs	Luis G. León-Vega, Niccolò Tosato, Stefano Cozzini	2024-09-08	Download	Robustly estimating energy consumption in High-Performance Computing (HPC) is essential for assessing the energy footprint of modern workloads, particularly in fields such as Artificial Intelligence (...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
NetDPSyn: Synthesizing Network Traces under Differential Privacy	Danyu Sun, Joann Qiongna Chen, Chen Gong, Tianhao Wang, Zhou Li	2024-09-08	Download	As the utilization of network traces for the network measurement research becomes increasingly prevalent, concerns regarding privacy leakage from network traces have garnered the public's attention.
Towards an AI/ML-driven SMO Framework in O-RAN: Scenarios, Solutions, and Challenges	Mohammad Asif Habibi, Bin Han, Merve Saimler, Ignacio Labrador Pavon, Hans D. Schotten	2024-09-08	Download	The emergence of the open radio access network (O-RAN) architecture offers a paradigm shift in cellular network management and service orchestration, leveraging data-driven, intent-based, autonomous, ...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
From Concept to Reality: 5G Positioning with Open-Source Implementation of UL-TDoA in OpenAirInterface	Adeel Malik, Mohsen Ahadi, Florian Kaltenberger, Klaus Warnke, Nguyen Tien Thinh, Nada Bouknana, Cedric Thienot, Godswill Onche, Sagar Arora	2024-09-08	Download	This paper presents, for the first time, an open-source implementation of the 3GPP Uplink Time Difference of Arrival (UL-TDoA) positioning method using the OpenAirInterface (OAI) framework.
A Comprehensive Analysis of Process Energy Consumption on Multi-Socket Systems with GPUs	Luis G. León-Vega, Niccolò Tosato, Stefano Cozzini	2024-09-08	Download	Robustly estimating energy consumption in High-Performance Computing (HPC) is essential for assessing the energy footprint of modern workloads, particularly in fields such as Artificial Intelligence (...