2024-09-08

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| BBS: Bi-directional Bit-level Sparsity for Deep Learning Acceleration | Yuzong Chen, Jian Meng, Jae-sun Seo, Mohamed S. Abdelfattah | 2024-09-08 | Download | Bit-level sparsity methods skip ineffectual zero-bit operations and are typically applicable within bit-serial deep learning accelerators. This type of sparsity at the bit-level is especially interest... |
| InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference | Xiurui Pan, Endian Li, Qiao Li, Shengwen Liang, Yizhou Shan, Ke Zhou, Yingwei Luo, Xiaolin Wang, Jie Zhang | 2024-09-08 | Download | The widespread of Large Language Models (LLMs) marks a significant milestone in generative AI. Nevertheless, the increasing context length and batch size in offline LLM inference escalate the memory r... |
| HYDRA: Hybrid Data Multiplexing and Run-time Layer Configurable DNN Accelerator | Sonu Kumar, Komal Gupta, Gopal Raut, Mukul Lokhande, Santosh Kumar Vishvakarma | 2024-09-08 | Download | Deep neural networks (DNNs) offer plenty of challenges in executing efficient computation at edge nodes, primarily due to the huge hardware resource demands. |
| An Analog and Digital Hybrid Attention Accelerator for Transformers with Charge-based In-memory Computing | Ashkan Moradifirouzabadi, Divya Sri Dodla, Mingu Kang | 2024-09-08 | Download | The attention mechanism is a key computing kernel of Transformers, calculating pairwise correlations across the entire input sequence. The computing complexity and frequent memory access in computing ... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| FedFT: Improving Communication Performance for Federated Learning with Frequency Space Transformation | Chamath Palihawadana, Nirmalie Wiratunga, Anjana Wijekoon, Harsha Kalutarage | 2024-09-08 | Download | Communication efficiency is a widely recognised research problem in Federated Learning (FL), with recent work focused on developing techniques for efficient compression, distribution and aggregation o... |
| CloudNativeSim: a toolkit for modeling and simulation of cloud-native applications | Jingfeng Wu, Minxian Xu, Yiyuan He, Kejiang Ye, Chengzhong Xu | 2024-09-08 | Download | Cloud-native applications are increasingly becoming popular in modern software design. Employing a microservice-based architecture into these applications is a prevalent strategy that enhances system ... |
| ARIM-mdx Data System: Towards a Nationwide Data Platform for Materials Science | Masatoshi Hanai, Ryo Ishikawa, Mitsuaki Kawamura, Masato Ohnishi, Norio Takenaka, Kou Nakamura, Daiju Matsumura, Seiji Fujikawa, Hiroki Sakamoto, Yukinori Ochiai, Tetsuo Okane, Shin-Ichiro Kuroki, Atsuo Yamada, Toyotaro Suzumura, Junichiro Shiomi, Kenjiro Taura, Yoshio Mita, Naoya Shibata, Yuichi Ikuhara | 2024-09-08 | Download | In modern materials science, effective and high-volume data management across leading-edge experimental facilities and world-class supercomputers is indispensable for cutting-edge research. |
| A Double Tracking Method for Optimization with Decentralized Generalized Orthogonality Constraints | Lei Wang, Nachuan Xiao, Xin Liu | 2024-09-08 | Download | In this paper, we consider the decentralized optimization problems with generalized orthogonality constraints, where both the objective function and the constraint exhibit a distributed structure. |
| Elastic On-Device LLM Service | Wangsong Yin, Rongjie Yi, Daliang Xu, Gang Huang, Mengwei Xu, Xuanzhe Liu | 2024-09-08 | Download | On-device Large Language Models (LLMs) are transforming mobile AI, catalyzing applications like UI automation without privacy concerns. Nowadays the common practice is to deploy a single yet powerful ... |
| A Comprehensive Analysis of Process Energy Consumption on Multi-Socket Systems with GPUs | Luis G. León-Vega, Niccolò Tosato, Stefano Cozzini | 2024-09-08 | Download | Robustly estimating energy consumption in High-Performance Computing (HPC) is essential for assessing the energy footprint of modern workloads, particularly in fields such as Artificial Intelligence (... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| NetDPSyn: Synthesizing Network Traces under Differential Privacy | Danyu Sun, Joann Qiongna Chen, Chen Gong, Tianhao Wang, Zhou Li | 2024-09-08 | Download | As the utilization of network traces for the network measurement research becomes increasingly prevalent, concerns regarding privacy leakage from network traces have garnered the public's attention. |
| Towards an AI/ML-driven SMO Framework in O-RAN: Scenarios, Solutions, and Challenges | Mohammad Asif Habibi, Bin Han, Merve Saimler, Ignacio Labrador Pavon, Hans D. Schotten | 2024-09-08 | Download | The emergence of the open radio access network (O-RAN) architecture offers a paradigm shift in cellular network management and service orchestration, leveraging data-driven, intent-based, autonomous, ... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| From Concept to Reality: 5G Positioning with Open-Source Implementation of UL-TDoA in OpenAirInterface | Adeel Malik, Mohsen Ahadi, Florian Kaltenberger, Klaus Warnke, Nguyen Tien Thinh, Nada Bouknana, Cedric Thienot, Godswill Onche, Sagar Arora | 2024-09-08 | Download | This paper presents, for the first time, an open-source implementation of the 3GPP Uplink Time Difference of Arrival (UL-TDoA) positioning method using the OpenAirInterface (OAI) framework. |
| A Comprehensive Analysis of Process Energy Consumption on Multi-Socket Systems with GPUs | Luis G. León-Vega, Niccolò Tosato, Stefano Cozzini | 2024-09-08 | Download | Robustly estimating energy consumption in High-Performance Computing (HPC) is essential for assessing the energy footprint of modern workloads, particularly in fields such as Artificial Intelligence (... |
