2024-09-08
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| BBS: Bi-directional Bit-level Sparsity for Deep Learning Acceleration | Yuzong Chen, Jian Meng, Jae-sun Seo, Mohamed S. Abdelfattah | 2024-09-08 | Download | Bit-level sparsity methods skip ineffectual zero-bit operations and are typically applicable within bit-serial deep learning accelerators. This type of sparsity at the bit-level is especially interest... |
| InstInfer: In-Storage Attention Offloading for Cost-Effective Long-Context LLM Inference | Xiurui Pan, Endian Li, Qiao Li, Shengwen Liang, Yizhou Shan, Ke Zhou, Yingwei Luo, Xiaolin Wang, Jie Zhang | 2024-09-08 | Download | The widespread of Large Language Models (LLMs) marks a significant milestone in generative AI. Nevertheless, the increasing context length and batch size in offline LLM inference escalate the memory r... |
| HYDRA: Hybrid Data Multiplexing and Run-time Layer Configurable DNN Accelerator | Sonu Kumar, Komal Gupta, Gopal Raut, Mukul Lokhande, Santosh Kumar Vishvakarma | 2024-09-08 | Download | Deep neural networks (DNNs) offer plenty of challenges in executing efficient computation at edge nodes, primarily due to the huge hardware resource demands. |
| An Analog and Digital Hybrid Attention Accelerator for Transformers with Charge-based In-memory Computing | Ashkan Moradifirouzabadi, Divya Sri Dodla, Mingu Kang | 2024-09-08 | Download | The attention mechanism is a key computing kernel of Transformers, calculating pairwise correlations across the entire input sequence. The computing complexity and frequent memory access in computing ... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| FedFT: Improving Communication Performance for Federated Learning with Frequency Space Transformation | Chamath Palihawadana, Nirmalie Wiratunga, Anjana Wijekoon, Harsha Kalutarage | 2024-09-08 | Download | Communication efficiency is a widely recognised research problem in Federated Learning (FL), with recent work focused on developing techniques for efficient compression, distribution and aggregation o... |
| CloudNativeSim: a toolkit for modeling and simulation of cloud-native applications | Jingfeng Wu, Minxian Xu, Yiyuan He, Kejiang Ye, Chengzhong Xu | 2024-09-08 | Download | Cloud-native applications are increasingly becoming popular in modern software design. Employing a microservice-based architecture into these applications is a prevalent strategy that enhances system ... |
| ARIM-mdx Data System: Towards a Nationwide Data Platform for Materials Science | Masatoshi Hanai, Ryo Ishikawa, Mitsuaki Kawamura, Masato Ohnishi, Norio Takenaka, Kou Nakamura, Daiju Matsumura, Seiji Fujikawa, Hiroki Sakamoto, Yukinori Ochiai, Tetsuo Okane, Shin-Ichiro Kuroki, Atsuo Yamada, Toyotaro Suzumura, Junichiro Shiomi, Kenjiro Taura, Yoshio Mita, Naoya Shibata, Yuichi Ikuhara | 2024-09-08 | Download | In modern materials science, effective and high-volume data management across leading-edge experimental facilities and world-class supercomputers is indispensable for cutting-edge research. |
| A Double Tracking Method for Optimization with Decentralized Generalized Orthogonality Constraints | Lei Wang, Nachuan Xiao, Xin Liu | 2024-09-08 | Download | In this paper, we consider the decentralized optimization problems with generalized orthogonality constraints, where both the objective function and the constraint exhibit a distributed structure. |
| Elastic On-Device LLM Service | Wangsong Yin, Rongjie Yi, Daliang Xu, Gang Huang, Mengwei Xu, Xuanzhe Liu | 2024-09-08 | Download | On-device Large Language Models (LLMs) are transforming mobile AI, catalyzing applications like UI automation without privacy concerns. Nowadays the common practice is to deploy a single yet powerful ... |
| A Comprehensive Analysis of Process Energy Consumption on Multi-Socket Systems with GPUs | Luis G. León-Vega, Niccolò Tosato, Stefano Cozzini | 2024-09-08 | Download | Robustly estimating energy consumption in High-Performance Computing (HPC) is essential for assessing the energy footprint of modern workloads, particularly in fields such as Artificial Intelligence (... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| NetDPSyn: Synthesizing Network Traces under Differential Privacy | Danyu Sun, Joann Qiongna Chen, Chen Gong, Tianhao Wang, Zhou Li | 2024-09-08 | Download | As the utilization of network traces for the network measurement research becomes increasingly prevalent, concerns regarding privacy leakage from network traces have garnered the public's attention. |
| Towards an AI/ML-driven SMO Framework in O-RAN: Scenarios, Solutions, and Challenges | Mohammad Asif Habibi, Bin Han, Merve Saimler, Ignacio Labrador Pavon, Hans D. Schotten | 2024-09-08 | Download | The emergence of the open radio access network (O-RAN) architecture offers a paradigm shift in cellular network management and service orchestration, leveraging data-driven, intent-based, autonomous, ... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| From Concept to Reality: 5G Positioning with Open-Source Implementation of UL-TDoA in OpenAirInterface | Adeel Malik, Mohsen Ahadi, Florian Kaltenberger, Klaus Warnke, Nguyen Tien Thinh, Nada Bouknana, Cedric Thienot, Godswill Onche, Sagar Arora | 2024-09-08 | Download | This paper presents, for the first time, an open-source implementation of the 3GPP Uplink Time Difference of Arrival (UL-TDoA) positioning method using the OpenAirInterface (OAI) framework. |
| A Comprehensive Analysis of Process Energy Consumption on Multi-Socket Systems with GPUs | Luis G. León-Vega, Niccolò Tosato, Stefano Cozzini | 2024-09-08 | Download | Robustly estimating energy consumption in High-Performance Computing (HPC) is essential for assessing the energy footprint of modern workloads, particularly in fields such as Artificial Intelligence (... |