2024-09-06
cs.AR - Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Parallax: A Compiler for Neutral Atom Quantum Computers under Hardware Constraints | Jason Ludmir, Tirthak Patel | 2024-09-06 | Among different quantum computing technologies, neutral atom quantum computers have several advantageous features, such as multi-qubit gates, application-specific topologies, movable qubits, homogenou... |
| OPAL: Outlier-Preserved Microscaling Quantization Accelerator for Generative Large Language Models | Jahyun Koo, Dahoon Park, Sangwoo Jung, Jaeha Kung | 2024-09-06 | To overcome the burden on the memory size and bandwidth due to ever-increasing size of large language models (LLMs), aggressive weight quantization has been recently studied, while lacking research on... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Abstract |
|---|---|---|---|
| CubicML: Automated ML for Large ML Systems Co-design with ML Prediction of Performance | Wei Wen, Quanyu Zhu, Weiwei Chu, Wen-Yen Chen, Jiyan Yang | 2024-09-06 | Scaling up deep learning models has been proven effective to improve intelligence of machine learning (ML) models, especially for industry recommendation models and large language models. |
| Hermes: Memory-Efficient Pipeline Inference for Large Models on Edge Devices | Xueyuan Han, Zinuo Cai, Yichu Zhang, Chongxin Fan, Junhan Liu, Ruhui Ma, Rajkumar Buyya | 2024-09-06 | The application of Transformer-based large models has achieved numerous success in recent years. However, the exponential growth in the parameters of large models introduces formidable memory challeng... |
| Revisiting the Time Cost Model of AllReduce | Dian Xiong, Li Chen, Youhe Jiang, Dan Li, Shuai Wang, Songtao Wang | 2024-09-06 | AllReduce is an important and popular collective communication primitive, which has been widely used in areas such as distributed machine learning and high performance computing. |
| 3D System Design: A Case for Building Customized Modular Systems in 3D | Philip Emma, Eren Kurshan | 2024-09-06 | 3D promises a new dimension in composing systems by aggregating chips. Literally. While the most common uses are still tightly connected with its early forms as a packaging technology, new application... |
| Heterogeneity-Aware Cooperative Federated Edge Learning with Adaptive Computation and Communication Compression | Zhenxiao Zhang, Zhidong Gao, Yuanxiong Guo, Yanmin Gong | 2024-09-06 | Motivated by the drawbacks of cloud-based federated learning (FL), cooperative federated edge learning (CFEL) has been proposed to improve efficiency for FL over mobile edge networks, where multiple e... |
| Confidential Computing on NVIDIA Hopper GPUs: A Performance Benchmark Study | Jianwei Zhu, Hang Yin, Peng Deng, Aline Almeida, Shunfan Zhou | 2024-09-06 | This report evaluates the performance impact of enabling Trusted Execution Environments (TEE) on NVIDIA Hopper GPUs for large language model (LLM) inference tasks. |
| A Hybrid Vectorized Merge Sort on ARM NEON | Jincheng Zhou, Jin Zhang, Xiang Zhang, Tiaojie Xiao, Di Ma, Chunye Gong | 2024-09-06 | Sorting algorithms are the most extensively researched topics in computer science and serve for numerous practical applications. Although various sorts have been proposed for efficiency, different arc... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Digital Twin Enabled Data-Driven Approach for Traffic Efficiency and Software-Defined Vehicular Network Optimization | Mohammad Sajid Shahriar, Suresh Subramaniam, Motoharu Matsuura, Hiroshi Hasegawa, Shih-Chun Lin | 2024-09-06 | In the realms of the internet of vehicles (IoV) and intelligent transportation systems (ITS), software defined vehicular networks (SDVN) and edge computing (EC) have emerged as promising technologies ... |
| Fast Adaptation for Deep Learning-based Wireless Communications | Ouya Wang, Hengtao He, Shenglong Zhou, Zhi Ding, Shi Jin, Khaled B. Letaief, Geoffrey Ye Li | 2024-09-06 | The integration with artificial intelligence (AI) is recognized as one of the six usage scenarios in next-generation wireless communications. However, several critical challenges hinder the widespread... |
| Minimizing Power Consumption under SINR Constraints for Cell-Free Massive MIMO in O-RAN | Vaishnavi Kasuluru, Luis Blanco, Miguel Angel Vazquez, Cristian J. Vaca-Rubio, Engin Zeydan | 2024-09-06 | This paper deals with the problem of energy consumption minimization in Open RAN cell-free (CF) massive Multiple-Input Multiple-Output (mMIMO) systems under minimum per-user signal-to-noise-plus-inter... |
| A Centralized Discovery-Based Method for Integrating Data Distribution Service and Time-Sensitive Networking in In-Vehicle Networks | Feng Luo, Yi Ren, Yanhua Yu, Yunpeng Li, Zitong Wang | 2024-09-06 | As the electronic and electrical architecture (E/EA) of intelligent and connected vehicles (ICVs) evolves, traditional distributed and signal-oriented architectures are being replaced by centralized, ... |
cs.OS - Operating Systems
| Title | Authors | Published | Abstract |
|---|---|---|---|
| The HitchHiker's Guide to High-Assurance System Observability Protection with Efficient Permission Switches | Chuqi Zhang, Jun Zeng, Yiming Zhang, Adil Ahmad, Fengwei Zhang, Hai Jin, Zhenkai Liang | 2024-09-06 | Protecting system observability records (logs) from compromised OSs has gained significant traction in recent times, with several note-worthy approaches proposed. |
cs.PF - Performance
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Confidential Computing on NVIDIA Hopper GPUs: A Performance Benchmark Study | Jianwei Zhu, Hang Yin, Peng Deng, Aline Almeida, Shunfan Zhou | 2024-09-06 | This report evaluates the performance impact of enabling Trusted Execution Environments (TEE) on NVIDIA Hopper GPUs for large language model (LLM) inference tasks. |