2024-06-12
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference | Christopher Wolters, Xiaoxuan Yang, Ulf Schlichtmann, Toyotaro Suzumura | 2024-06-12 | Download | Large language models (LLMs) have recently transformed natural language processing, enabling machines to generate human-like text and engage in meaningful conversations. |
| Continuous-Time Digital Twin with Analogue Memristive Neural Ordinary Differential Equation Solver | Hegan Chen, Jichang Yang, Jia Chen, Songqi Wang, Shaocong Wang, Dingchen Wang, Xinyu Tian, Yifei Yu, Xi Chen, Yinan Lin, Yangu He, Xiaoshan Wu, Yi Li, Xinyuan Zhang, Ning Lin, Meng Xu, Yi Li, Xumeng Zhang, Zhongrui Wang, Han Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu | 2024-06-12 | Download | Digital twins, the cornerstone of Industry 4.0, replicate real-world entities through computer models, revolutionising fields such as manufacturing management and industrial automation. |
| It's all about PR -- Smart Benchmarking AI Accelerators using Performance Representatives | Alexander Louis-Ferdinand Jung, Jannik Steinmetz, Jonathan Gietz, Konstantin Lübeck, Oliver Bringmann | 2024-06-12 | Download | Statistical models are widely used to estimate the performance of commercial off-the-shelf (COTS) AI hardware accelerators. However, training of statistical performance models often requires vast amou... |
| Hypervisor Extension for a RISC-V Processor | Jaume Gauchola, JuanJosé Costa, Enric Morancho, Ramon Canal, Xavier Carril, Max Doblas, Beatriz Otero, Alex Pajuelo, Eva Rodríguez, Javier Salamero, Javier Verdú | 2024-06-12 | Download | This paper describes our experience implementing a Hypervisor extension for a 64-bit RISC-V processor. We describe the design process and the main required parts with a brief explanation of each one. |
| ONNXim: A Fast, Cycle-level Multi-core NPU Simulator | Hyungkyu Ham, Wonhyuk Yang, Yunseon Shin, Okkyun Woo, Guseul Heo, Sangyeop Lee, Jongse Park, Gwangsun Kim | 2024-06-12 | Download | As DNNs are widely adopted in various application domains while demanding increasingly higher compute and memory requirements, designing efficient and performant NPUs (Neural Processing Units) is beco... |
| Hardware Implementation of Soft Mapper/Demappers in Iterative EP-based Receivers | Ian Fischer Schilling, Serdar Sahin, Camille Leroux, Antonio Maria Cipriano, Christophe Jego | 2024-06-12 | Download | This paper presents a comprehensive study and implementations onto FPGA device of an Expectation Propagation (EP)-based receiver for QPSK, 8-PSK, and 16-QAM. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| PETSc/TAO Developments for GPU-Based Early Exascale Systems | Richard Tran Mills, Mark Adams, Satish Balay, Jed Brown, Jacob Faibussowitsch, Toby Isaac, Matthew Knepley, Todd Munson, Hansol Suh, Stefano Zampini, Hong Zhang, Junchao Zhang | 2024-06-12 | Download | The Portable Extensible Toolkit for Scientific Computation (PETSc) library provides scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization vi... |
| Nonconvex Federated Learning on Compact Smooth Submanifolds With Heterogeneous Data | Jiaojiao Zhang, Jiang Hu, Anthony Man-Cho So, Mikael Johansson | 2024-06-12 | Download | Many machine learning tasks, such as principal component analysis and low-rank matrix completion, give rise to manifold optimization problems. |
| ProTrain: Efficient LLM Training via Memory-Aware Techniques | Hanmei Yang, Jin Zhou, Yao Fu, Xiaoqun Wang, Ramine Roane, Hui Guan, Tongping Liu | 2024-06-12 | Download | It is extremely memory-hungry to train Large Language Models (LLM). To solve this problem, existing work exploits the combination of CPU and GPU for the training process, such as ZeRO-Offload. |
| A deep cut into Split Federated Self-supervised Learning | Marcin Przewięźlikowski, Marcin Osial, Bartosz Zieliński, Marek Śmieja | 2024-06-12 | Download | Collaborative self-supervised learning has recently become feasible in highly distributed environments by dividing the network layers between client devices and a central server. |
| Vitamin-V: Expanding Open-Source RISC-V Cloud Environments | Ramon Canal, Stefano Di Carlo, Dimitris Gizopoulos, Alberto Scionti, Francesco Lubrano, Josep-Lluís Berral, Aaron Call, Diego Marron, Konstantinos Nikas, Dionisios Pnevmatikatos, Daniel Raho, Alvise Rigo, Yannis Papaefstathiou, José María Arnau, Angelos Arelakis | 2024-06-12 | Download | Among the key contributions of Vitamin-V (2023-2025 Horizon Europe project), we develop a complete RISC-V open-source software stack for cloud services with comparable performance to the cloud-dominan... |
| Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey | Feng Liang, Zhen Zhang, Haifeng Lu, Chengming Li, Victor C. M. Leung, Yanyi Guo, Xiping Hu | 2024-06-12 | Download | With rapidly increasing distributed deep learning workloads in large-scale data centers, efficient distributed deep learning framework strategies for resource allocation and workload scheduling have b... |
| IMFL-AIGC: Incentive Mechanism Design for Federated Learning Empowered by Artificial Intelligence Generated Content | Guangjing Huang, Qiong Wu, Jingyi Li, Xu Chen | 2024-06-12 | Download | Federated learning (FL) has emerged as a promising paradigm that enables clients to collaboratively train a shared global model without uploading their local data. |
| Federated Incomplete Multi-View Clustering with Heterogeneous Graph Neural Networks | Xueming Yan, Ziqi Wang, Yaochu Jin | 2024-06-12 | Download | Federated multi-view clustering offers the potential to develop a global clustering model using data distributed across multiple devices. However, current methods face challenges due to the absence of... |
| Emergent Peer-to-Peer Multi-Hub Topology | Mohamed Amine Legheraba, Maria Potop-Butucaru, Sébastien Tixeuil, Serge Fdida | 2024-06-12 | Download | In this paper we propose and evaluate an innovative algorithm that enables the creation of Peer-to-Peer network overlays characterized by emergent multi-hubs. |
| Computational Performance and Energy Efficiency of ARM based HPC servers | Oskar Schirmer | 2024-06-12 | Download | HPC world is dominated by x86 ISA CPUs. This monoculture is not necessarily justified by best performance evaluation, but may inherit from e.g. SW related restrictions on the choice of HW platforms. |
| FDLoRA: Personalized Federated Learning of Large Language Model via Dual LoRA Tuning | Jiaxing QI, Zhongzhi Luan, Shaohan Huang, Carol Fung, Hailong Yang, Depei Qian | 2024-06-12 | Download | Large language models (LLMs) have emerged as important components across various fields, yet their training requires substantial computation resources and abundant labeled data. |
| Class-Wise Federated Averaging for Efficient Personalization | Gyuejeong Lee, Daeyoung Choi | 2024-06-12 | Download | Federated learning (FL) enables collaborative model training across distributed clients without centralizing data. However, existing approaches such as Federated Averaging (FedAvg) often perform poorl... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Enhancing Path Selections with Interference Graphs in Multihop Relay Wireless Networks | Cao Vien Phung, Andre Drummond, Admela Jukan | 2024-06-12 | Download | The multihop relay wireless networks have gained traction due to the emergence of Reconfigurable Intelligent Surfaces (RISs) which can be used as relays in high frequency range wireless network, inclu... |
| Machine Learning-Driven Open-Source Framework for Assessing QoE in Multimedia Networks | Parsa Hassani Shariat Panahi, Amir Hossein Jalilvand, Abolfazl Diyanat | 2024-06-12 | Download | The Internet is integral to modern life, influencing communication, business, and lifestyles globally. As dependence on Internet services grows, the demand for high-quality service delivery increases. |
| MSADM: Large Language Model (LLM) Assisted End-to-End Network Health Management Based on Multi-Scale Semanticization | Fengxiao Tang, Xiaonan Wang, Xun Yuan, Linfeng Luo, Ming Zhao, Tianchi Huang, Nei Kato | 2024-06-12 | Download | Network device and system health management is the foundation of modern network operations and maintenance. Traditional health management methods, relying on expert identification or simple rule-based... |
| Systematic Literature Review of AI-enabled Spectrum Management in 6G and Future Networks | Bushra Sabir, Shuiqiao Yang, David Nguyen, Nan Wu, Alsharif Abuadbba, Hajime Suzuki, Shangqi Lai, Wei Ni, Ding Ming, Surya Nepal | 2024-06-12 | Download | Artificial Intelligence (AI) has advanced significantly in various domains like healthcare, finance, and cybersecurity, with successes such as DeepMind's medical imaging and Tesla's autonomous vehicle... |
| Efficient Network Traffic Feature Sets for IoT Intrusion Detection | Miguel Silva, João Vitorino, Eva Maia, Isabel Praça | 2024-06-12 | Download | The use of Machine Learning (ML) models in cybersecurity solutions requires high-quality data that is stripped of redundant, missing, and noisy information. |
| Semantic-Aware Resource Allocation Based on Deep Reinforcement Learning for 5G-V2X HetNets | Zhiyu Shao, Qiong Wu, Pingyi Fan, Nan Cheng, Qiang Fan, Jiangzhou Wang | 2024-06-12 | Download | This letter proposes a semantic-aware resource allocation (SARA) framework with flexible duty cycle (DC) coexistence mechanism (SARADC) for 5G-V2X Heterogeneous Network (HetNets) based on deep reinfor... |
| Demonstration of Safe Electromagnetic Radiation Emitted by 5G Active Antenna Systems | Sumit Kumar, Chandan Kumar Sheemar, Abdelrahman Astro, Jorge Querol, Symeon Chatzinotas | 2024-06-12 | Download | The careful planning and safe deployment of 5G technologies will bring enormous benefits to society and the economy. Higher frequency, beamforming, and small-cells are key technologies that will provi... |
| Content Provider Contributions to Capacity Expansion of a Neutral ISP: Effect of Private Option | Pranay Agarwal, D. Manjunath | 2024-06-12 | Download | Increasing content consumption by users and the expectation of a better Internet experience requires Internet service providers (ISPs) to expand the capacity of the access network continually. |
| Toward Enhanced Reinforcement Learning-Based Resource Management via Digital Twin: Opportunities, Applications, and Challenges | Nan Cheng, Xiucheng Wang, Zan Li, Zhisheng Yin, Tom Luan, Xuemin Shen | 2024-06-12 | Download | This article presents a digital twin (DT)-enhanced reinforcement learning (RL) framework aimed at optimizing performance and reliability in network resource management, since the traditional RL method... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ProTrain: Efficient LLM Training via Memory-Aware Techniques | Hanmei Yang, Jin Zhou, Yao Fu, Xiaoqun Wang, Ramine Roane, Hui Guan, Tongping Liu | 2024-06-12 | Download | It is extremely memory-hungry to train Large Language Models (LLM). To solve this problem, existing work exploits the combination of CPU and GPU for the training process, such as ZeRO-Offload. |
| It's all about PR -- Smart Benchmarking AI Accelerators using Performance Representatives | Alexander Louis-Ferdinand Jung, Jannik Steinmetz, Jonathan Gietz, Konstantin Lübeck, Oliver Bringmann | 2024-06-12 | Download | Statistical models are widely used to estimate the performance of commercial off-the-shelf (COTS) AI hardware accelerators. However, training of statistical performance models often requires vast amou... |
| ONNXim: A Fast, Cycle-level Multi-core NPU Simulator | Hyungkyu Ham, Wonhyuk Yang, Yunseon Shin, Okkyun Woo, Guseul Heo, Sangyeop Lee, Jongse Park, Gwangsun Kim | 2024-06-12 | Download | As DNNs are widely adopted in various application domains while demanding increasingly higher compute and memory requirements, designing efficient and performant NPUs (Neural Processing Units) is beco... |