2024-06-12

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference	Christopher Wolters, Xiaoxuan Yang, Ulf Schlichtmann, Toyotaro Suzumura	2024-06-12	Download	Large language models (LLMs) have recently transformed natural language processing, enabling machines to generate human-like text and engage in meaningful conversations.
Continuous-Time Digital Twin with Analogue Memristive Neural Ordinary Differential Equation Solver	Hegan Chen, Jichang Yang, Jia Chen, Songqi Wang, Shaocong Wang, Dingchen Wang, Xinyu Tian, Yifei Yu, Xi Chen, Yinan Lin, Yangu He, Xiaoshan Wu, Yi Li, Xinyuan Zhang, Ning Lin, Meng Xu, Yi Li, Xumeng Zhang, Zhongrui Wang, Han Wang, Dashan Shang, Qi Liu, Kwang-Ting Cheng, Ming Liu	2024-06-12	Download	Digital twins, the cornerstone of Industry 4.0, replicate real-world entities through computer models, revolutionising fields such as manufacturing management and industrial automation.
It's all about PR -- Smart Benchmarking AI Accelerators using Performance Representatives	Alexander Louis-Ferdinand Jung, Jannik Steinmetz, Jonathan Gietz, Konstantin Lübeck, Oliver Bringmann	2024-06-12	Download	Statistical models are widely used to estimate the performance of commercial off-the-shelf (COTS) AI hardware accelerators. However, training of statistical performance models often requires vast amou...
Hypervisor Extension for a RISC-V Processor	Jaume Gauchola, JuanJosé Costa, Enric Morancho, Ramon Canal, Xavier Carril, Max Doblas, Beatriz Otero, Alex Pajuelo, Eva Rodríguez, Javier Salamero, Javier Verdú	2024-06-12	Download	This paper describes our experience implementing a Hypervisor extension for a 64-bit RISC-V processor. We describe the design process and the main required parts with a brief explanation of each one.
ONNXim: A Fast, Cycle-level Multi-core NPU Simulator	Hyungkyu Ham, Wonhyuk Yang, Yunseon Shin, Okkyun Woo, Guseul Heo, Sangyeop Lee, Jongse Park, Gwangsun Kim	2024-06-12	Download	As DNNs are widely adopted in various application domains while demanding increasingly higher compute and memory requirements, designing efficient and performant NPUs (Neural Processing Units) is beco...
Hardware Implementation of Soft Mapper/Demappers in Iterative EP-based Receivers	Ian Fischer Schilling, Serdar Sahin, Camille Leroux, Antonio Maria Cipriano, Christophe Jego	2024-06-12	Download	This paper presents a comprehensive study and implementations onto FPGA device of an Expectation Propagation (EP)-based receiver for QPSK, 8-PSK, and 16-QAM.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
PETSc/TAO Developments for GPU-Based Early Exascale Systems	Richard Tran Mills, Mark Adams, Satish Balay, Jed Brown, Jacob Faibussowitsch, Toby Isaac, Matthew Knepley, Todd Munson, Hansol Suh, Stefano Zampini, Hong Zhang, Junchao Zhang	2024-06-12	Download	The Portable Extensible Toolkit for Scientific Computation (PETSc) library provides scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization vi...
Nonconvex Federated Learning on Compact Smooth Submanifolds With Heterogeneous Data	Jiaojiao Zhang, Jiang Hu, Anthony Man-Cho So, Mikael Johansson	2024-06-12	Download	Many machine learning tasks, such as principal component analysis and low-rank matrix completion, give rise to manifold optimization problems.
ProTrain: Efficient LLM Training via Memory-Aware Techniques	Hanmei Yang, Jin Zhou, Yao Fu, Xiaoqun Wang, Ramine Roane, Hui Guan, Tongping Liu	2024-06-12	Download	It is extremely memory-hungry to train Large Language Models (LLM). To solve this problem, existing work exploits the combination of CPU and GPU for the training process, such as ZeRO-Offload.
A deep cut into Split Federated Self-supervised Learning	Marcin Przewięźlikowski, Marcin Osial, Bartosz Zieliński, Marek Śmieja	2024-06-12	Download	Collaborative self-supervised learning has recently become feasible in highly distributed environments by dividing the network layers between client devices and a central server.
Vitamin-V: Expanding Open-Source RISC-V Cloud Environments	Ramon Canal, Stefano Di Carlo, Dimitris Gizopoulos, Alberto Scionti, Francesco Lubrano, Josep-Lluís Berral, Aaron Call, Diego Marron, Konstantinos Nikas, Dionisios Pnevmatikatos, Daniel Raho, Alvise Rigo, Yannis Papaefstathiou, José María Arnau, Angelos Arelakis	2024-06-12	Download	Among the key contributions of Vitamin-V (2023-2025 Horizon Europe project), we develop a complete RISC-V open-source software stack for cloud services with comparable performance to the cloud-dominan...
Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey	Feng Liang, Zhen Zhang, Haifeng Lu, Chengming Li, Victor C. M. Leung, Yanyi Guo, Xiping Hu	2024-06-12	Download	With rapidly increasing distributed deep learning workloads in large-scale data centers, efficient distributed deep learning framework strategies for resource allocation and workload scheduling have b...
IMFL-AIGC: Incentive Mechanism Design for Federated Learning Empowered by Artificial Intelligence Generated Content	Guangjing Huang, Qiong Wu, Jingyi Li, Xu Chen	2024-06-12	Download	Federated learning (FL) has emerged as a promising paradigm that enables clients to collaboratively train a shared global model without uploading their local data.
Federated Incomplete Multi-View Clustering with Heterogeneous Graph Neural Networks	Xueming Yan, Ziqi Wang, Yaochu Jin	2024-06-12	Download	Federated multi-view clustering offers the potential to develop a global clustering model using data distributed across multiple devices. However, current methods face challenges due to the absence of...
Emergent Peer-to-Peer Multi-Hub Topology	Mohamed Amine Legheraba, Maria Potop-Butucaru, Sébastien Tixeuil, Serge Fdida	2024-06-12	Download	In this paper we propose and evaluate an innovative algorithm that enables the creation of Peer-to-Peer network overlays characterized by emergent multi-hubs.
Computational Performance and Energy Efficiency of ARM based HPC servers	Oskar Schirmer	2024-06-12	Download	HPC world is dominated by x86 ISA CPUs. This monoculture is not necessarily justified by best performance evaluation, but may inherit from e.g. SW related restrictions on the choice of HW platforms.
FDLoRA: Personalized Federated Learning of Large Language Model via Dual LoRA Tuning	Jiaxing QI, Zhongzhi Luan, Shaohan Huang, Carol Fung, Hailong Yang, Depei Qian	2024-06-12	Download	Large language models (LLMs) have emerged as important components across various fields, yet their training requires substantial computation resources and abundant labeled data.
Class-Wise Federated Averaging for Efficient Personalization	Gyuejeong Lee, Daeyoung Choi	2024-06-12	Download	Federated learning (FL) enables collaborative model training across distributed clients without centralizing data. However, existing approaches such as Federated Averaging (FedAvg) often perform poorl...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Enhancing Path Selections with Interference Graphs in Multihop Relay Wireless Networks	Cao Vien Phung, Andre Drummond, Admela Jukan	2024-06-12	Download	The multihop relay wireless networks have gained traction due to the emergence of Reconfigurable Intelligent Surfaces (RISs) which can be used as relays in high frequency range wireless network, inclu...
Machine Learning-Driven Open-Source Framework for Assessing QoE in Multimedia Networks	Parsa Hassani Shariat Panahi, Amir Hossein Jalilvand, Abolfazl Diyanat	2024-06-12	Download	The Internet is integral to modern life, influencing communication, business, and lifestyles globally. As dependence on Internet services grows, the demand for high-quality service delivery increases.
MSADM: Large Language Model (LLM) Assisted End-to-End Network Health Management Based on Multi-Scale Semanticization	Fengxiao Tang, Xiaonan Wang, Xun Yuan, Linfeng Luo, Ming Zhao, Tianchi Huang, Nei Kato	2024-06-12	Download	Network device and system health management is the foundation of modern network operations and maintenance. Traditional health management methods, relying on expert identification or simple rule-based...
Systematic Literature Review of AI-enabled Spectrum Management in 6G and Future Networks	Bushra Sabir, Shuiqiao Yang, David Nguyen, Nan Wu, Alsharif Abuadbba, Hajime Suzuki, Shangqi Lai, Wei Ni, Ding Ming, Surya Nepal	2024-06-12	Download	Artificial Intelligence (AI) has advanced significantly in various domains like healthcare, finance, and cybersecurity, with successes such as DeepMind's medical imaging and Tesla's autonomous vehicle...
Efficient Network Traffic Feature Sets for IoT Intrusion Detection	Miguel Silva, João Vitorino, Eva Maia, Isabel Praça	2024-06-12	Download	The use of Machine Learning (ML) models in cybersecurity solutions requires high-quality data that is stripped of redundant, missing, and noisy information.
Semantic-Aware Resource Allocation Based on Deep Reinforcement Learning for 5G-V2X HetNets	Zhiyu Shao, Qiong Wu, Pingyi Fan, Nan Cheng, Qiang Fan, Jiangzhou Wang	2024-06-12	Download	This letter proposes a semantic-aware resource allocation (SARA) framework with flexible duty cycle (DC) coexistence mechanism (SARADC) for 5G-V2X Heterogeneous Network (HetNets) based on deep reinfor...
Demonstration of Safe Electromagnetic Radiation Emitted by 5G Active Antenna Systems	Sumit Kumar, Chandan Kumar Sheemar, Abdelrahman Astro, Jorge Querol, Symeon Chatzinotas	2024-06-12	Download	The careful planning and safe deployment of 5G technologies will bring enormous benefits to society and the economy. Higher frequency, beamforming, and small-cells are key technologies that will provi...
Content Provider Contributions to Capacity Expansion of a Neutral ISP: Effect of Private Option	Pranay Agarwal, D. Manjunath	2024-06-12	Download	Increasing content consumption by users and the expectation of a better Internet experience requires Internet service providers (ISPs) to expand the capacity of the access network continually.
Toward Enhanced Reinforcement Learning-Based Resource Management via Digital Twin: Opportunities, Applications, and Challenges	Nan Cheng, Xiucheng Wang, Zan Li, Zhisheng Yin, Tom Luan, Xuemin Shen	2024-06-12	Download	This article presents a digital twin (DT)-enhanced reinforcement learning (RL) framework aimed at optimizing performance and reliability in network resource management, since the traditional RL method...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
ProTrain: Efficient LLM Training via Memory-Aware Techniques	Hanmei Yang, Jin Zhou, Yao Fu, Xiaoqun Wang, Ramine Roane, Hui Guan, Tongping Liu	2024-06-12	Download	It is extremely memory-hungry to train Large Language Models (LLM). To solve this problem, existing work exploits the combination of CPU and GPU for the training process, such as ZeRO-Offload.
It's all about PR -- Smart Benchmarking AI Accelerators using Performance Representatives	Alexander Louis-Ferdinand Jung, Jannik Steinmetz, Jonathan Gietz, Konstantin Lübeck, Oliver Bringmann	2024-06-12	Download	Statistical models are widely used to estimate the performance of commercial off-the-shelf (COTS) AI hardware accelerators. However, training of statistical performance models often requires vast amou...
ONNXim: A Fast, Cycle-level Multi-core NPU Simulator	Hyungkyu Ham, Wonhyuk Yang, Yunseon Shin, Okkyun Woo, Guseul Heo, Sangyeop Lee, Jongse Park, Gwangsun Kim	2024-06-12	Download	As DNNs are widely adopted in various application domains while demanding increasingly higher compute and memory requirements, designing efficient and performant NPUs (Neural Processing Units) is beco...