2024-06-03

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models	Abhimanyu Bambhaniya, Ritik Raj, Geonhwa Jeong, Souvik Kundu, Sudarshan Srinivasan, Suvinay Subramanian, Midhilesh Elavazhagan, Madhu Kumar, Tushar Krishna	2024-06-03	Download	Large language models (LLMs) have shown remarkable performance across a wide range of applications, often outperforming human experts. However, deploying these gigantic models efficiently for diverse ...
A 0.96pJ/SOP, 30.23K-neuron/mm^2 Heterogeneous Neuromorphic Chip With Fullerene-like Interconnection Topology for Edge-AI Computing	P. J. Zhou, Q. Yu, M. Chen, Y. C. Wang, L. W. Meng, Y. Zuo, N. Ning, Y. Liu, S. G. Hu, G. C. Qiao	2024-06-03	Download	Edge-AI computing requires high energy efficiency, low power consumption, and relatively high flexibility and compact area, challenging the AI-chip design. This work presents a 0.
ADE-HGNN: Accelerating HGNNs through Attention Disparity Exploitation	Dengke Han, Meng Wu, Runzhen Xue, Mingyu Yan, Xiaochun Ye, Dongrui Fan	2024-06-03	Download	Heterogeneous Graph Neural Networks (HGNNs) have recently demonstrated great power in handling heterogeneous graph data, rendering them widely applied in many critical real-world domains.
VERTECS: A COTS-based payload interface board to enable next generation astronomical imaging payloads	Ezra Fielding, Victor H. Schulz, Keenan A. A. Chatar, Kei Sano, Akitoshi Hanazawa	2024-06-03	Download	Due to advances in observation and imaging technologies, modern astronomical satellites generate large volumes of data. This necessitates efficient onboard data processing and high-speed data downlink...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Efficient Data Distribution Estimation for Accelerated Federated Learning	Yuanli Wang, Lei Huang	2024-06-03	Download	Federated Learning (FL) is a privacy-preserving machine learning paradigm where a global model is trained in-situ across a large number of distributed edge devices.
Optimizing the Optimal Weighted Average: Efficient Distributed Sparse Classification	Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt	2024-06-03	Download	While distributed training is often viewed as a solution to optimizing linear models on increasingly large datasets, inter-machine communication costs of popular distributed approaches can dominate as...
A Surprisingly Simple Method for Distributed Euclidean-Minimum Spanning Tree / Single Linkage Dendrogram Construction from High Dimensional Embeddings via Distance Decomposition	Richard Lettich	2024-06-03	Download	We introduce a decomposition method for the distributed calculation of exact Euclidean Minimum Spanning Trees in high dimensions (where sub-quadratic algorithms are not effective), or more generalized...
Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models	Abhimanyu Bambhaniya, Ritik Raj, Geonhwa Jeong, Souvik Kundu, Sudarshan Srinivasan, Suvinay Subramanian, Midhilesh Elavazhagan, Madhu Kumar, Tushar Krishna	2024-06-03	Download	Large language models (LLMs) have shown remarkable performance across a wide range of applications, often outperforming human experts. However, deploying these gigantic models efficiently for diverse ...
Helix: Serving Large Language Models over Heterogeneous GPUs and Network via Max-Flow	Yixuan Mei, Yonghao Zhuang, Xupeng Miao, Juncheng Yang, Zhihao Jia, Rashmi Vinayak	2024-06-03	Download	This paper introduces Helix, a distributed system for high-throughput, low-latency large language model (LLM) serving in heterogeneous GPU clusters.
Asynchronous Multi-Server Federated Learning for Geo-Distributed Clients	Yuncong Zuo, Bart Cox, Lydia Y. Chen, Jérémie Decouchant	2024-06-03	Download	Federated learning (FL) systems enable multiple clients to train a machine learning model iteratively through synchronously exchanging the intermediate model weights with a single server.
Asynchronous Byzantine Federated Learning	Bart Cox, Abele Mălan, Lydia Y. Chen, Jérémie Decouchant	2024-06-03	Download	Federated learning (FL) enables a set of geographically distributed clients to collectively train a model through a server. Classically, the training process is synchronous, but can be made asynchrono...
Performance comparison of Dask and Apache Spark on HPC systems for Neuroimaging	Mathieu Dugré, Valérie Hayot-Sasson, Tristan Glatard	2024-06-03	Download	The general increase in data size and data sharing motivates the adoption of Big Data strategies in several scientific disciplines. However, while several options are available, no particular guidelin...
sAirflow: Adopting Serverless in a Legacy Workflow Scheduler	Filip Mikina, Pawel Zuk, Krzysztof Rzadca	2024-06-03	Download	Serverless clouds promise efficient scaling, reduced toil and monetary costs. Yet, serverless-ing a complex, legacy application might require major refactoring and thus is risky.
A GPU-ready pseudo-spectral method for direct numerical simulations of multiphase turbulence	Alessio Roccon	2024-06-03	Download	In this work, we detail the GPU-porting of an in-house pseudo-spectral solver tailored towards large-scale simulations of interface-resolved simulation of drop- and bubble-laden turbulent flows.
Structures and Techniques for Streaming Dynamic Graph Processing on Decentralized Message-Driven Systems	Bibrak Qamar Chandio, Maciej Brodowicz, Thomas Sterling	2024-06-03	Download	The paper presents structures and techniques aimed towards co-designing scalable asynchronous and decentralized dynamic graph processing for fine-grain memory-driven architectures.
Formal Definition and Implementation of Reproducibility Tenets for Computational Workflows	Nicholas J. Pritchard, Andreas Wicenec	2024-06-03	Download	Computational workflow management systems power contemporary data-intensive sciences. The slowly resolving reproducibility crisis presents both a sobering warning and an opportunity to iterate on what...
No Vandalism: Privacy-Preserving and Byzantine-Robust Federated Learning	Zhibo Xing, Zijian Zhang, Zi'ang Zhang, Jiamou Liu, Liehuang Zhu, Giovanni Russello	2024-06-03	Download	Federated learning allows several clients to train one machine learning model jointly without sharing private data, providing privacy protection.
An Advanced Reinforcement Learning Framework for Online Scheduling of Deferrable Workloads in Cloud Computing	Hang Dong, Liwen Zhu, Zhao Shan, Bo Qiao, Fangkai Yang, Si Qin, Chuan Luo, Qingwei Lin, Yuwen Yang, Gurpreet Virdi, Saravan Rajmohan, Dongmei Zhang, Thomas Moscibroda	2024-06-03	Download	Efficient resource utilization and perfect user experience usually conflict with each other in cloud computing platforms. Great efforts have been invested in increasing resource utilization but trying...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Non-uniformity is All You Need: Efficient and Timely Encrypted Traffic Classification With ECHO	Shilo Daum, Tal Shapira, Anat Bremler-Barr, David Hay	2024-06-03	Download	With 95% of Internet traffic now encrypted, an effective approach to classifying this traffic is crucial for network security and management. This paper introduces ECHO -- a novel optimization process...
TSpec-LLM: An Open-source Dataset for LLM Understanding of 3GPP Specifications	Rasoul Nikbakht, Mohamed Benzaghta, Giovanni Geraci	2024-06-03	Download	Understanding telecom standards involves sorting through numerous technical documents, such as those produced by the 3rd Generation Partnership Project (3GPP), which is time-consuming and labor-intens...
Experimental comparison of 5G SDR platforms: srsRAN x OpenAirInterface	Ruan P. Alves, Joao Guilherme A. da S. Alves, Mikael R. Camelo, Wilker O. de Feitosa, Victor F. Monteiro, Fco. Rodrigo P. Cavalcanti	2024-06-03	Download	A Software-Defined Radio (SDR) platform is a communication system that implements as software functions that are typically implemented in dedicated hardware.
Conditional Gumbel-Softmax for constrained feature selection with application to node selection in wireless sensor networks	Thomas Strypsteen, Alexander Bertrand	2024-06-03	Download	In this paper, we introduce Conditional Gumbel-Softmax as a method to perform end-to-end learning of the optimal feature subset for a given task and deep neural network (DNN) model, while adhering to ...
Joint Constellation Shaping Using Gradient Descent Approach for MU-MIMO Broadcast Channel	Maxime Vaillant, Alix Jeannerot, Jean-Marie Gorce	2024-06-03	Download	We introduce a learning-based approach to optimize a joint constellation for a multi-user MIMO broadcast channel ($T$ Tx antennas, $K$ users, each with $R$ Rx antennas), with perfect channel knowledge...
Comparison of 5G Performance Post-Merger between Two Network Operators Using Field Tests in Urban Areas	Surachai Chatchalermpu, Therdpong Daengsi, Pakkasit Sriamorntrakul, Kritphon Phanrattanachai	2024-06-03	Download	In late Q1/2023, DTAC and TRUE officially completed their merger. Consequently, this study was initiated to ascertain whether their respective 5G networks had been seamlessly integrated several months...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Recover as It is Designed to Be: Recovering from Compatibility Mobile App Crashes by Reusing User Flows	Donghwi Kim, Hyungjun Yoon, Chang Min Park, Sujin Han, Youngjin Kwon, Steven Y. Ko, Sung-Ju Lee	2024-06-03	Download	Android OS is severely fragmented by API updates and device vendors' OS customization, creating a market condition where vastly different OS versions coexist.
SVFF: An Automated Framework for SR-IOV Virtual Function Management in FPGA Accelerated Virtualized Environments	Stefano Cirici, Michele Paolino, Daniel Raho	2024-06-03	Download	FPGA accelerator devices have emerged as a powerful platform for implementing high-performance and scalable solutions in a wide range of industries, leveraging their reconfigurability and virtualizati...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Impact of Generative AI (Large Language Models) on the PRA model construction and maintenance, observations	Valentin Rychkov, Claudia Picoco, Emilie Caleca	2024-06-03	Download	The rapid development of Large Language Models (LLMs) and Generative Pre-Trained Transformers (GPTs) in the field of Generative Artificial Intelligence (AI) can significantly impact task automation in ...