2024-01-18

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration	Jinming Zhuang, Zhuoping Yang, Shixin Ji, Heng Huang, Alex K. Jones, Jingtong Hu, Yiyu Shi, Peipei Zhou	2024-01-18	Download	With the increase in the computation intensity of the chip, the mismatch between computation layer shapes and the available computation resource significantly limits the utilization of the chip.
Using LLM such as ChatGPT for Designing and Implementing a RISC Processor: Execution, Challenges and Limitations	Shadeeb Hossain, Aayush Gohil, Yizhou Wang	2024-01-18	Download	This paper discusses the feasibility of using Large Language Models (LLMs) for code generation, with a particular application in designing a RISC.
Analyzing and Improving Hardware Modeling of Accel-Sim	Rodrigo Huerta, Mojtaba Abaie Shoushtary, Antonio González	2024-01-18	Download	GPU architectures have become popular for executing general-purpose programs. Their many-core architecture supports a large number of threads that run concurrently to hide the latency among dependent ...
BlockAMC: Scalable In-Memory Analog Matrix Computing for Solving Linear Systems	Lunshuai Pan, Pushen Zuo, Yubiao Luo, Zhong Sun, Ru Huang	2024-01-18	Download	Recently, in-memory analog matrix computing (AMC) with nonvolatile resistive memory has been developed for solving matrix problems in one step, e.g., matrix inversion of solving linear systems.
A Survey on Hardware Accelerators for Large Language Models	Christoforos Kachris	2024-01-18	Download	Large Language Models (LLMs) have emerged as powerful tools for natural language processing tasks, revolutionizing the field with their ability to understand and generate human-like text.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Foundation Models in Federated Learning: Assessing Backdoor Vulnerabilities	Xi Li, Chen Wu, Jiaqi Wang	2024-01-18	Download	Federated Learning (FL), a privacy-preserving machine learning framework, faces significant data-related challenges. For example, the lack of suitable public datasets leads to ineffective information ...
TALICS$^3$: Tape Library Cloud Storage System Simulator	Suayb S. Arslan, James Peng, Turguy Goker	2024-01-18	Download	High performance computing data is surging fast into the exabyte-scale world, where tape libraries are the main platform for long-term durable data storage besides high-cost DNA.
Autobahn: Seamless high speed BFT	Neil Giridharan, Florian Suri-Payer, Ittai Abraham, Lorenzo Alvisi, Natacha Crooks	2024-01-18	Download	Today's practical, high performance Byzantine Fault Tolerant (BFT) consensus protocols operate in the partial synchrony model. However, existing protocols are inefficient when deployments are indeed p...
gFaaS: Enabling Generic Functions in Serverless Computing	Mohak Chadha, Paul Wieland, Michael Gerndt	2024-01-18	Download	With the advent of AWS Lambda in 2014, Serverless Computing, particularly Function-as-a-Service (FaaS), has witnessed growing popularity across various application domains.
Hierarchical Federated Learning in Multi-hop Cluster-Based VANETs	M. Saeid HaghighiFard, Sinem Coleri	2024-01-18	Download	The usage of federated learning (FL) in Vehicular Ad hoc Networks (VANET) has garnered significant interest in research due to the advantages of reducing transmission overhead and protecting user priv...
Towards providing reliable job completion time predictions using PCS	Abdullah Bin Faisal, Noah Martin, Hafiz Mohsin Bashir, Swaminathan Lamelas, Fahad R. Dogar	2024-01-18	Download	In this paper we build a case for providing job completion time predictions to cloud users, similar to the delivery date of a package or arrival time of a booked ride.
Fast Kronecker Matrix-Matrix Multiplication on GPUs	Abhinav Jangda, Mohit Yadav	2024-01-18	Download	Kronecker Matrix-Matrix Multiplication (Kron-Matmul) is the multiplication of a matrix with the Kronecker Product of several smaller matrices.
IoT-Based Wireless Networking for Seismic Applications	Hadi Jamali-Rad, Xander Campman	2024-01-18	Download	We propose to employ a recently developed IoT-based wireless technology, so-called low-power wide-area networks (LPWANs), to exploit their long range, low power, and inherent compatibility to cloud st...
Techniques for Authenticating Quantile Digests	Alessandro Scala	2024-01-18	Download	We investigate two possible techniques to authenticate the q-digest data structure, along with a worst-case study of the computational complexity both in time and space of the proposed solutions, and ...
PyRQA -- Conducting Recurrence Quantification Analysis on Very Long Time Series Efficiently	Tobias Rawald, Mike Sips, Norbert Marwan	2024-01-18	Download	PyRQA is a software package that efficiently conducts recurrence quantification analysis (RQA) on time series consisting of more than one million data points.
GPU Acceleration of a Conjugate Exponential Model for Cancer Tissue Heterogeneity	Anik Chaudhuri, Anwoy Mohanty, Manoranjan Satpathy	2024-01-18	Download	Heterogeneity in the cell population of cancer tissues poses many challenges in cancer diagnosis and treatment. Studying the heterogeneity in cell populations from gene expression measurement data in ...
BlockAMC: Scalable In-Memory Analog Matrix Computing for Solving Linear Systems	Lunshuai Pan, Pushen Zuo, Yubiao Luo, Zhong Sun, Ru Huang	2024-01-18	Download	Recently, in-memory analog matrix computing (AMC) with nonvolatile resistive memory has been developed for solving matrix problems in one step, e.g., matrix inversion of solving linear systems.
Deep Back-Filling: a Split Window Technique for Deep Online Cluster Job Scheduling	Lingfei Wang, Aaron Harwood, Maria A. Rodriguez	2024-01-18	Download	Job scheduling is a critical component of workload management systems that can significantly influence system performance, e.g., in HPC clusters.
ASA -- The Adaptive Scheduling Algorithm	Abel Souza, Kristiaan Pelckmans, Devarshi Ghoshal, Lavanya Ramakrishnan, Johan Tordsson	2024-01-18	Download	In High Performance Computing (HPC) infrastructures, the control of resources by batch systems can lead to prolonged queue waiting times and adverse effects on the overall execution times of applicati...
A HPC Co-Scheduler with Reinforcement Learning	Abel Souza, Kristiaan Pelckmans, Johan Tordsson	2024-01-18	Download	Although High Performance Computing (HPC) users understand basic resource requirements such as the number of CPUs and memory limits, internal infrastructural utilization data is exclusively leveraged ...
DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving	Yinmin Zhong, Shengyu Liu, Junda Chen, Jianbo Hu, Yibo Zhu, Xuanzhe Liu, Xin Jin, Hao Zhang	2024-01-18	Download	DistServe improves the performance of large language models (LLMs) serving by disaggregating the prefill and decoding computation. Existing LLM serving systems colocate the two phases and batch the co...
Mobility Accelerates Learning: Convergence Analysis on Hierarchical Federated Learning in Vehicular Networks	Tan Chen, Jintao Yan, Yuxuan Sun, Sheng Zhou, Deniz Gündüz, Zhisheng Niu	2024-01-18	Download	Hierarchical federated learning (HFL) enables distributed training of models across multiple devices with the help of several edge servers and a cloud edge server in a privacy-preserving manner.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
On the Interplay Between Network Metrics and Performance of Mobile Edge Offloading	Parisa Fard Moshiri, Murat Simsek, Burak Kantarci	2024-01-18	Download	Multi-Access Edge Computing (MEC) emerged as a viable computing allocation method that facilitates offloading tasks to edge servers for efficient processing.
Bypassing a Reactive Jammer via NOMA-Based Transmissions in Critical Missions	Mohammadreza Amini, Ghazal Asemian, Michel Kulhandjian, Burak Kantarci, Claude D'Amours, Melike Erol-Kantarci	2024-01-18	Download	Wireless networks can be vulnerable to radio jamming attacks. The quality of service under a jamming attack is not guaranteed and the service requirements such as reliability, latency, and effective r...
Node Placement and Path Planning for Improved Area Coverage in Mixed Wireless Sensor Networks	Survi Kumari, Seshan Srirangarajan	2024-01-18	Download	For the large-scale monitoring of a physical phenomenon using a wireless sensor network (WSN), a large number of static and/or mobile sensor nodes are required, resulting in higher deployment cost.
HRL-TSCH: A Hierarchical Reinforcement Learning-based TSCH Scheduler for IIoT	F. Fernando Jurado-Lasso, Charalampos Orfanidis, J. F. Jurado, Xenofon Fafoutis	2024-01-18	Download	The Industrial Internet of Things (IIoT) demands adaptable Networked Embedded Systems (NES) for optimal performance. Combined with recent advances in Artificial Intelligence (AI), tailored solutions c...
Hierarchical Federated Learning in Multi-hop Cluster-Based VANETs	M. Saeid HaghighiFard, Sinem Coleri	2024-01-18	Download	The usage of federated learning (FL) in Vehicular Ad hoc Networks (VANET) has garnered significant interest in research due to the advantages of reducing transmission overhead and protecting user priv...
Tailoring Semantic Communication at Network Edge: A Novel Approach Using Dynamic Knowledge Distillation	Abdullatif Albaseer, Mohamed Abdallah	2024-01-18	Download	Semantic Communication (SemCom) systems, empowered by deep learning (DL), represent a paradigm shift in data transmission. These systems prioritize the significance of content over sheer data volume.
Model-Assisted Learning for Adaptive Cooperative Perception of Connected Autonomous Vehicles	Kaige Qu, Weihua Zhuang, Qiang Ye, Wen Wu, Xuemin Shen	2024-01-18	Download	Cooperative perception (CP) is a key technology to facilitate consistent and accurate situational awareness for connected and autonomous vehicles (CAVs).
EDAF: An End-to-End Delay Analytics Framework for 5G-and-Beyond Networks	Samie Mostafavi, Marius Tillner, Gourav Prateek Sharma, James Gross	2024-01-18	Download	Supporting applications in emerging domains like cyber-physical systems and human-in-the-loop scenarios typically requires adherence to strict end-to-end delay guarantees.
Learning Non-myopic Power Allocation in Constrained Scenarios	Arindam Chowdhury, Santiago Paternain, Gunjan Verma, Ananthram Swami, Santiago Segarra	2024-01-18	Download	We propose a learning-based framework for efficient power allocation in ad hoc interference networks under episodic constraints. The problem of optimal power allocation -- for maximizing a given netwo...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Accurate and Scalable Many-Node Simulation	Stijn Eyerman, Wim Heirman, Kristof Du Bois, Ibrahim Hur	2024-01-18	Download	Accurate performance estimation of future many-node machines is challenging because it requires detailed simulation models of both node and network.