
2024-01-18

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration | Jinming Zhuang, Zhuoping Yang, Shixin Ji, Heng Huang, Alex K. Jones, Jingtong Hu, Yiyu Shi, Peipei Zhou | 2024-01-18 | Download | With the increase in the computation intensity of the chip, the mismatch between computation layer shapes and the available computation resource significantly limits the utilization of the chip. |
| Using LLM such as ChatGPT for Designing and Implementing a RISC Processor: Execution, Challenges and Limitations | Shadeeb Hossain, Aayush Gohil, Yizhou Wang | 2024-01-18 | Download | This paper discusses the feasibility of using Large Language Models (LLM) for code generation with a particular application in designing an RISC. |
| Analyzing and Improving Hardware Modeling of Accel-Sim | Rodrigo Huerta, Mojtaba Abaie Shoushtary, Antonio González | 2024-01-18 | Download | GPU architectures have become popular for executing general-purpose programs. Their many-core architecture supports a large number of threads that run concurrently to hide the latency among dependent ... |
| BlockAMC: Scalable In-Memory Analog Matrix Computing for Solving Linear Systems | Lunshuai Pan, Pushen Zuo, Yubiao Luo, Zhong Sun, Ru Huang | 2024-01-18 | Download | Recently, in-memory analog matrix computing (AMC) with nonvolatile resistive memory has been developed for solving matrix problems in one step, e.g., matrix inversion of solving linear systems. |
| A Survey on Hardware Accelerators for Large Language Models | Christoforos Kachris | 2024-01-18 | Download | Large Language Models (LLMs) have emerged as powerful tools for natural language processing tasks, revolutionizing the field with their ability to understand and generate human-like text. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Foundation Models in Federated Learning: Assessing Backdoor Vulnerabilities | Xi Li, Chen Wu, Jiaqi Wang | 2024-01-18 | Download | Federated Learning (FL), a privacy-preserving machine learning framework, faces significant data-related challenges. For example, the lack of suitable public datasets leads to ineffective information ... |
| TALICS^3: Tape Library Cloud Storage System Simulator | Suayb S. Arslan, James Peng, Turguy Goker | 2024-01-18 | Download | High performance computing data is surging fast into the exabyte-scale world, where tape libraries are the main platform for long-term durable data storage besides high-cost DNA. |
| Autobahn: Seamless high speed BFT | Neil Giridharan, Florian Suri-Payer, Ittai Abraham, Lorenzo Alvisi, Natacha Crooks | 2024-01-18 | Download | Today's practical, high performance Byzantine Fault Tolerant (BFT) consensus protocols operate in the partial synchrony model. However, existing protocols are inefficient when deployments are indeed p... |
| gFaaS: Enabling Generic Functions in Serverless Computing | Mohak Chadha, Paul Wieland, Michael Gerndt | 2024-01-18 | Download | With the advent of AWS Lambda in 2014, Serverless Computing, particularly Function-as-a-Service (FaaS), has witnessed growing popularity across various application domains. |
| Hierarchical Federated Learning in Multi-hop Cluster-Based VANETs | M. Saeid HaghighiFard, Sinem Coleri | 2024-01-18 | Download | The usage of federated learning (FL) in Vehicular Ad hoc Networks (VANET) has garnered significant interest in research due to the advantages of reducing transmission overhead and protecting user priv... |
| Towards providing reliable job completion time predictions using PCS | Abdullah Bin Faisal, Noah Martin, Hafiz Mohsin Bashir, Swaminathan Lamelas, Fahad R. Dogar | 2024-01-18 | Download | In this paper we build a case for providing job completion time predictions to cloud users, similar to the delivery date of a package or arrival time of a booked ride. |
| Fast Kronecker Matrix-Matrix Multiplication on GPUs | Abhinav Jangda, Mohit Yadav | 2024-01-18 | Download | Kronecker Matrix-Matrix Multiplication (Kron-Matmul) is the multiplication of a matrix with the Kronecker Product of several smaller matrices. |
| IoT-Based Wireless Networking for Seismic Applications | Hadi Jamali-Rad, Xander Campman | 2024-01-18 | Download | We propose to employ a recently developed IoT-based wireless technology, so called low-power wide-area networks (LPWANs), to exploit their long range, low power, and inherent compatibility to cloud st... |
| Techniques for Authenticating Quantile Digests | Alessandro Scala | 2024-01-18 | Download | We investigate two possible techniques to authenticate the q-digest data structure, along with a worst-case study of the computational complexity both in time and space of the proposed solutions, and ... |
| PyRQA -- Conducting Recurrence Quantification Analysis on Very Long Time Series Efficiently | Tobias Rawald, Mike Sips, Norbert Marwan | 2024-01-18 | Download | PyRQA is a software package that efficiently conducts recurrence quantification analysis (RQA) on time series consisting of more than one million data points. |
| GPU Acceleration of a Conjugate Exponential Model for Cancer Tissue Heterogeneity | Anik Chaudhuri, Anwoy Mohanty, Manoranjan Satpathy | 2024-01-18 | Download | Heterogeneity in the cell population of cancer tissues poses many challenges in cancer diagnosis and treatment. Studying the heterogeneity in cell populations from gene expression measurement data in ... |
| BlockAMC: Scalable In-Memory Analog Matrix Computing for Solving Linear Systems | Lunshuai Pan, Pushen Zuo, Yubiao Luo, Zhong Sun, Ru Huang | 2024-01-18 | Download | Recently, in-memory analog matrix computing (AMC) with nonvolatile resistive memory has been developed for solving matrix problems in one step, e.g., matrix inversion of solving linear systems. |
| Deep Back-Filling: a Split Window Technique for Deep Online Cluster Job Scheduling | Lingfei Wang, Aaron Harwood, Maria A. Rodriguez | 2024-01-18 | Download | Job scheduling is a critical component of workload management systems that can significantly influence system performance, e.g., in HPC clusters. |
| ASA -- The Adaptive Scheduling Algorithm | Abel Souza, Kristiaan Pelckmans, Devarshi Ghoshal, Lavanya Ramakrishnan, Johan Tordsson | 2024-01-18 | Download | In High Performance Computing (HPC) infrastructures, the control of resources by batch systems can lead to prolonged queue waiting times and adverse effects on the overall execution times of applicati... |
| A HPC Co-Scheduler with Reinforcement Learning | Abel Souza, Kristiaan Pelckmans, Johan Tordsson | 2024-01-18 | Download | Although High Performance Computing (HPC) users understand basic resource requirements such as the number of CPUs and memory limits, internal infrastructural utilization data is exclusively leveraged ... |
| DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving | Yinmin Zhong, Shengyu Liu, Junda Chen, Jianbo Hu, Yibo Zhu, Xuanzhe Liu, Xin Jin, Hao Zhang | 2024-01-18 | Download | DistServe improves the performance of large language models (LLMs) serving by disaggregating the prefill and decoding computation. Existing LLM serving systems colocate the two phases and batch the co... |
| Mobility Accelerates Learning: Convergence Analysis on Hierarchical Federated Learning in Vehicular Networks | Tan Chen, Jintao Yan, Yuxuan Sun, Sheng Zhou, Deniz Gündüz, Zhisheng Niu | 2024-01-18 | Download | Hierarchical federated learning (HFL) enables distributed training of models across multiple devices with the help of several edge servers and a cloud edge server in a privacy-preserving manner. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| On the Interplay Between Network Metrics and Performance of Mobile Edge Offloading | Parisa Fard Moshiri, Murat Simsek, Burak Kantarci | 2024-01-18 | Download | Multi-Access Edge Computing (MEC) emerged as a viable computing allocation method that facilitates offloading tasks to edge servers for efficient processing. |
| Bypassing a Reactive Jammer via NOMA-Based Transmissions in Critical Missions | Mohammadreza Amini, Ghazal Asemian, Michel Kulhandjian, Burak Kantarci, Claude D'Amours, Melike Erol-Kantarci | 2024-01-18 | Download | Wireless networks can be vulnerable to radio jamming attacks. The quality of service under a jamming attack is not guaranteed and the service requirements such as reliability, latency, and effective r... |
| Node Placement and Path Planning for Improved Area Coverage in Mixed Wireless Sensor Networks | Survi Kumari, Seshan Srirangarajan | 2024-01-18 | Download | For the large-scale monitoring of a physical phenomena using a wireless sensor network (WSN), a large number of static and/or mobile sensor nodes are required, resulting in higher deployment cost. |
| HRL-TSCH: A Hierarchical Reinforcement Learning-based TSCH Scheduler for IIoT | F. Fernando Jurado-Lasso, Charalampos Orfanidis, J. F. Jurado, Xenofon Fafoutis | 2024-01-18 | Download | The Industrial Internet of Things (IIoT) demands adaptable Networked Embedded Systems (NES) for optimal performance. Combined with recent advances in Artificial Intelligence (AI), tailored solutions c... |
| Hierarchical Federated Learning in Multi-hop Cluster-Based VANETs | M. Saeid HaghighiFard, Sinem Coleri | 2024-01-18 | Download | The usage of federated learning (FL) in Vehicular Ad hoc Networks (VANET) has garnered significant interest in research due to the advantages of reducing transmission overhead and protecting user priv... |
| Tailoring Semantic Communication at Network Edge: A Novel Approach Using Dynamic Knowledge Distillation | Abdullatif Albaseer, Mohamed Abdallah | 2024-01-18 | Download | Semantic Communication (SemCom) systems, empowered by deep learning (DL), represent a paradigm shift in data transmission. These systems prioritize the significance of content over sheer data volume. |
| Model-Assisted Learning for Adaptive Cooperative Perception of Connected Autonomous Vehicles | Kaige Qu, Weihua Zhuang, Qiang Ye, Wen Wu, Xuemin Shen | 2024-01-18 | Download | Cooperative perception (CP) is a key technology to facilitate consistent and accurate situational awareness for connected and autonomous vehicles (CAVs). |
| EDAF: An End-to-End Delay Analytics Framework for 5G-and-Beyond Networks | Samie Mostafavi, Marius Tillner, Gourav Prateek Sharma, James Gross | 2024-01-18 | Download | Supporting applications in emerging domains like cyber-physical systems and human-in-the-loop scenarios typically requires adherence to strict end-to-end delay guarantees. |
| Learning Non-myopic Power Allocation in Constrained Scenarios | Arindam Chowdhury, Santiago Paternain, Gunjan Verma, Ananthram Swami, Santiago Segarra | 2024-01-18 | Download | We propose a learning-based framework for efficient power allocation in ad hoc interference networks under episodic constraints. The problem of optimal power allocation -- for maximizing a given netwo... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Accurate and Scalable Many-Node Simulation | Stijn Eyerman, Wim Heirman, Kristof Du Bois, Ibrahim Hur | 2024-01-18 | Download | Accurate performance estimation of future many-node machines is challenging because it requires detailed simulation models of both node and network. |
