2024-09-03

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Global Optimizations & Lightweight Dynamic Logic for Concurrency	Suchita Pati, Shaizeen Aga, Nuwan Jayasena, Matthew D. Sinclair	2024-09-03	Download	Modern accelerators like GPUs are increasingly executing independent operations concurrently to improve the device's compute utilization. However, effectively harnessing it on GPUs for important primi...
AIvril: AI-Driven RTL Generation With Verification In-The-Loop	Mubashir ul Islam, Humza Sami, Pierre-Emmanuel Gaillardon, Valerio Tenace	2024-09-03	Download	Large Language Models (LLMs) are computational models capable of performing complex natural language processing tasks. Leveraging these capabilities, LLMs hold the potential to transform the entire ha...
Exploiting the Vulnerability of Large Language Models via Defense-Aware Architectural Backdoor	Abdullah Arafat Miah, Yu Bi	2024-09-03	Download	Deep neural networks (DNNs) have long been recognized as vulnerable to backdoor attacks. By providing poisoned training data in the fine-tuning process, the attacker can implant a backdoor into the vi...
The Impact of Run-Time Variability on Side-Channel Attacks Targeting FPGAs	Davide Galli, Adriano Guarisco, William Fornaciari, Matteo Matteucci, Davide Zoni	2024-09-03	Download	To defeat side-channel attacks, many recent countermeasures work by enforcing random run-time variability to the target computing platform in terms of clock jitters, frequency and voltage scaling, and...
Reuse and Blend: Energy-Efficient Optical Neural Network Enabled by Weight Sharing	Bo Xu, Yuetong Fang, Shaoliang Yu, Renjing Xu	2024-09-03	Download	Optical neural networks (ONN) based on micro-ring resonators (MRR) have emerged as a promising alternative to significantly accelerating the massive matrix-vector multiplication (MVM) operations in ar...
Toward Capturing Genetic Epistasis From Multivariate Genome-Wide Association Studies Using Mixed-Precision Kernel Ridge Regression	Hatem Ltaief, Rabab Alomairy, Qinglei Cao, Jie Ren, Lotfi Slim, Thorsten Kurth, Benedikt Dorschner, Salim Bougouffa, Rached Abdelkhalak, David E. Keyes	2024-09-03	Download	We exploit the widening margin in tensor-core performance between [FP64/FP32/FP16/INT8,FP64/FP32/FP16/FP8/INT8] on NVIDIA [Ampere,Hopper] GPUs to boost the performance of output accuracy-preserving mi...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Accelerating Fortran Codes: A Method for Integrating Coarray Fortran with CUDA Fortran and OpenMP	James McKevitt, Eduard I. Vorobyov, Igor Kulikov	2024-09-03	Download	Fortran's prominence in scientific computing requires strategies to ensure both that legacy codes are efficient on high-performance computing systems, and that the language remains attractive for the ...
Global Optimizations & Lightweight Dynamic Logic for Concurrency	Suchita Pati, Shaizeen Aga, Nuwan Jayasena, Matthew D. Sinclair	2024-09-03	Download	Modern accelerators like GPUs are increasingly executing independent operations concurrently to improve the device's compute utilization. However, effectively harnessing it on GPUs for important primi...
Quantifying Liveness and Safety of Avalanche's Snowball	Quentin Kniep, Maxime Laval, Jakub Sliwinski, Roger Wattenhofer	2024-09-03	Download	This work examines the resilience properties of the Snowball and Avalanche protocols that underlie the popular Avalanche blockchain. We experimentally quantify the resilience of Snowball using a simul...
Checkpoint and Restart: An Energy Consumption Characterization in Clusters	Marina Moran, Javier Balladini, Dolores Rexachs, Emilio Luque	2024-09-03	Download	The fault tolerance method currently used in High Performance Computing (HPC) is the rollback-recovery method by using checkpoints. This, like any other fault tolerance method, adds an additional ener...
Cache Coherence Over Disaggregated Memory	Ruihong Wang, Jianguo Wang, Walid G. Aref	2024-09-03	Download	Disaggregating memory from compute offers the opportunity to better utilize stranded memory in cloud data centers. It is important to cache data in the compute nodes and maintain cache coherence acros...
EcoLife: Carbon-Aware Serverless Function Scheduling for Sustainable Computing	Yankai Jiang, Rohan Basu Roy, Baolin Li, Devesh Tiwari	2024-09-03	Download	This work introduces ECOLIFE, the first carbon-aware serverless function scheduler to co-optimize carbon footprint and performance. ECOLIFE builds on the key insight of intelligently exploiting multi-...
FedMinds: Privacy-Preserving Personalized Brain Visual Decoding	Guangyin Bao, Duoqian Miao	2024-09-03	Download	Exploring the mysteries of the human brain is a long-term research topic in neuroscience. With the help of deep learning, decoding visual information from human brain activity fMRI has achieved promis...
When Digital Twin Meets 6G: Concepts, Obstacles, and Research Prospects	Wenshuai Liu, Yaru Fu, Zheng Shi, Hong Wang	2024-09-03	Download	The convergence of digital twin technology and the emerging 6G network presents both challenges and numerous research opportunities. This article explores the potential synergies between digital twin ...
Designing Large Foundation Models for Efficient Training and Inference: A Survey	Dong Liu, Yanxuan Yu, Yite Wang, Jing Wu, Zhongwei Wan, Sina Alinejad, Benjamin Lengerich, Ying Nian Wu	2024-09-03	Download	This paper focuses on modern efficient training and inference technologies on foundation models and illustrates them from two perspectives: model and system design.
QPOPSS: Query and Parallelism Optimized Space-Saving for Finding Frequent Stream Elements	Victor Jarlow, Charalampos Stylianopoulos, Marina Papatriantafilou	2024-09-03	Download	The frequent elements problem, a key component in demanding stream-data analytics, involves selecting elements whose occurrence exceeds a user-specified threshold.
ACCESS-FL: Agile Communication and Computation for Efficient Secure Aggregation in Stable Federated Learning Networks	Niousha Nazemi, Omid Tavallaie, Shuaijun Chen, Anna Maria Mandalari, Kanchana Thilakarathna, Ralph Holz, Hamed Haddadi, Albert Y. Zomaya	2024-09-03	Download	Federated Learning (FL) is a promising distributed learning framework designed for privacy-aware applications. FL trains models on client devices without sharing the client's data and generates a glob...
Quantum Byzantine Agreement Against Full-information Adversary	Longcheng Li, Xiaoming Sun, Jiadong Zhu	2024-09-03	Download	We exhibit that, when given a classical Byzantine agreement protocol designed in the private-channel model, it is feasible to construct a quantum agreement protocol that can effectively handle a full-...
Priority based inter-twin communication in vehicular digital twin networks	Qasim Zia, Chenyu Wang, Saide Zhu, Yingshu Li	2024-09-03	Download	With the advancement and boom of autonomous vehicles, vehicular digital twins (VDTs) have become an emerging research area. VDT can solve the issues related to autonomous vehicles and provide improved...
Buffer-based Gradient Projection for Continual Federated Learning	Shenghong Dai, Jy-yong Sohn, Yicong Chen, S M Iftekharul Alam, Ravikumar Balakrishnan, Suman Banerjee, Nageen Himayat, Kangwook Lee	2024-09-03	Download	Continual Federated Learning (CFL) is essential for enabling real-world applications where multiple decentralized clients adaptively learn from continuous data streams.
A Unified, Practical, and Understandable Model of Non-transactional Consistency Levels in Distributed Replication	Guanzhou Hu, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau	2024-09-03	Download	We present a practical model of non-transactional consistency levels in the context of distributed data replication. Unlike prior work, our simple Shared Object Pool (SOP) model defines common consist...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Open6G OTIC: A Blueprint for Programmable O-RAN and 3GPP Testing Infrastructure	Gabriele Gemmi, Michele Polese, Pedram Johari, Stefano Maxenti, Michael Seltser, Tommaso Melodia	2024-09-03	Download	Softwarized and programmable Radio Access Networks (RANs) come with virtualized and disaggregated components, increasing the supply chain robustness and the flexibility and dynamism of the network dep...
Three Pillars Towards Next-Generation Routing System	Lei Li, Mengxuan Zhang, Zizhuo Xu, Yehong Xu, Xiaofang Zhou	2024-09-03	Download	The routing results are playing an increasingly important role in transportation efficiency, but they could generate traffic congestion unintentionally.
When Digital Twin Meets 6G: Concepts, Obstacles, and Research Prospects	Wenshuai Liu, Yaru Fu, Zheng Shi, Hong Wang	2024-09-03	Download	The convergence of digital twin technology and the emerging 6G network presents both challenges and numerous research opportunities. This article explores the potential synergies between digital twin ...
Multi-Branch Attention Convolutional Neural Network for Online RIS Configuration with Discrete Responses: A Neuroevolution Approach	George Stamatelis, Kyriakos Stylianopoulos, George C. Alexandropoulos	2024-09-03	Download	In this paper, we consider the problem of jointly controlling the configuration of a Reconfigurable Intelligent Surface (RIS) with unit elements of discrete responses and a codebook-based transmit pre...
Low Layer Functional Split Management in 5G and Beyond: Architecture and Self-adaptation	Jordi Pérez-Romero, Oriol Sallent, David Campoy, Antoni Gelonch, Xavier Gelabert, Bleron Klaiqi	2024-09-03	Download	Radio Access Network (RAN) disaggregation is emerging as a key trend in beyond 5G, as it offers new opportunities for more flexible deployments and intelligent network management.
Priority based inter-twin communication in vehicular digital twin networks	Qasim Zia, Chenyu Wang, Saide Zhu, Yingshu Li	2024-09-03	Download	With the advancement and boom of autonomous vehicles, vehicular digital twins (VDTs) have become an emerging research area. VDT can solve the issues related to autonomous vehicles and provide improved...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Foreactor: Exploiting Storage I/O Parallelism with Explicit Speculation	Guanzhou Hu, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau	2024-09-03	Download	We introduce explicit speculation, a variant of I/O speculation technique where I/O system calls can be parallelized under the guidance of explicit application code knowledge.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Toward Capturing Genetic Epistasis From Multivariate Genome-Wide Association Studies Using Mixed-Precision Kernel Ridge Regression	Hatem Ltaief, Rabab Alomairy, Qinglei Cao, Jie Ren, Lotfi Slim, Thorsten Kurth, Benedikt Dorschner, Salim Bougouffa, Rached Abdelkhalak, David E. Keyes	2024-09-03	Download	We exploit the widening margin in tensor-core performance between [FP64/FP32/FP16/INT8,FP64/FP32/FP16/FP8/INT8] on NVIDIA [Ampere,Hopper] GPUs to boost the performance of output accuracy-preserving mi...