2024-09-03

cs.AR - Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Global Optimizations & Lightweight Dynamic Logic for Concurrency | Suchita Pati, Shaizeen Aga, Nuwan Jayasena, Matthew D. Sinclair | 2024-09-03 | Modern accelerators like GPUs are increasingly executing independent operations concurrently to improve the device's compute utilization. However, effectively harnessing it on GPUs for important primi... |
| AIvril: AI-Driven RTL Generation With Verification In-The-Loop | Mubashir ul Islam, Humza Sami, Pierre-Emmanuel Gaillardon, Valerio Tenace | 2024-09-03 | Large Language Models (LLMs) are computational models capable of performing complex natural language processing tasks. Leveraging these capabilities, LLMs hold the potential to transform the entire ha... |
| Exploiting the Vulnerability of Large Language Models via Defense-Aware Architectural Backdoor | Abdullah Arafat Miah, Yu Bi | 2024-09-03 | Deep neural networks (DNNs) have long been recognized as vulnerable to backdoor attacks. By providing poisoned training data in the fine-tuning process, the attacker can implant a backdoor into the vi... |
| The Impact of Run-Time Variability on Side-Channel Attacks Targeting FPGAs | Davide Galli, Adriano Guarisco, William Fornaciari, Matteo Matteucci, Davide Zoni | 2024-09-03 | To defeat side-channel attacks, many recent countermeasures work by enforcing random run-time variability to the target computing platform in terms of clock jitters, frequency and voltage scaling, and... |
| Reuse and Blend: Energy-Efficient Optical Neural Network Enabled by Weight Sharing | Bo Xu, Yuetong Fang, Shaoliang Yu, Renjing Xu | 2024-09-03 | Optical neural networks (ONN) based on micro-ring resonators (MRR) have emerged as a promising alternative to significantly accelerating the massive matrix-vector multiplication (MVM) operations in ar... |
| Toward Capturing Genetic Epistasis From Multivariate Genome-Wide Association Studies Using Mixed-Precision Kernel Ridge Regression | Hatem Ltaief, Rabab Alomairy, Qinglei Cao, Jie Ren, Lotfi Slim, Thorsten Kurth, Benedikt Dorschner, Salim Bougouffa, Rached Abdelkhalak, David E. Keyes | 2024-09-03 | We exploit the widening margin in tensor-core performance between [FP64/FP32/FP16/INT8,FP64/FP32/FP16/FP8/INT8] on NVIDIA [Ampere,Hopper] GPUs to boost the performance of output accuracy-preserving mi... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Accelerating Fortran Codes: A Method for Integrating Coarray Fortran with CUDA Fortran and OpenMP | James McKevitt, Eduard I. Vorobyov, Igor Kulikov | 2024-09-03 | Fortran's prominence in scientific computing requires strategies to ensure both that legacy codes are efficient on high-performance computing systems, and that the language remains attractive for the ... |
| Global Optimizations & Lightweight Dynamic Logic for Concurrency | Suchita Pati, Shaizeen Aga, Nuwan Jayasena, Matthew D. Sinclair | 2024-09-03 | Modern accelerators like GPUs are increasingly executing independent operations concurrently to improve the device's compute utilization. However, effectively harnessing it on GPUs for important primi... |
| Quantifying Liveness and Safety of Avalanche's Snowball | Quentin Kniep, Maxime Laval, Jakub Sliwinski, Roger Wattenhofer | 2024-09-03 | This work examines the resilience properties of the Snowball and Avalanche protocols that underlie the popular Avalanche blockchain. We experimentally quantify the resilience of Snowball using a simul... |
| Checkpoint and Restart: An Energy Consumption Characterization in Clusters | Marina Moran, Javier Balladini, Dolores Rexachs, Emilio Luque | 2024-09-03 | The fault tolerance method currently used in High Performance Computing (HPC) is the rollback-recovery method by using checkpoints. This, like any other fault tolerance method, adds an additional ener... |
| Cache Coherence Over Disaggregated Memory | Ruihong Wang, Jianguo Wang, Walid G. Aref | 2024-09-03 | Disaggregating memory from compute offers the opportunity to better utilize stranded memory in cloud data centers. It is important to cache data in the compute nodes and maintain cache coherence acros... |
| EcoLife: Carbon-Aware Serverless Function Scheduling for Sustainable Computing | Yankai Jiang, Rohan Basu Roy, Baolin Li, Devesh Tiwari | 2024-09-03 | This work introduces ECOLIFE, the first carbon-aware serverless function scheduler to co-optimize carbon footprint and performance. ECOLIFE builds on the key insight of intelligently exploiting multi-... |
| FedMinds: Privacy-Preserving Personalized Brain Visual Decoding | Guangyin Bao, Duoqian Miao | 2024-09-03 | Exploring the mysteries of the human brain is a long-term research topic in neuroscience. With the help of deep learning, decoding visual information from human brain activity fMRI has achieved promis... |
| When Digital Twin Meets 6G: Concepts, Obstacles, and Research Prospects | Wenshuai Liu, Yaru Fu, Zheng Shi, Hong Wang | 2024-09-03 | The convergence of digital twin technology and the emerging 6G network presents both challenges and numerous research opportunities. This article explores the potential synergies between digital twin ... |
| Designing Large Foundation Models for Efficient Training and Inference: A Survey | Dong Liu, Yanxuan Yu, Yite Wang, Jing Wu, Zhongwei Wan, Sina Alinejad, Benjamin Lengerich, Ying Nian Wu | 2024-09-03 | This paper focuses on modern efficient training and inference technologies on foundation models and illustrates them from two perspectives: model and system design. |
| QPOPSS: Query and Parallelism Optimized Space-Saving for Finding Frequent Stream Elements | Victor Jarlow, Charalampos Stylianopoulos, Marina Papatriantafilou | 2024-09-03 | The frequent elements problem, a key component in demanding stream-data analytics, involves selecting elements whose occurrence exceeds a user-specified threshold. |
| ACCESS-FL: Agile Communication and Computation for Efficient Secure Aggregation in Stable Federated Learning Networks | Niousha Nazemi, Omid Tavallaie, Shuaijun Chen, Anna Maria Mandalari, Kanchana Thilakarathna, Ralph Holz, Hamed Haddadi, Albert Y. Zomaya | 2024-09-03 | Federated Learning (FL) is a promising distributed learning framework designed for privacy-aware applications. FL trains models on client devices without sharing the client's data and generates a glob... |
| Quantum Byzantine Agreement Against Full-information Adversary | Longcheng Li, Xiaoming Sun, Jiadong Zhu | 2024-09-03 | We exhibit that, when given a classical Byzantine agreement protocol designed in the private-channel model, it is feasible to construct a quantum agreement protocol that can effectively handle a full-... |
| Priority based inter-twin communication in vehicular digital twin networks | Qasim Zia, Chenyu Wang, Saide Zhu, Yingshu Li | 2024-09-03 | With the advancement and boom of autonomous vehicles, vehicular digital twins (VDTs) have become an emerging research area. VDT can solve the issues related to autonomous vehicles and provide improved... |
| Buffer-based Gradient Projection for Continual Federated Learning | Shenghong Dai, Jy-yong Sohn, Yicong Chen, S M Iftekharul Alam, Ravikumar Balakrishnan, Suman Banerjee, Nageen Himayat, Kangwook Lee | 2024-09-03 | Continual Federated Learning (CFL) is essential for enabling real-world applications where multiple decentralized clients adaptively learn from continuous data streams. |
| A Unified, Practical, and Understandable Model of Non-transactional Consistency Levels in Distributed Replication | Guanzhou Hu, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau | 2024-09-03 | We present a practical model of non-transactional consistency levels in the context of distributed data replication. Unlike prior work, our simple Shared Object Pool (SOP) model defines common consist... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Open6G OTIC: A Blueprint for Programmable O-RAN and 3GPP Testing Infrastructure | Gabriele Gemmi, Michele Polese, Pedram Johari, Stefano Maxenti, Michael Seltser, Tommaso Melodia | 2024-09-03 | Softwarized and programmable Radio Access Networks (RANs) come with virtualized and disaggregated components, increasing the supply chain robustness and the flexibility and dynamism of the network dep... |
| Three Pillars Towards Next-Generation Routing System | Lei Li, Mengxuan Zhang, Zizhuo Xu, Yehong Xu, Xiaofang Zhou | 2024-09-03 | The routing results are playing an increasingly important role in transportation efficiency, but they could generate traffic congestion unintentionally. |
| When Digital Twin Meets 6G: Concepts, Obstacles, and Research Prospects | Wenshuai Liu, Yaru Fu, Zheng Shi, Hong Wang | 2024-09-03 | The convergence of digital twin technology and the emerging 6G network presents both challenges and numerous research opportunities. This article explores the potential synergies between digital twin ... |
| Multi-Branch Attention Convolutional Neural Network for Online RIS Configuration with Discrete Responses: A Neuroevolution Approach | George Stamatelis, Kyriakos Stylianopoulos, George C. Alexandropoulos | 2024-09-03 | In this paper, we consider the problem of jointly controlling the configuration of a Reconfigurable Intelligent Surface (RIS) with unit elements of discrete responses and a codebook-based transmit pre... |
| Low Layer Functional Split Management in 5G and Beyond: Architecture and Self-adaptation | Jordi Pérez-Romero, Oriol Sallent, David Campoy, Antoni Gelonch, Xavier Gelabert, Bleron Klaiqi | 2024-09-03 | Radio Access Network (RAN) disaggregation is emerging as a key trend in beyond 5G, as it offers new opportunities for more flexible deployments and intelligent network management. |
| Priority based inter-twin communication in vehicular digital twin networks | Qasim Zia, Chenyu Wang, Saide Zhu, Yingshu Li | 2024-09-03 | With the advancement and boom of autonomous vehicles, vehicular digital twins (VDTs) have become an emerging research area. VDT can solve the issues related to autonomous vehicles and provide improved... |

cs.OS - Operating Systems

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Foreactor: Exploiting Storage I/O Parallelism with Explicit Speculation | Guanzhou Hu, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau | 2024-09-03 | We introduce explicit speculation, a variant of I/O speculation technique where I/O system calls can be parallelized under the guidance of explicit application code knowledge. |

cs.PF - Performance

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Toward Capturing Genetic Epistasis From Multivariate Genome-Wide Association Studies Using Mixed-Precision Kernel Ridge Regression | Hatem Ltaief, Rabab Alomairy, Qinglei Cao, Jie Ren, Lotfi Slim, Thorsten Kurth, Benedikt Dorschner, Salim Bougouffa, Rached Abdelkhalak, David E. Keyes | 2024-09-03 | We exploit the widening margin in tensor-core performance between [FP64/FP32/FP16/INT8,FP64/FP32/FP16/FP8/INT8] on NVIDIA [Ampere,Hopper] GPUs to boost the performance of output accuracy-preserving mi... |