
2024-01-22

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| ACS: Concurrent Kernel Execution on Irregular, Input-Dependent Computational Graphs | Sankeerth Durvasula, Adrian Zhao, Raymond Kiguru, Yushi Guan, Zhonghan Chen, Nandita Vijaykumar | 2024-01-22 | Download | GPUs are widely used to accelerate many important classes of workloads today. However, we observe that several important emerging classes of workloads, including simulation engines for deep reinforcem... |
| Retrieval-Guided Reinforcement Learning for Boolean Circuit Minimization | Animesh Basak Chowdhury, Marco Romanelli, Benjamin Tan, Ramesh Karri, Siddharth Garg | 2024-01-22 | Download | Logic synthesis, a pivotal stage in chip design, entails optimizing chip specifications encoded in hardware description languages like Verilog into highly efficient implementations using Boolean logic... |
| An Irredundant and Compressed Data Layout to Optimize Bandwidth Utilization of FPGA Accelerators | Corentin Ferry, Nicolas Derumigny, Steven Derrien, Sanjay Rajopadhye | 2024-01-22 | Download | Memory bandwidth is known to be a performance bottleneck for FPGA accelerators, especially when they deal with large multi-dimensional data-sets. |
| BETA: Binarized Energy-Efficient Transformer Accelerator at the Edge | Yuhao Ji, Chao Fang, Zhongfeng Wang | 2024-01-22 | Download | Existing binary Transformers are promising in edge deployment due to their compact model size, low computational complexity, and considerable inference accuracy. |
| Accelerating Seed Location Filtering in DNA Read Mapping Using a Commercial Compute-in-SRAM Architecture | Courtney Golden, Dan Ilan, Nicholas Cebry, Christopher Batten | 2024-01-22 | Download | DNA sequence alignment is an important workload in computational genomics. Reference-guided DNA assembly involves aligning many read sequences against candidate locations in a long reference genome. |
| Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM | Bingbing Li, Geng Yuan, Zigeng Wang, Shaoyi Huang, Hongwu Peng, Payman Behnam, Wujie Wen, Hang Liu, Caiwen Ding | 2024-01-22 | Download | Resistive Random Access Memory (ReRAM) has emerged as a promising platform for deep neural networks (DNNs) due to its support for parallel in-situ matrix-vector multiplication. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Learning Recovery Strategies for Dynamic Self-healing in Reactive Systems | Mateo Sanabria, Ivana Dusparic, Nicolas Cardozo | 2024-01-22 | Download | Self-healing systems depend on following a set of predefined instructions to recover from a known failure state. Failure states are generally detected based on domain specific specialized metrics. |
| Efficient Collaborations through Weight-Driven Coalition Dynamics in Federated Learning Systems | Mohammed El Hanjri, Hamza Reguieg, Adil Attiaoui, Amine Abouaomar, Abdellatif Kobbane, Mohamed El Kamili | 2024-01-22 | Download | In the era of the Internet of Things (IoT), decentralized paradigms for machine learning are gaining prominence. In this paper, we introduce a federated learning model that capitalizes on the Euclidea... |
| Uncoded Storage Coded Transmission Elastic Computing with Straggler Tolerance in Heterogeneous Systems | Xi Zhong, Joerg Kliewer, Mingyue Ji | 2024-01-22 | Download | In 2018, Yang et al. introduced a novel and effective approach, using maximum distance separable (MDS) codes, to mitigate the impact of elasticity in cloud computing systems. |
| Centralization in Block Building and Proposer-Builder Separation | Maryam Bahrani, Pranav Garimidi, Tim Roughgarden | 2024-01-22 | Download | The goal of this paper is to rigorously interrogate conventional wisdom about centralization in block-building (due to, e.g., MEV and private order flow) and the outsourcing of block-building by valid... |
| Alya towards Exascale: Optimal OpenACC Performance of the Navier-Stokes Finite Element Assembly on GPUs | Herbert Owen, Dominik Ernst, Thomas Gruber, Oriol Lemkuhl, Guillaume Houzeaux, Lucas Gasparino, Gerhard Wellein | 2024-01-22 | Download | This paper addresses the challenge of providing portable and highly efficient code structures for CPU and GPU architectures. We choose the assembly of the right-hand term in the incompressible flow mo... |
| LLM-based policy generation for intent-based management of applications | Kristina Dzeparoska, Jieyu Lin, Ali Tizghadam, Alberto Leon-Garcia | 2024-01-22 | Download | Automated management requires decomposing high-level user requests, such as intents, to an abstraction that the system can understand and execute. |
| TurboSVM-FL: Boosting Federated Learning through SVM Aggregation for Lazy Clients | Mengdi Wang, Anna Bodonhelyi, Efe Bozkir, Enkelejda Kasneci | 2024-01-22 | Download | Federated learning is a distributed collaborative machine learning paradigm that has gained strong momentum in recent years. In federated learning, a central server periodically coordinates models wit... |
| Tight Bounds on the Message Complexity of Distributed Tree Verification | Shay Kutten, Peter Robinson, Ming Ming Tan | 2024-01-22 | Download | We consider the message complexity of verifying whether a given subgraph of the communication network forms a tree with specific properties both in the KT-ρ (nodes know their ρ-hop neighborhood, i... |
| Accelerating Causal Algorithms for Industrial-scale Data: A Distributed Computing Approach with Ray Framework | Vishal Verma, Vinod Reddy, Jaiprakash Ravi | 2024-01-22 | Download | The increasing need for causal analysis in large-scale industrial datasets necessitates the development of efficient and scalable causal algorithms for real-world applications. |
| Navigating the Maize: Cyclic and conditional computational graphs for molecular simulation | Thomas Löhr, Michele Assante, Michael Dodds, Lili Cao, Mikhail Kabeshov, Jon-Paul Janet, Marco Klähn, Ola Engkvist | 2024-01-22 | Download | Many computational chemistry and molecular simulation workflows can be expressed as graphs. This abstraction is useful to modularize and potentially reuse existing components, as well as provide paral... |
| Self-Balancing Semi-Hierarchical PCNs for CBDCs | Marco Benedetti, Francesco De Sclavis, Marco Favorito, Giuseppe Galano, Sara Giammusso, Antonio Muci, Matteo Nardelli | 2024-01-22 | Download | We introduce a family of PCNs (Payment Channel Networks) characterized by a semi-hierarchical topology and a custom set of channel rebalancing strategies. |
| Integrated Sensing, Communication, and Computing: An Information-oriented Resource Transaction Mechanism | Ning Chen, Zhipeng Cheng, Xuwei Fan, Zhang Liu, Bangzhen Huang, Jie Yang, Yifeng Zhao, Lianfen Huang | 2024-01-22 | Download | Information acquisition from target perception represents the key enabling technology of the Internet of Automatic Vehicles (IoAV), which is essential for the decision-making and control operation of ... |
| Transformers with Attentive Federated Aggregation for Time Series Stock Forecasting | Chu Myaet Thwal, Ye Lin Tun, Kitae Kim, Seong-Bae Park, Choong Seon Hong | 2024-01-22 | Download | Recent innovations in transformers have shown their superior performance in natural language processing (NLP) and computer vision (CV). The ability to capture long-range dependencies and interactions ... |
| Attention on Personalized Clinical Decision Support System: Federated Learning Approach | Chu Myaet Thwal, Kyi Thar, Ye Lin Tun, Choong Seon Hong | 2024-01-22 | Download | Health management has become a primary problem as new kinds of diseases and complex symptoms are introduced to a rapidly growing modern society. |
| OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning | Chu Myaet Thwal, Minh N. H. Nguyen, Ye Lin Tun, Seong Tae Kim, My T. Thai, Choong Seon Hong | 2024-01-22 | Download | Federated learning (FL) has emerged as a promising approach to collaboratively train machine learning models across multiple edge devices while preserving privacy. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Data-oriented Coordinated Uplink Transmission for Massive IoT System | Jyri Hämäläinen, Rui Dinis, Mehmet C. Ilter | 2024-01-22 | Download | Recently, the paradigm of massive ultra-reliable low-latency IoT communications (URLLC-IoT) has gained growing interest. Reliable delay-critical uplink transmission in IoT is a challenging task since ... |
| Fast and Scalable Network Slicing by Integrating Deep Learning with Lagrangian Methods | Tianlun Hu, Qi Liao, Qiang Liu, Antonio Massaro, Georg Carle | 2024-01-22 | Download | Network slicing is a key technique in 5G and beyond for efficiently supporting diverse services. Many network slicing solutions rely on deep learning to manage complex and high-dimensional resource al... |

cs.OS - Operating Systems

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| SyzRetrospector: A Large-Scale Retrospective Study of Syzbot | Joseph Bursey, Ardalan Amiri Sani, Zhiyun Qian | 2024-01-22 | Download | Over the past 6 years, Syzbot has fuzzed the Linux kernel day and night to report over 5570 bugs, of which 4604 have been patched [11]. While this is impressive, we have found the average time to find... |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Alya towards Exascale: Optimal OpenACC Performance of the Navier-Stokes Finite Element Assembly on GPUs | Herbert Owen, Dominik Ernst, Thomas Gruber, Oriol Lemkuhl, Guillaume Houzeaux, Lucas Gasparino, Gerhard Wellein | 2024-01-22 | Download | This paper addresses the challenge of providing portable and highly efficient code structures for CPU and GPU architectures. We choose the assembly of the right-hand term in the incompressible flow mo... |
| Systematic Performance Evaluation Framework for LEO Mega-Constellation Satellite Networks | Yu Wang, Chuili Kong, Xian Meng, Hejia Luo, Ke-Xin Li, Jun Wang | 2024-01-22 | Download | Low Earth orbit (LEO) mega-constellation satellite networks have shown great potential to extend the coverage capability of conventional terrestrial networks. |
