2024-08-19

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| System-Level Design Space Exploration for High-Level Synthesis under End-to-End Latency Constraints | Yuchao Liao, Tosiron Adegbija, Roman Lysecky | 2024-08-19 | Download | Many modern embedded systems have end-to-end (EtoE) latency constraints that necessitate precise timing to ensure high reliability and functional correctness. |
| Are LLMs Any Good for High-Level Synthesis? | Yuchao Liao, Tosiron Adegbija, Roman Lysecky | 2024-08-19 | Download | The increasing complexity and demand for faster, energy-efficient hardware designs necessitate innovative High-Level Synthesis (HLS) methodologies. |
| Security Risks Due to Data Persistence in Cloud FPGA Platforms | Zhehang Zhang, Bharadwaj Madabhushi, Sandip Kundu, Russell Tessier | 2024-08-19 | Download | The integration of Field Programmable Gate Arrays (FPGAs) into cloud computing systems has become commonplace. As the operating systems used to manage these systems evolve, special consideration must ... |
| Tywaves: A Typed Waveform Viewer for Chisel | Raffaele Meloni, H. Peter Hofstee, Zaid Al-Ars | 2024-08-19 | Download | Chisel (Constructing Hardware In a Scala Embedded Language) is a broadly adopted HDL that brings object-oriented and functional programming, type-safety, and parameterization to hardware design. |
| Quantum Register Machine: Efficient Implementation of Quantum Recursive Programs | Zhicheng Zhang, Mingsheng Ying | 2024-08-19 | Download | Quantum recursive programming has been recently introduced for describing sophisticated and complicated quantum algorithms in a compact and elegant way. |
| ShortCircuit: AlphaZero-Driven Circuit Design | Dimitrios Tsaras, Antoine Grosnit, Lei Chen, Zhiyao Xie, Haitham Bou-Ammar, Mingxuan Yuan | 2024-08-19 | Download | Chip design relies heavily on generating Boolean circuits, such as AND-Inverter Graphs (AIGs), from functional descriptions like truth tables. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Security Risks Due to Data Persistence in Cloud FPGA Platforms | Zhehang Zhang, Bharadwaj Madabhushi, Sandip Kundu, Russell Tessier | 2024-08-19 | Download | The integration of Field Programmable Gate Arrays (FPGAs) into cloud computing systems has become commonplace. As the operating systems used to manage these systems evolve, special consideration must ... |
| Demystifying the Communication Characteristics for Distributed Transformer Models | Quentin Anthony, Benjamin Michalowicz, Jacob Hatef, Lang Xu, Mustafa Abduljabbar, Aamir Shafi, Hari Subramoni, Dhabaleswar Panda | 2024-08-19 | Download | Deep learning (DL) models based on the transformer architecture have revolutionized many DL applications such as large language models (LLMs), vision transformers, audio generation, and time series pr... |
| Data-Driven Analysis to Understand GPU Hardware Resource Usage of Optimizations | Tanzima Z. Islam, Aniruddha Marathe, Holland Schutte, Mohammad Zaeed | 2024-08-19 | Download | With heterogeneous systems, the number of GPUs per chip increases to provide computational capabilities for solving science at a nanoscopic scale. |
| Selecting Relay Nodes Based on Evaluation Results to Enhance P2P Broadcasting Efficiency | Chunlin Huang | 2024-08-19 | Download | The existence of node failures is inevitable in distributed systems, thus many P2P broadcasting networks adopt highly robust Flooding-based broadcast algorithms. |
| Federated Frank-Wolfe Algorithm | Ali Dadras, Sourasekhar Banerjee, Karthik Prakhya, Alp Yurtsever | 2024-08-19 | Download | Federated learning (FL) has gained a lot of attention in recent years for building privacy-preserving collaborative learning systems. However, FL algorithms for constrained machine learning problems a... |
| The Expressive Power of Uniform Population Protocols with Logarithmic Space | Philipp Czerner, Vincent Fischer, Roland Guttenberg | 2024-08-19 | Download | Population protocols are a model of computation in which indistinguishable mobile agents interact in pairs to decide a property of their initial configuration. Originally introduced by Angluin et al. |
| SSDTrain: An Activation Offloading Framework to SSDs for Faster Large Language Model Training | Kun Wu, Jeongmin Brian Park, Xiaofan Zhang, Mert Hidayetoğlu, Vikram Sharma Mailthody, Sitao Huang, Steven Sam Lumetta, Wen-mei Hwu | 2024-08-19 | Download | The growth rate of the GPU memory capacity has not been able to keep up with that of the size of large language models (LLMs), hindering the model training process. |
| Gathering Semi-Synchronously Scheduled Two-State Robots | Kohei Otaka, Fabian Frei, Koichi Wada | 2024-08-19 | Download | We study the problem *Gathering* for n autonomous mobile robots in synchronous settings with a persistent memory called *light*. It is well known that Gathering is impossible in the basic ... |
| Implementing OpenMP for Zig to enable its use in HPC context | David Kacs, Nick Brown, Joseph Lee | 2024-08-19 | Download | This extended abstract explores supporting OpenMP in the Zig programming language. Whilst C and Fortran are currently the main languages used to implement HPC applications, Zig provides a similar lev... |
| Work-Efficient Parallel Counting via Sampling | Hongyang Liu, Yitong Yin, Yiyao Zhang | 2024-08-19 | Download | A canonical approach to approximating the partition function of a Gibbs distribution via sampling is simulated annealing. This method has led to efficient reductions from counting to sampling, includi... |
| Heta: Distributed Training of Heterogeneous Graph Neural Networks | Yuchen Zhong, Junwei Su, Chuan Wu, Minjie Wang | 2024-08-19 | Download | Heterogeneous Graph Neural Networks (HGNNs) leverage diverse semantic relationships in Heterogeneous Graphs (HetGs) and have demonstrated remarkable learning performance in various applications. |
| CusADi: A GPU Parallelization Framework for Symbolic Expressions and Optimal Control | Se Hwan Jeon, Seungwoo Hong, Ho Jae Lee, Charles Khazoom, Sangbae Kim | 2024-08-19 | Download | The parallelism afforded by GPUs presents significant advantages in training controllers through reinforcement learning (RL). However, integrating model-based optimization into this process remains ch... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Real-Time Digital Twin Platform: A Case Study on Core Network Selection in Aeronautical Ad-Hoc Networks | Lal Verda Cakir, Mihriban Kocak, Mehmet Özdem, Berk Canberk | 2024-08-19 | Download | The development of Digital Twins (DTs) is hindered by a lack of specialized, open-source solutions that can meet the demands of dynamic applications. |
| Self-Play Ensemble Q-learning enabled Resource Allocation for Network Slicing | Shavbo Salehi, Pedro Enrique Iturria-Rivera, Medhat Elsayed, Majid Bavand, Raimundas Gaigalas, Yigit Ozcan, Melike Erol-Kantarci | 2024-08-19 | Download | In 5G networks, network slicing has emerged as a pivotal paradigm to address diverse user demands and service requirements. To meet the requirements, reinforcement learning (RL) algorithms have been u... |
| LCDN: Providing Network Determinism with Low-Cost Switches | Philip Diederich, Yash Deshpande, Laura Becker, Wolfgang Kellerer | 2024-08-19 | Download | The demands on networks are increasing at a fast pace. In particular, real-time applications have very strict network requirements. However, building a network that hosts real-time applications is a c... |
| Selecting Relay Nodes Based on Evaluation Results to Enhance P2P Broadcasting Efficiency | Chunlin Huang | 2024-08-19 | Download | The existence of node failures is inevitable in distributed systems, thus many P2P broadcasting networks adopt highly robust Flooding-based broadcast algorithms. |
| Validation of the Results of Cross-chain Smart Contract Based on Confirmation Method | Hong Su | 2024-08-19 | Download | Smart contracts are widely utilized in cross-chain interactions, where their results are transmitted from one blockchain (the producer blockchain) to another (the consumer blockchain). |
| ISAC-Fi: Enabling Full-fledged Monostatic Sensing over Wi-Fi Communication | Zhe Chen, Chao Hu, Tianyue Zheng, Hangcheng Cao, Yanbing Yang, Yen Chu, Hongbo Jiang, Jun Luo | 2024-08-19 | Download | Whereas Wi-Fi communications have been exploited for sensing purpose for over a decade, the bistatic or multistatic nature of Wi-Fi still poses multiple challenges, hampering real-life deployment of i... |
| Global BGP Attacks that Evade Route Monitoring | Henry Birge-Lee, Maria Apostolaki, Jennifer Rexford | 2024-08-19 | Download | As the deployment of comprehensive Border Gateway Protocol (BGP) security measures is still in progress, BGP monitoring continues to play a critical role in protecting the Internet from routing attack... |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Data-Driven Analysis to Understand GPU Hardware Resource Usage of Optimizations | Tanzima Z. Islam, Aniruddha Marathe, Holland Schutte, Mohammad Zaeed | 2024-08-19 | Download | With heterogeneous systems, the number of GPUs per chip increases to provide computational capabilities for solving science at a nanoscopic scale. |
