2024-06-14
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| CMDS: Cross-layer Dataflow Optimization for DNN Accelerators Exploiting Multi-bank Memories | Man Shi, Steven Colleman, Charlotte VanDeMieroop, Antony Joseph, Maurice Meijer, Wim Dehaene, Marian Verhelst | 2024-06-14 | Download | Deep neural networks (DNN) use a wide range of network topologies to achieve high accuracy within diverse applications. This model diversity makes it impossible to identify a single "dataflow" (execut... |
| Optimising GPGPU Execution Through Runtime Micro-Architecture Parameter Analysis | Giuseppe M. Sarda, Nimish Shah, Debjyoti Bhattacharjee, Peter Debacker, Marian Verhelst | 2024-06-14 | Download | GPGPU execution analysis has always been tied to closed-source, proprietary benchmarking tools that provide high-level, non-exhaustive, and/or statistical information, preventing a thorough understand... |
| Optimizing Layer-Fused Scheduling of Transformer Networks on Multi-accelerator Platforms | Steven Colleman, Arne Symons, Victor J. B. Jung, Marian Verhelst | 2024-06-14 | Download | The impact of transformer networks is booming, yet, they come with significant computational complexity. It is therefore essential to understand how to optimally map and execute these networks on mode... |
| SAGA: Synthesis Augmentation with Genetic Algorithms for In-Memory Sequence Optimization | Andey Robins, Mike Borowczak | 2024-06-14 | Download | The von-Neumann architecture has a bottleneck which limits the speed at which data can be made available for computation. To combat this problem, novel paradigms for computing are being developed. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Byzantine-Robust Decentralized Federated Learning | Minghong Fang, Zifan Zhang, Hairi, Prashant Khanduri, Jia Liu, Songtao Lu, Yuchen Liu, Neil Gong | 2024-06-14 | Download | Federated learning (FL) enables multiple clients to collaboratively train machine learning models without revealing their private training data. |
| A Comparison of the Performance of the Molecular Dynamics Simulation Package GROMACS Implemented in the SYCL and CUDA Programming Models | L. Apanasevich, Yogesh Kale, Himanshu Sharma, Ana Marija Sokovic | 2024-06-14 | Download | For many years, systems running Nvidia-based GPU architectures have dominated the heterogeneous supercomputer landscape. However, recently GPU chipsets manufactured by Intel and AMD have cut into this... |
| Practical offloading for fine-tuning LLM on commodity GPU via learned sparse projectors | Siyuan Chen, Zhuofeng Wang, Zelong Guan, Yudong Liu, Phillip B. Gibbons | 2024-06-14 | Download | Fine-tuning large language models (LLMs) requires significant memory, often exceeding the capacity of a single GPU. A common solution to this memory challenge is offloading compute and data from the G... |
| Harnessing GPU Power for Enhanced OLTP: A Study in Concurrency Control Schemes | Zihan Sun, Yong Zhang, Chao Li, Chunxiao Xing | 2024-06-14 | Download | GPUs, whose performance has gone through a huge leap over the past decade, have proved their ability to accelerate Online Analytical Processing (OLAP) operations. |
| CMDS: Cross-layer Dataflow Optimization for DNN Accelerators Exploiting Multi-bank Memories | Man Shi, Steven Colleman, Charlotte VanDeMieroop, Antony Joseph, Maurice Meijer, Wim Dehaene, Marian Verhelst | 2024-06-14 | Download | Deep neural networks (DNN) use a wide range of network topologies to achieve high accuracy within diverse applications. This model diversity makes it impossible to identify a single "dataflow" (execut... |
| Federated Learning with Flexible Architectures | Jong-Ik Park, Carlee Joe-Wong | 2024-06-14 | Download | Traditional federated learning (FL) methods have limited support for clients with varying computational and communication abilities, leading to inefficiencies and potential inaccuracies in model train... |
| A Training-free Sub-quadratic Cost Transformer Model Serving Framework With Hierarchically Pruned Attention | Heejun Lee, Geon Park, Youngwan Lee, Jaduk Suh, Jina Kim, Wonyoung Jeong, Bumsik Kim, Hyemin Lee, Myeongjae Jeon, Sung Ju Hwang | 2024-06-14 | Download | In modern large language models (LLMs), increasing the context length is crucial for improving comprehension and coherence in long-context, multi-modal, and retrieval-augmented language generation. |
| Optimization policy for file replica placement in fog domains | Carlos Guerrero, Isaac Lera, Carlos Juiz | 2024-06-14 | Download | Fog computing architectures distribute computational and storage resources along the continuum from the cloud to things. Therefore, the execution of services or the storage of files can be closer to t... |
| PixRO: Pixel-Distributed Rotational Odometry with Gaussian Belief Propagation | Ignacio Alzugaray, Riku Murai, Andrew Davison | 2024-06-14 | Download | Images are the standard input for most computer vision algorithms. However, their processing often reduces to parallelizable operations applied locally and independently to individual pixels. |
| Speed-up of Data Analysis with Kernel Trick in Encrypted Domain | Joon Soo Yoo, Baek Kyung Song, Tae Min Ahn, Ji Won Heo, Ji Won Yoon | 2024-06-14 | Download | Homomorphic encryption (HE) is pivotal for secure computation on encrypted data, crucial in privacy-preserving data analysis. However, efficiently processing high-dimensional data in HE, especially fo... |
| Heterogeneous Federated Learning with Convolutional and Spiking Neural Networks | Yingchao Yu, Yuping Yan, Jisong Cai, Yaochu Jin | 2024-06-14 | Download | Federated learning (FL) has emerged as a promising paradigm for training models on decentralized data while safeguarding data privacy. Most existing FL systems, however, assume that all machine learni... |
| Cyberattack Data Analysis in IoT Environments using Big Data | Neelam Patidar, Sally Zreiqat, Sirisha Mahesh, Jongwook Woo | 2024-06-14 | Download | In the landscape of the Internet of Things (IoT), transforming various industries, our research addresses the growing connectivity and security challenges, including interoperability and standardized ... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A New Realistic Platform for Benchmarking and Performance Evaluation of DRL-Driven and Reconfigurable SFC Provisioning Solutions | Murat Arda Onsu, Poonam Lohan, Burak Kantarci, Emil Janulewicz, Sergio Slobodrian | 2024-06-14 | Download | Service Function Chain (SFC) provisioning stands as a pivotal technology in the realm of 5G and future networks. Its essence lies in orchestrating VNFs (Virtual Network Functions) in a specified seque... |
| A Near-Optimal Category Information Sampling in RFID Systems | Xiujun Wang, Zhi Liu, Xiaokang Zhou, Yong Liao, Han Hu, Xiao Zheng, Jie Li | 2024-06-14 | Download | In many RFID-enabled applications, objects are classified into different categories, and the information associated with each object's category (called category information) is written into the attach... |
| Efficient Mixed Integer Linear Programming Approaches to Dynamic Path Restoration | Alexander Rubtsov, Bruno Bauwens, Dmitri Shmelkin, Elizaveta Rudenko, Alexey Lavrov | 2024-06-14 | Download | We consider the problem of single link failure in an elastic optical network (also known as flex-grid WDM network). The task is to reroute optical connections that go through the broken link using fr... |
| Intra-QLAN Connectivity: beyond the Physical Topology | Francesco Mazza, Marcello Caleffi, Angela Sara Cacciapuoti | 2024-06-14 | Download | In the near to mid future, Quantum Local Area Networks (QLANs) -- the fundamental building block of the Quantum Internet -- will unlikely exhibit physical topologies characterized by densely physical co... |
| ARA-O-RAN: End-to-End Programmable O-RAN Living Lab for Agriculture and Rural Communities | Tianyi Zhang, Joshua Ofori Boateng, Taimoor UI Islam, Arsalan Ahmad, Hongwei Zhang, Daji Qiao | 2024-06-14 | Download | As wireless networks evolve towards open architectures like O-RAN, testing and integration platforms are crucial to address challenges like interoperability. |
| Carbon-Aware End-to-End Data Movement | Jacob Goldverg, Hasibul Jamil, Elvis Rodriguez, Tevfik Kosar | 2024-06-14 | Download | The latest trends in the adoption of cloud, edge, and distributed computing, as well as a rise in applying AI/ML workloads, have created a need to measure, monitor, and reduce the carbon emissions of ... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| SquirrelFS: using the Rust compiler to check file-system crash consistency | Hayley LeBlanc, Nathan Taylor, James Bornholt, Vijay Chidambaram | 2024-06-14 | Download | This work introduces a new approach to building crash-safe file systems for persistent memory. We exploit the fact that Rust's typestate pattern allows compile-time enforcement of a specific order of ... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A Comparison of the Performance of the Molecular Dynamics Simulation Package GROMACS Implemented in the SYCL and CUDA Programming Models | L. Apanasevich, Yogesh Kale, Himanshu Sharma, Ana Marija Sokovic | 2024-06-14 | Download | For many years, systems running Nvidia-based GPU architectures have dominated the heterogeneous supercomputer landscape. However, recently GPU chipsets manufactured by Intel and AMD have cut into this... |