2024-04-17
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Understanding the Performance Horizon of the Latest ML Workloads with NonGEMM Workloads | Rachid Karami, Sheng-Chun Kao, Hyoukjun Kwon | 2024-04-17 | Download | Among ML operators today, GEneral Matrix Multiplication (GEMM)-based operators are known to be key operators that build the main backbone of ML models. |
| Functionality Locality, Mixture & Control = Logic = Memory | Xiangjun Peng | 2024-04-17 | Download | This work provides new insights and constructs to the field of computer architecture and systems, and these insights are expected to be useful for the broad software stack. |
| Real Time Evolvable Hardware for Optimal Reconfiguration of Cusp-Like Pulse Shapers | Juan Lanchares, Oscar Garnica, José L. Risco-Martín, J. Ignacio Hidalgo, J. Manuel Colmenar, Alfredo Cuesta | 2024-04-17 | Download | The design of a cusp-like digital pulse shaper for particle energy measurements requires the definition of four parameters whose values are defined based on the nature of the shaper input signal (timi... |
| Revisiting Main Memory-Based Covert and Side Channel Attacks in the Context of Processing-in-Memory | F. Nisa Bostanci, Konstantinos Kanellopoulos, Ataberk Olgun, A. Giray Yaglikci, Ismail Emir Yuksel, Nika Mansouri Ghiasi, Zulal Bingol, Mohammad Sadrosadati, Onur Mutlu | 2024-04-17 | Download | We introduce IMPACT, a set of high-throughput main memory-based timing attacks that leverage characteristics of processing-in-memory (PiM) architectures to establish covert and side channels. |
| Efficient Approaches for GEMM Acceleration on Leading AI-Optimized FPGAs | Endri Taka, Dimitrios Gourounas, Andreas Gerstlauer, Diana Marculescu, Aman Arora | 2024-04-17 | Download | FPGAs are a promising platform for accelerating Deep Learning (DL) applications, due to their high performance, low power consumption, and reconfigurability. |
| Asynchronous Memory Access Unit: Exploiting Massive Parallelism for Far Memory Access | Luming Wang, Xu Zhang, Songyue Wang, Zhuolun Jiang, Tianyue Lu, Mingyu Chen, Siwei Luo, Keji Huang | 2024-04-17 | Download | The growing memory demands of modern applications have driven the adoption of far memory technologies in data centers to provide cost-effective, high-capacity memory solutions. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Dual-pronged deep learning preprocessing on heterogeneous platforms with CPU, Accelerator and CSD | Jia Wei, Xingjun Zhang, Witold Pedrycz, Longxiang Wang, Jie Zhao | 2024-04-17 | Download | For image-related deep learning tasks, the first step often involves reading data from external storage and performing preprocessing on the CPU. |
| Simulating Cloud Environments of Connected Vehicles for Anomaly Detection | M. Weiß, J. Stümpfle, F. Dettinger, N. Jazdi, M. Weyrich | 2024-04-17 | Download | The emergence of connected vehicles is driven by increasing customer and regulatory demands. To meet these, more complex software applications, some of which require service-based cloud and edge backe... |
| Araucaria: Simplifying INC Fault Tolerance with High-Level Intents | Ricardo Parizotto, Israat Haque, Alberto Schaeffer-Filho | 2024-04-17 | Download | Network programmability allows modification of fine-grain data plane functionality. The performance benefits of data plane programmability have motivated many researchers to offload computation that p... |
| A Secure and Trustworthy Network Architecture for Federated Learning Healthcare Applications | Antonio Boiano, Marco Di Gennaro, Luca Barbieri, Michele Carminati, Monica Nicoli, Alessandro Redondi, Stefano Savazzi, Albert Sund Aillet, Diogo Reis Santos, Luigi Serio | 2024-04-17 | Download | Federated Learning (FL) has emerged as a promising approach for privacy-preserving machine learning, particularly in sensitive domains such as healthcare. |
| FLeeC: a Fast Lock-Free Application Cache | André J. Costa, Nuno M. Preguiça, João M. Lourenço | 2024-04-17 | Download | When compared to blocking concurrency, non-blocking concurrency can provide higher performance in parallel shared-memory contexts, especially in high contention scenarios. |
| Hierarchical storage management in user space for neuroimaging applications | Valérie Hayot-Sasson, Tristan Glatard | 2024-04-17 | Download | Neuroimaging open-data initiatives have led to increased availability of large scientific datasets. While these datasets are shifting the processing bottleneck from compute-intensive to data-intensive... |
| IoTSim-Osmosis-RES: Towards autonomic renewable energy-aware osmotic computing | Tomasz Szydlo, Amadeusz Szabala, Nazar Kordiumov, Konrad Siuzdak, Lukasz Wolski, Khaled Alwasel, Fawzy Habeeb, Rajiv Ranjan | 2024-04-17 | Download | Internet of Things systems exist in various areas of our everyday life. For example, sensors installed in smart cities and homes are processed in edge and cloud computing centres providing several be... |
| Quantum Cloud Computing: A Review, Open Problems, and Future Directions | Hoa T. Nguyen, Prabhakar Krishnan, Dilip Krishnaswamy, Muhammad Usman, Rajkumar Buyya | 2024-04-17 | Download | Quantum cloud computing is an emerging paradigm of computing that empowers quantum applications and their deployment on quantum computing resources without the need for a specialized environment to ho... |
| Distributed Fractional Bayesian Learning for Adaptive Optimization | Yaqun Yang, Jinlong Lei, Guanghui Wen, Yiguang Hong | 2024-04-17 | Download | This paper considers a distributed adaptive optimization problem, where all agents only have access to their local cost functions with a common unknown parameter, whereas they mean to collaboratively ... |
| Accelerating Geo-distributed Machine Learning with Network-Aware Adaptive Tree and Auxiliary Route | Zonghang Li, Wenjiao Feng, Weibo Cai, Hongfang Yu, Long Luo, Gang Sun, Hongyang Du, Dusit Niyato | 2024-04-17 | Download | Distributed machine learning is becoming increasingly popular for geo-distributed data analytics, facilitating the collaborative analysis of data scattered across data centers in different regions. |
| Undo and Redo Support for Replicated Registers | Leo Stewen, Martin Kleppmann | 2024-04-17 | Download | Undo and redo functionality is ubiquitous in collaboration software. In single user settings, undo and redo are well understood. However, when multiple users edit a document, concurrency may arise, le... |
| Mutiny! How does Kubernetes fail, and what can we do about it? | Marco Barletta, Marcello Cinque, Catello Di Martino, Zbigniew T. Kalbarczyk, Ravishankar K. Iyer | 2024-04-17 | Download | In this paper, we i) analyze and classify real-world failures of Kubernetes (the most popular container orchestration system), ii) develop a framework to perform a fault/error injection campaign targe... |
| XMiner: Efficient Directed Subgraph Matching with Pattern Reduction | Pingpeng Yuan, Yujiang Wang, Tianyu Ma, Siyuan He, Ling Liu | 2024-04-17 | Download | Graph pattern matching, one of the fundamental graph mining problems, aims to extract structural patterns of interest from an input graph. The state-of-the-art graph matching algorithms and systems ar... |
| ScaleFold: Reducing AlphaFold Initial Training Time to 10 Hours | Feiwen Zhu, Arkadiusz Nowaczynski, Rundong Li, Jie Xin, Yifei Song, Michal Marcinkiewicz, Sukru Burc Eryilmaz, Jun Yang, Michael Andersch | 2024-04-17 | Download | AlphaFold2 has been hailed as a breakthrough in protein folding. It can rapidly predict protein structures with lab-grade accuracy. However, its implementation does not include the necessary training ... |
| Approximate Wireless Communication for Lossy Gradient Updates in IoT Federated Learning | Xiang Ma, Haijian Sun, Rose Qingyang Hu, Yi Qian | 2024-04-17 | Download | Federated learning (FL) has emerged as a distributed machine learning (ML) technique that can protect local data privacy for participating clients and improve system efficiency. |
| FedFa: A Fully Asynchronous Training Paradigm for Federated Learning | Haotian Xu, Zhaorui Zhang, Sheng Di, Benben Liu, Khalid Ayed Alharthi, Jiannong Cao | 2024-04-17 | Download | Federated learning has been identified as an efficient decentralized training paradigm for scaling the machine learning model training on a large number of devices while guaranteeing the data privacy ... |
| A Preliminary Study on Accelerating Simulation Optimization with GPU Implementation | Jinghai He, Haoyu Liu, Yuhang Wu, Zeyu Zheng, Tingyu Zhu | 2024-04-17 | Download | We provide a preliminary study on utilizing GPU (Graphics Processing Unit) to accelerate computation for three simulation optimization tasks with either first-order or second-order algorithms. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Araucaria: Simplifying INC Fault Tolerance with High-Level Intents | Ricardo Parizotto, Israat Haque, Alberto Schaeffer-Filho | 2024-04-17 | Download | Network programmability allows modification of fine-grain data plane functionality. The performance benefits of data plane programmability have motivated many researchers to offload computation that p... |
| SERENE: A Collusion Resilient Replication-based Verification Framework | Amir Esmaeili, Abderrahmen Mtibaa | 2024-04-17 | Download | The rapid advancement of autonomous driving technology is accompanied by substantial challenges, particularly the reliance on remote task execution without ensuring a reliable and accurate returned re... |
| What-if Analysis Framework for Digital Twins in 6G Wireless Network Management | Elif Ak, Berk Canberk, Vishal Sharma, Octavia A. Dobre, Trung Q. Duong | 2024-04-17 | Download | This study explores implementing a digital twin network (DTN) for efficient 6G wireless network management, aligning with the fault, configuration, accounting, performance, and security (FCAPS) model. |
| Enhancing Data Privacy In Wireless Sensor Networks: Investigating Techniques And Protocols To Protect Privacy Of Data Transmitted Over Wireless Sensor Networks In Critical Applications Of Healthcare And National Security | Akinsola Ahmed, Ejiofor Oluomachi, Akinde Abdullah, Njoku Tochukwu | 2024-04-17 | Download | The article discusses the emergence of Wireless Sensor Networks (WSNs) as a groundbreaking technology in data processing and communication. It outlines how WSNs, composed of dispersed autonomous senso... |
| Image Generative Semantic Communication with Multi-Modal Similarity Estimation for Resource-Limited Networks | Eri Hosonuma, Taku Yamazaki, Takumi Miyoshi, Akihito Taya, Yuuki Nishiyama, Kaoru Sezaki | 2024-04-17 | Download | To reduce network traffic and support environments with limited resources, a method for transmitting images with minimal transmission data is required. |
| On the Performance of RIS-assisted Networks with HQAM | Thrassos K. Oikonomou, Dimitrios Tyrovolas, Sotiris A. Tegos, Panagiotis D. Diamantoulakis, Panagiotis Sarigiannidis, Christos Liaskos, George K. Karagiannidis | 2024-04-17 | Download | In this paper, we investigate the application of hexagonal quadrature amplitude modulation (HQAM) in reconfigurable intelligent surface (RIS)-assisted networks, specifically focusing on its efficiency... |
| Approximate Wireless Communication for Lossy Gradient Updates in IoT Federated Learning | Xiang Ma, Haijian Sun, Rose Qingyang Hu, Yi Qian | 2024-04-17 | Download | Federated learning (FL) has emerged as a distributed machine learning (ML) technique that can protect local data privacy for participating clients and improve system efficiency. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Understanding the Performance Horizon of the Latest ML Workloads with NonGEMM Workloads | Rachid Karami, Sheng-Chun Kao, Hyoukjun Kwon | 2024-04-17 | Download | Among ML operators today, GEneral Matrix Multiplication (GEMM)-based operators are known to be key operators that build the main backbone of ML models. |