2024-08-21
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Floating-Point Multiply-Add with Approximate Normalization for Low-Cost Matrix Engines | Kosmas Alexandridis, Christodoulos Peltekis, Dionysios Filippas, Giorgos Dimitrakopoulos | 2024-08-21 | Download | The widespread adoption of machine learning algorithms necessitates hardware acceleration to ensure efficient performance. This acceleration relies on custom matrix engines that operate on full or red... |
| Anteumbler: Non-Invasive Antenna Orientation Error Measurement for WiFi APs | Dawei Yan, Panlong Yang, Fei Shang, Nikolaos M. Freris, Yubo Yan | 2024-08-21 | Download | The performance of WiFi-based localization systems is affected by the spatial accuracy of WiFi AP. Compared with the imprecision of AP location and antenna separation, the imprecision of AP's or anten... |
| Confidential Computing on Heterogeneous CPU-GPU Systems: Survey and Future Directions | Qifan Wang, David Oswald | 2024-08-21 | Download | In recent years, the widespread informatization and rapid data explosion have increased the demand for high-performance heterogeneous systems that integrate multiple computing cores such as CPUs, Grap... |
| In-Memory Computing Architecture for Efficient Hardware Security | Hala Ajmi, Fakhreddine Zayer, Hamdi Belgacem | 2024-08-21 | Download | This paper presents an innovative approach utilizing in-memory computing (IMC) for the development and integration of AES (Advanced Encryption Standard) cipher technique. |
| HiMA: Hierarchical Quantum Microarchitecture for Qubit-Scaling and Quantum Process-Level Parallelism | Qi Zhou, Zi-Hao Mei, Han-Qing Shi, Liang-Liang Guo, Xiao-Yan Yang, Yun-Jie Wang, Xiao-Fan Xu, Cheng Xue, Wei-Cheng Kong, Jun-Chao Wang, Yu-Chun Wu, Zhao-Yun Chen, Guo-Ping Guo | 2024-08-21 | Download | Quantum computing holds immense potential for addressing a myriad of intricate challenges, which is significantly amplified when scaled to thousands of qubits. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Distributed-Memory Parallel Algorithms for Sparse Matrix and Sparse Tall-and-Skinny Matrix Multiplication | Isuru Ranawaka, Md Taufique Hussain, Charles Block, Gerasimos Gerogiannis, Josep Torrellas, Ariful Azad | 2024-08-21 | Download | We consider a sparse matrix-matrix multiplication (SpGEMM) setting where one matrix is square and the other is tall and skinny. This special variant, called TS-SpGEMM, has important applications in mu... |
| HoSZp: An Efficient Homomorphic Error-bounded Lossy Compressor for Scientific Data | Tripti Agarwal, Sheng Di, Jiajun Huang, Yafan Huang, Ganesh Gopalakrishnan, Robert Underwood, Kai Zhao, Xin Liang, Guanpeng Li, Franck Cappello | 2024-08-21 | Download | Error-bounded lossy compression has been a critical technique to significantly reduce the sheer amounts of simulation datasets for high-performance computing (HPC) scientific applications while effect... |
| PAL: A Variability-Aware Policy for Scheduling ML Workloads in GPU Clusters | Rutwik Jain, Brandon Tran, Keting Chen, Matthew D. Sinclair, Shivaram Venkataraman | 2024-08-21 | Download | Large-scale computing systems are increasingly using accelerators such as GPUs to enable peta- and exa-scale levels of compute to meet the needs of Machine Learning (ML) and scientific computing appli... |
| Cost-Effective Big Data Orchestration Using Dagster: A Multi-Platform Approach | Hernan Picatto, Georg Heiler, Peter Klimek | 2024-08-21 | Download | The rapid advancement of big data technologies has underscored the need for robust and efficient data processing solutions. Traditional Spark-based Platform-as-a-Service (PaaS) solutions, such as Data... |
| Understanding Data Movement in Tightly Coupled Heterogeneous Systems: A Case Study with the Grace Hopper Superchip | Luigi Fusco, Mikhail Khalilov, Marcin Chrapek, Giridhar Chukkapalli, Thomas Schulthess, Torsten Hoefler | 2024-08-21 | Download | Heterogeneous supercomputers have become the standard in HPC. GPUs in particular have dominated the accelerator landscape, offering unprecedented performance in parallel workloads and unlocking new po... |
| High Performance Unstructured SpMM Computation Using Tensor Cores | Patrik Okanovic, Grzegorz Kwasniewski, Paolo Sylos Labini, Maciej Besta, Flavio Vella, Torsten Hoefler | 2024-08-21 | Download | High-performance sparse matrix-matrix (SpMM) multiplication is paramount for science and industry, as the ever-increasing sizes of data prohibit using dense data structures. |
| Stream-K++: Adaptive GPU GEMM Kernel Scheduling and Selection using Bloom Filters | Harisankar Sadasivan, Muhammed Emin Ozturk, Muhammad Osama, Chris Millette, Astha Rai, Maksim Podkorytov, John Afaganis, Carlus Huang, Jing Zhang, Jun Liu | 2024-08-21 | Download | General matrix multiplication (GEMM) operations are the fundamental building blocks of computational domains including artificial intelligence (AI). |
| Distributed Path Compression for Piecewise Linear Morse-Smale Segmentations and Connected Components | Michael Will, Jonas Lukasczyk, Julien Tierny, Christoph Garth | 2024-08-21 | Download | This paper describes the adaptation of a well-scaling parallel algorithm for computing Morse-Smale segmentations based on path compression to a distributed computational setting. |
| Telepathic Datacenters: Fast RPCs using Shared CXL Memory | Suyash Mahar, Ehsan Hajyjasini, Seungjin Lee, Zifeng Zhang, Mingyao Shen, Steven Swanson | 2024-08-21 | Download | Datacenter applications often rely on remote procedure calls (RPCs) for fast, efficient, and secure communication. However, RPCs are slow, inefficient, and hard to use as they require expensive serial... |
| Softening the Impact of Collisions in Contention Resolution | Umesh Biswas, Trisha Chakraborty, Maxwell Young | 2024-08-21 | Download | Contention resolution addresses the problem of coordinating access to a shared communication channel. Time is discretized into synchronized slots, and a packet can be sent in any slot. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Simulators for Quantum Network Modelling: A Comprehensive Review | Oceane Bel, Mariam Kiran | 2024-08-21 | Download | Quantum network research is exploring new networking protocols, physics-based hardware and novel experiments to demonstrate how quantum distribution will work over large distances. |
| Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards | Omar Erak, Nouf Alabbasi, Omar Alhussein, Ismail Lotfi, Amr Hussein, Sami Muhaidat, Merouane Debbah | 2024-08-21 | Download | Recent studies show that large language models (LLMs) struggle with technical standards in telecommunications. We propose a fine-tuned retrieval-augmented generation (RAG) system based on the Phi-2 sm... |
| Anteumbler: Non-Invasive Antenna Orientation Error Measurement for WiFi APs | Dawei Yan, Panlong Yang, Fei Shang, Nikolaos M. Freris, Yubo Yan | 2024-08-21 | Download | The performance of WiFi-based localization systems is affected by the spatial accuracy of WiFi AP. Compared with the imprecision of AP location and antenna separation, the imprecision of AP's or anten... |
| 5G NR PRACH Detection with Convolutional Neural Networks (CNN): Overcoming Cell Interference Challenges | Desire Guel, Arsene Kabore, Didier Bassole | 2024-08-21 | Download | In this paper, we present a novel approach to interference detection in 5G New Radio (5G-NR) networks using Convolutional Neural Networks (CNN). |
| Optimizing QoS in HD Map Updates: Cross-Layer Multi-Agent with Hierarchical and Independent Learning | Jeffrey Redondo, Nauman Aslam, Juan Zhang, Zhenhui Yuan | 2024-08-21 | Download | The data collected by autonomous vehicle (AV) sensors such as LiDAR and cameras is crucial for creating high-definition (HD) maps to provide higher accuracy and enable a higher level of automation. |
| Power-Domain Interference Graph Estimation for Multi-hop BLE Networks | Haifeng Jia, Yichen Wei, Yibo Pi, Cailian Chen | 2024-08-21 | Download | Traditional wisdom for network management allocates network resources separately for the measurement and communication tasks. Heavy measurement tasks may compete for limited resources with communication t... |
| Security Evaluation in Software-Defined Networks | Igor Ivkić, Dominik Thiede, Nicholas Race, Matthew Broadbent, Antonios Gouglidis | 2024-08-21 | Download | Cloud computing has grown in importance in recent years which has led to a significant increase in Data Centre (DC) network requirements. A major driver of this change is virtualisation, which allows ... |
| A Cost-effective Edge Computing Gateway for Smart Buildings | Simon Madsen, Benjamin Staugaard, Zheng Ma, Salman Yussof, Bo Jørgensen | 2024-08-21 | Download | The retrofitting of existing buildings with building management systems presents significant challenges, primarily due to the need for labor and cost efficiency. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Telepathic Datacenters: Fast RPCs using Shared CXL Memory | Suyash Mahar, Ehsan Hajyjasini, Seungjin Lee, Zifeng Zhang, Mingyao Shen, Steven Swanson | 2024-08-21 | Download | Datacenter applications often rely on remote procedure calls (RPCs) for fast, efficient, and secure communication. However, RPCs are slow, inefficient, and hard to use as they require expensive serial... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| RAO-SS: A Prototype of Run-time Auto-tuning Facility for Sparse Direct Solvers | Takahiro Katagiri, Yoshinori Ishii, Hiroki Honda | 2024-08-21 | Download | In this paper, a run-time auto-tuning method for performance parameters according to input matrices is proposed. RAO-SS (Run-time Auto-tuning Optimizer for Sparse Solvers), which is a prototype of aut... |