2024-08-21

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Floating-Point Multiply-Add with Approximate Normalization for Low-Cost Matrix Engines | Kosmas Alexandridis, Christodoulos Peltekis, Dionysios Filippas, Giorgos Dimitrakopoulos | 2024-08-21 | Download | The widespread adoption of machine learning algorithms necessitates hardware acceleration to ensure efficient performance. This acceleration relies on custom matrix engines that operate on full or red... |
| Anteumbler: Non-Invasive Antenna Orientation Error Measurement for WiFi APs | Dawei Yan, Panlong Yang, Fei Shang, Nikolaos M. Freris, Yubo Yan | 2024-08-21 | Download | The performance of WiFi-based localization systems is affected by the spatial accuracy of WiFi AP. Compared with the imprecision of AP location and antenna separation, the imprecision of AP's or anten... |
| Confidential Computing on Heterogeneous CPU-GPU Systems: Survey and Future Directions | Qifan Wang, David Oswald | 2024-08-21 | Download | In recent years, the widespread informatization and rapid data explosion have increased the demand for high-performance heterogeneous systems that integrate multiple computing cores such as CPUs, Grap... |
| In-Memory Computing Architecture for Efficient Hardware Security | Hala Ajmi, Fakhreddine Zayer, Hamdi Belgacem | 2024-08-21 | Download | This paper presents an innovative approach utilizing in-memory computing (IMC) for the development and integration of AES (Advanced Encryption Standard) cipher technique. |
| HiMA: Hierarchical Quantum Microarchitecture for Qubit-Scaling and Quantum Process-Level Parallelism | Qi Zhou, Zi-Hao Mei, Han-Qing Shi, Liang-Liang Guo, Xiao-Yan Yang, Yun-Jie Wang, Xiao-Fan Xu, Cheng Xue, Wei-Cheng Kong, Jun-Chao Wang, Yu-Chun Wu, Zhao-Yun Chen, Guo-Ping Guo | 2024-08-21 | Download | Quantum computing holds immense potential for addressing a myriad of intricate challenges, which is significantly amplified when scaled to thousands of qubits. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Distributed-Memory Parallel Algorithms for Sparse Matrix and Sparse Tall-and-Skinny Matrix Multiplication | Isuru Ranawaka, Md Taufique Hussain, Charles Block, Gerasimos Gerogiannis, Josep Torrellas, Ariful Azad | 2024-08-21 | Download | We consider a sparse matrix-matrix multiplication (SpGEMM) setting where one matrix is square and the other is tall and skinny. This special variant, called TS-SpGEMM, has important applications in mu... |
| HoSZp: An Efficient Homomorphic Error-bounded Lossy Compressor for Scientific Data | Tripti Agarwal, Sheng Di, Jiajun Huang, Yafan Huang, Ganesh Gopalakrishnan, Robert Underwood, Kai Zhao, Xin Liang, Guanpeng Li, Franck Cappello | 2024-08-21 | Download | Error-bounded lossy compression has been a critical technique to significantly reduce the sheer amounts of simulation datasets for high-performance computing (HPC) scientific applications while effect... |
| PAL: A Variability-Aware Policy for Scheduling ML Workloads in GPU Clusters | Rutwik Jain, Brandon Tran, Keting Chen, Matthew D. Sinclair, Shivaram Venkataraman | 2024-08-21 | Download | Large-scale computing systems are increasingly using accelerators such as GPUs to enable peta- and exa-scale levels of compute to meet the needs of Machine Learning (ML) and scientific computing appli... |
| Cost-Effective Big Data Orchestration Using Dagster: A Multi-Platform Approach | Hernan Picatto, Georg Heiler, Peter Klimek | 2024-08-21 | Download | The rapid advancement of big data technologies has underscored the need for robust and efficient data processing solutions. Traditional Spark-based Platform-as-a-Service (PaaS) solutions, such as Data... |
| Understanding Data Movement in Tightly Coupled Heterogeneous Systems: A Case Study with the Grace Hopper Superchip | Luigi Fusco, Mikhail Khalilov, Marcin Chrapek, Giridhar Chukkapalli, Thomas Schulthess, Torsten Hoefler | 2024-08-21 | Download | Heterogeneous supercomputers have become the standard in HPC. GPUs in particular have dominated the accelerator landscape, offering unprecedented performance in parallel workloads and unlocking new po... |
| High Performance Unstructured SpMM Computation Using Tensor Cores | Patrik Okanovic, Grzegorz Kwasniewski, Paolo Sylos Labini, Maciej Besta, Flavio Vella, Torsten Hoefler | 2024-08-21 | Download | High-performance sparse matrix-matrix (SpMM) multiplication is paramount for science and industry, as the ever-increasing sizes of data prohibit using dense data structures. |
| Stream-K++: Adaptive GPU GEMM Kernel Scheduling and Selection using Bloom Filters | Harisankar Sadasivan, Muhammed Emin Ozturk, Muhammad Osama, Chris Millette, Astha Rai, Maksim Podkorytov, John Afaganis, Carlus Huang, Jing Zhang, Jun Liu | 2024-08-21 | Download | General matrix multiplication (GEMM) operations are the fundamental building blocks of computational domains including artificial intelligence (AI). |
| Distributed Path Compression for Piecewise Linear Morse-Smale Segmentations and Connected Components | Michael Will, Jonas Lukasczyk, Julien Tierny, Christoph Garth | 2024-08-21 | Download | This paper describes the adaptation of a well-scaling parallel algorithm for computing Morse-Smale segmentations based on path compression to a distributed computational setting. |
| Telepathic Datacenters: Fast RPCs using Shared CXL Memory | Suyash Mahar, Ehsan Hajyjasini, Seungjin Lee, Zifeng Zhang, Mingyao Shen, Steven Swanson | 2024-08-21 | Download | Datacenter applications often rely on remote procedure calls (RPCs) for fast, efficient, and secure communication. However, RPCs are slow, inefficient, and hard to use as they require expensive serial... |
| Softening the Impact of Collisions in Contention Resolution | Umesh Biswas, Trisha Chakraborty, Maxwell Young | 2024-08-21 | Download | Contention resolution addresses the problem of coordinating access to a shared communication channel. Time is discretized into synchronized slots, and a packet can be sent in any slot. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Simulators for Quantum Network Modelling: A Comprehensive Review | Oceane Bel, Mariam Kiran | 2024-08-21 | Download | Quantum network research is exploring new networking protocols, physics-based hardware and novel experiments to demonstrate how quantum distribution will work over large distances. |
| Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards | Omar Erak, Nouf Alabbasi, Omar Alhussein, Ismail Lotfi, Amr Hussein, Sami Muhaidat, Merouane Debbah | 2024-08-21 | Download | Recent studies show that large language models (LLMs) struggle with technical standards in telecommunications. We propose a fine-tuned retrieval-augmented generation (RAG) system based on the Phi-2 sm... |
| Anteumbler: Non-Invasive Antenna Orientation Error Measurement for WiFi APs | Dawei Yan, Panlong Yang, Fei Shang, Nikolaos M. Freris, Yubo Yan | 2024-08-21 | Download | The performance of WiFi-based localization systems is affected by the spatial accuracy of WiFi AP. Compared with the imprecision of AP location and antenna separation, the imprecision of AP's or anten... |
| 5G NR PRACH Detection with Convolutional Neural Networks (CNN): Overcoming Cell Interference Challenges | Desire Guel, Arsene Kabore, Didier Bassole | 2024-08-21 | Download | In this paper, we present a novel approach to interference detection in 5G New Radio (5G-NR) networks using Convolutional Neural Networks (CNN). |
| Optimizing QoS in HD Map Updates: Cross-Layer Multi-Agent with Hierarchical and Independent Learning | Jeffrey Redondo, Nauman Aslam, Juan Zhang, Zhenhui Yuan | 2024-08-21 | Download | The data collected by autonomous vehicle (AV) sensors such as LiDAR and cameras is crucial for creating high-definition (HD) maps to provide higher accuracy and enable a higher level of automation. |
| Power-Domain Interference Graph Estimation for Multi-hop BLE Networks | Haifeng Jia, Yichen Wei, Yibo Pi, Cailian Chen | 2024-08-21 | Download | Traditional wisdom for network management allocates network resources separately for the measurement and communication tasks. Heavy measurement tasks may compete for limited resources with communication t... |
| Security Evaluation in Software-Defined Networks | Igor Ivkić, Dominik Thiede, Nicholas Race, Matthew Broadbent, Antonios Gouglidis | 2024-08-21 | Download | Cloud computing has grown in importance in recent years which has led to a significant increase in Data Centre (DC) network requirements. A major driver of this change is virtualisation, which allows ... |
| A Cost-effective Edge Computing Gateway for Smart Buildings | Simon Madsen, Benjamin Staugaard, Zheng Ma, Salman Yussof, Bo Jørgensen | 2024-08-21 | Download | The retrofitting of existing buildings with building management systems presents significant challenges, primarily due to the need for labor and cost efficiency. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Telepathic Datacenters: Fast RPCs using Shared CXL Memory | Suyash Mahar, Ehsan Hajyjasini, Seungjin Lee, Zifeng Zhang, Mingyao Shen, Steven Swanson | 2024-08-21 | Download | Datacenter applications often rely on remote procedure calls (RPCs) for fast, efficient, and secure communication. However, RPCs are slow, inefficient, and hard to use as they require expensive serial... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| RAO-SS: A Prototype of Run-time Auto-tuning Facility for Sparse Direct Solvers | Takahiro Katagiri, Yoshinori Ishii, Hiroki Honda | 2024-08-21 | Download | In this paper, a run-time auto-tuning method for performance parameters according to input matrices is proposed. RAO-SS (Run-time Auto-tuning Optimizer for Sparse Solvers), which is a prototype of aut... |
