2024-01-30
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Qplacer: Frequency-Aware Component Placement for Superconducting Quantum Computers | Junyao Zhang, Hanrui Wang, Qi Ding, Jiaqi Gu, Reouven Assouly, William D. Oliver, Song Han, Kenneth R. Brown, Hai "Helen" Li, Yiran Chen | 2024-01-30 | Download | Noisy Intermediate-Scale Quantum (NISQ) computers face a critical limitation in qubit numbers, hindering their progression towards large-scale and fault-tolerant quantum computing. |
| Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators | Mika Markus Müller, Alexander Richard Manfred Borst, Konstantin Lübeck, Alexander Louis-Ferdinand Jung, Oliver Bringmann | 2024-01-30 | Download | Artificial Intelligence (AI) has witnessed remarkable growth, particularly through the proliferation of Deep Neural Networks (DNNs). These powerful models drive technological advancements across vario... |
| SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation | Wontak Han, Hyunjun Cho, Donghyuk Kim, Joo-Young Kim | 2024-01-30 | Download | Text generation is a compelling sub-field of natural language processing, aiming to generate human-readable text from input words. In particular, the decoder-only generative models, such as generative... |
| Method for determining the acceleration of a parallel specialised computer system based on Amdahl's law | Aleksandr S. Filipchenko | 2024-01-30 | Download | The modification of Amdahl's law for the case of increment of processor elements in a computer system is considered. The coefficient linking accelerations of parallel and parallel specialized comp... |
| A Scalable RISC-V Vector Processor Enabling Efficient Multi-Precision DNN Inference | Chuanning Wang, Chao Fang, Xiao Wu, Zhongfeng Wang, Jun Lin | 2024-01-30 | Download | RISC-V processors encounter substantial challenges in deploying multi-precision deep neural networks (DNNs) due to their restricted precision support, constrained throughput, and suboptimal dataflow d... |
| WideSA: A High Array Utilization Mapping Scheme for Uniform Recurrences on the Versal ACAP Architecture | Tuo Dai, Bizhao Shi, Guojie Luo | 2024-01-30 | Download | The Versal Adaptive Compute Acceleration Platform (ACAP) is a new architecture that combines AI Engines (AIEs) with reconfigurable fabric. This architecture offers significant acceleration potential f... |
| T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives | Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Nuwan Jayasena, Matthew D. Sinclair | 2024-01-30 | Download | Large Language Models increasingly rely on distributed techniques for their training and inference. These techniques require communication across devices which can reduce scaling efficiency as the num... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| CLAIRE: Scalable GPU-Accelerated Algorithms for Diffeomorphic Image Registration in 3D | Andreas Mang | 2024-01-30 | Download | We present our work on scalable, GPU-accelerated algorithms for diffeomorphic image registration. The associated software package is termed CLAIRE. Image registration is a non-linear inverse problem. |
| Parallelization Strategies for the Randomized Kaczmarz Algorithm on Large-Scale Dense Systems | Inês Ferreira, Juan A. Acebrón, José Monteiro | 2024-01-30 | Download | The Kaczmarz algorithm is an iterative technique designed to solve consistent linear systems of equations. It falls within the category of row-action methods, focusing on handling one equation per ite... |
| Rendering Wireless Environments Useful for Gradient Estimators: A Zero-Order Stochastic Federated Learning Method | Elissa Mhanna, Mohamad Assaad | 2024-01-30 | Download | Cross-device federated learning (FL) is a growing machine learning setting whereby multiple edge devices collaborate to train a model without disclosing their raw data. |
| Characterising resource management performance in Kubernetes | Víctor Medel, Rafael Tolosana-Calasanz, José Ángel Bañares, Unai Arronategui, Omer F. Rana | 2024-01-30 | Download | A key challenge for supporting elastic behaviour in cloud systems is to achieve a good performance in automated (de-)provisioning and scheduling of computing resources. |
| Identifying Quality Mersenne Twister Streams For Parallel Stochastic Simulations | Benjamin Antunes, Claude Mazel, David R. C Hill | 2024-01-30 | Download | The Mersenne Twister (MT) is a pseudo-random number generator (PRNG) widely used in High Performance Computing for parallel stochastic simulations. |
| GPU-Accelerated Batch-Dynamic Subgraph Matching | Linshan Qiu, Lu Chen, Hailiang Jie, Xiangyu Ke, Yunjun Gao, Yang Liu, Zetao Zhang | 2024-01-30 | Download | Subgraph matching has garnered increasing attention for its diverse real-world applications. Given the dynamic nature of real-world graphs, addressing evolving scenarios without incurring prohibitive ... |
| Method for determining the acceleration of a parallel specialised computer system based on Amdahl's law | Aleksandr S. Filipchenko | 2024-01-30 | Download | The modification of Amdahl's law for the case of increment of processor elements in a computer system is considered. The coefficient linking accelerations of parallel and parallel specialized comp... |
| Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations | Francieli Boito, Jim Brandt, Valeria Cardellini, Philip Carns, Florina M. Ciorba, Hilary Egan, Ahmed Eleliemy, Ann Gentile, Thomas Gruber, Jeff Hanson, Utz-Uwe Haus, Kevin Huck, Thomas Ilsche, Thomas Jakobsche, Terry Jones, Sven Karlsson, Abdullah Mueen, Michael Ott, Tapasya Patki, Ivy Peng, Krishnan Raghavan, Stephen Simms, Kathleen Shoga, Michael Showerman, Devesh Tiwari, Torsten Wilde, Keiji Yamamoto | 2024-01-30 | Download | Many High Performance Computing (HPC) facilities have developed and deployed frameworks in support of continuous monitoring and operational data analytics (MODA) to help improve efficiency and through... |
| BCM-Broadcast: A Byzantine-Tolerant Causal Broadcast Algorithm for Distributed Mobile Systems | Leila NamvariTazehkand, Saied Pashazadeh, Ali Ebnenasir | 2024-01-30 | Download | This paper presents an algorithm, called BCM-Broadcast, for the implementation of causal broadcast in distributed mobile systems in the presence of Byzantine failures. |
| Interactive Byzantine-Resilient Gradient Coding for General Data Assignments | Shreyas Jain, Luis Maßny, Christoph Hofmeister, Eitan Yaakobi, Rawad Bitar | 2024-01-30 | Download | We tackle the problem of Byzantine errors in distributed gradient descent within the Byzantine-resilient gradient coding framework. Our proposed solution can recover the exact full gradient in the pre... |
| Quantum-Secure Hybrid Blockchain System for DID-based Verifiable Random Function with NTRU Linkable Ring Signature | Bong Gon Kim, Dennis Wong, Yoon Seok Yang | 2024-01-30 | Download | In this study, we present a secure smart contract-based Verifiable Random Function (VRF) model, addressing the shortcomings of existing systems. |
| Computational Power of Opaque Robots | Caterina Feletti, Lucia Mambretti, Carlo Mereghetti, Beatrice Palano | 2024-01-30 | Download | In the field of distributed computing by robot swarms, the research comprehends manifold models where robots operate in the Euclidean plane through a sequence of look-compute-move cycles. |
| Using Sequential Runtime Distributions for the Parallel Speedup Prediction of SAT Local Search | Alejandro Arbelaez, Charlotte Truchet, Philippe Codognet | 2024-01-30 | Download | This paper presents a detailed analysis of the scalability and parallelization of local search algorithms for the Satisfiability problem. We propose a framework to estimate the parallel performance of... |
| SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget | Kun Wang, Jiani Cao, Zimu Zhou, Zhenjiang Li | 2024-01-30 | Download | Executing deep neural networks (DNNs) on edge artificial intelligence (AI) devices enables various autonomous mobile computing applications. However, the memory budget of edge AI devices restricts the... |
| EdgeOL: Efficient in-situ Online Learning on Edge Devices | Sheng Li, Geng Yuan, Yue Dai, Tianyu Wang, Yawen Wu, Alex K. Jones, Jingtong Hu, Tony Geng, Yanzhi Wang, Bo Yuan, Yufei Ding, Xulong Tang | 2024-01-30 | Download | Emerging applications, such as robot-assisted eldercare and object recognition, generally employ deep learning neural networks (DNNs) and naturally require: i) handling streaming-in inference requests... |
| Communication-Efficient Multimodal Federated Learning: Joint Modality and Client Selection | Liangqi Yuan, Dong-Jun Han, Su Wang, Devesh Upadhyay, Christopher G. Brinton | 2024-01-30 | Download | Multimodal federated learning (MFL) aims to enrich model training in FL settings where clients are collecting measurements across multiple modalities. |
| T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives | Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Nuwan Jayasena, Matthew D. Sinclair | 2024-01-30 | Download | Large Language Models increasingly rely on distributed techniques for their training and inference. These techniques require communication across devices which can reduce scaling efficiency as the num... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Differentiated Service Entanglement Routing for Quantum Networks | Hui Han, Bo Liu, Bangying Tang, Siyu Xiong, Jinquan Huang, Wanrong Yu, Shuhui Chen | 2024-01-30 | Download | The entanglement distribution networks with various topologies are mainly implemented by active wavelength multiplexing routing strategies. However, designing an entanglement routing scheme, which ach... |
| Socially Aware V2X Localized QoS | Rafael Kaliski, Yue-hua Han | 2024-01-30 | Download | Vehicle-to-everything (V2X) is a core 5G technology. V2X and its enabler, Device-to-Device (D2D), are essential for the Internet of Things (IoT) and the Internet of Vehicles (IoV). |
| URLLC-Aware Proactive UAV Placement in Internet of Vehicles | Chen-Feng Liu, Nirmal D. Wickramasinghe, Himal A. Suraweera, Mehdi Bennis, Merouane Debbah | 2024-01-30 | Download | Unmanned aerial vehicles (UAVs) are envisioned to provide diverse services from the air. The service quality may rely on the wireless performance which is affected by the UAV's position. |
| Quantum $X$-Secure $B$-Byzantine $T$-Colluding Private Information Retrieval | Mohamed Nomeir, Alptug Aytekin, Sennur Ulukus | 2024-01-30 | Download | We consider the problems arising from the presence of Byzantine servers in a quantum private information retrieval (QPIR) setting. This is the first work to precisely define what the capabilities of B... |
| Utilizing Large Language Models to Translate RFC Protocol Specifications to CPSA Definitions | Martin Duclos, Ivan A. Fernandez, Kaneesha Moore, Sudip Mittal, Edward Zieglar | 2024-01-30 | Download | This paper proposes the use of Large Language Models (LLMs) for translating Request for Comments (RFC) protocol specifications into a format compatible with the Cryptographic Protocol Shapes Analyzer ... |
| Evaluating ML-Based Anomaly Detection Across Datasets of Varied Integrity: A Case Study | Adrian Pekar, Richard Jozsa | 2024-01-30 | Download | Cybersecurity remains a critical challenge in the digital age, with network traffic flow anomaly detection being a key pivotal instrument in the fight against cyber threats. |
| Dynamic Human Digital Twin Deployment at the Edge for Task Execution: A Two-Timescale Accuracy-Aware Online Optimization | Yuye Yang, You Shi, Changyan Yi, Jun Cai, Jiawen Kang, Dusit Niyato, Xuemin Shen | 2024-01-30 | Download | Human digital twin (HDT) is an emerging paradigm that bridges physical twins (PTs) with powerful virtual twins (VTs) for assisting complex task executions in human-centric services. |
| Large Multi-Modal Models (LMMs) as Universal Foundation Models for AI-Native Wireless Systems | Shengzhe Xu, Christo Kurisummoottil Thomas, Omar Hashash, Nikhil Muralidhar, Walid Saad, Naren Ramakrishnan | 2024-01-30 | Download | Large language models (LLMs) and foundation models have been recently touted as a game-changer for 6G systems. However, recent efforts on LLMs for wireless networks are limited to a direct application... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Realtime Facial Expression Recognition: Neuromorphic Hardware vs. Edge AI Accelerators | Heath Smith, James Seekings, Mohammadreza Mohammadi, Ramtin Zand | 2024-01-30 | Download | The paper focuses on real-time facial expression recognition (FER) systems as an important component in various real-world applications such as social robotics. |