2024-01-30

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Qplacer: Frequency-Aware Component Placement for Superconducting Quantum Computers	Junyao Zhang, Hanrui Wang, Qi Ding, Jiaqi Gu, Reouven Assouly, William D. Oliver, Song Han, Kenneth R. Brown, Hai "Helen" Li, Yiran Chen	2024-01-30	Download	Noisy Intermediate-Scale Quantum (NISQ) computers face a critical limitation in qubit numbers, hindering their progression towards large-scale and fault-tolerant quantum computing.
Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators	Mika Markus Müller, Alexander Richard Manfred Borst, Konstantin Lübeck, Alexander Louis-Ferdinand Jung, Oliver Bringmann	2024-01-30	Download	Artificial Intelligence (AI) has witnessed remarkable growth, particularly through the proliferation of Deep Neural Networks (DNNs). These powerful models drive technological advancements across vario...
SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation	Wontak Han, Hyunjun Cho, Donghyuk Kim, Joo-Young Kim	2024-01-30	Download	Text generation is a compelling sub-field of natural language processing, aiming to generate human-readable text from input words. In particular, the decoder-only generative models, such as generative...
Method for determining the acceleration of a parallel specialised computer system based on Amdahl's law	Aleksandr S. Filipchenko	2024-01-30	Download	The modification of Amdahl's law for the case of increment of processor elements in a computer system is considered. The coefficient $k$ linking accelerations of parallel and parallel specialized comp...
A Scalable RISC-V Vector Processor Enabling Efficient Multi-Precision DNN Inference	Chuanning Wang, Chao Fang, Xiao Wu, Zhongfeng Wang, Jun Lin	2024-01-30	Download	RISC-V processors encounter substantial challenges in deploying multi-precision deep neural networks (DNNs) due to their restricted precision support, constrained throughput, and suboptimal dataflow d...
WideSA: A High Array Utilization Mapping Scheme for Uniform Recurrences on the Versal ACAP Architecture	Tuo Dai, Bizhao Shi, Guojie Luo	2024-01-30	Download	The Versal Adaptive Compute Acceleration Platform (ACAP) is a new architecture that combines AI Engines (AIEs) with reconfigurable fabric. This architecture offers significant acceleration potential f...
T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives	Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Nuwan Jayasena, Matthew D. Sinclair	2024-01-30	Download	Large Language Models increasingly rely on distributed techniques for their training and inference. These techniques require communication across devices which can reduce scaling efficiency as the num...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
CLAIRE: Scalable GPU-Accelerated Algorithms for Diffeomorphic Image Registration in 3D	Andreas Mang	2024-01-30	Download	We present our work on scalable, GPU-accelerated algorithms for diffeomorphic image registration. The associated software package is termed CLAIRE. Image registration is a non-linear inverse problem.
Parallelization Strategies for the Randomized Kaczmarz Algorithm on Large-Scale Dense Systems	Inês Ferreira, Juan A. Acebrón, José Monteiro	2024-01-30	Download	The Kaczmarz algorithm is an iterative technique designed to solve consistent linear systems of equations. It falls within the category of row-action methods, focusing on handling one equation per ite...
Rendering Wireless Environments Useful for Gradient Estimators: A Zero-Order Stochastic Federated Learning Method	Elissa Mhanna, Mohamad Assaad	2024-01-30	Download	Cross-device federated learning (FL) is a growing machine learning setting whereby multiple edge devices collaborate to train a model without disclosing their raw data.
Characterising resource management performance in Kubernetes	Víctor Medel, Rafael Tolosana-Calasanz, José Ángel Bañares, Unai Arronategui, Omer F. Rana	2024-01-30	Download	A key challenge for supporting elastic behaviour in cloud systems is to achieve a good performance in automated (de-)provisioning and scheduling of computing resources.
Identifying Quality Mersenne Twister Streams For Parallel Stochastic Simulations	Benjamin Antunes, Claude Mazel, David R. C. Hill	2024-01-30	Download	The Mersenne Twister (MT) is a pseudo-random number generator (PRNG) widely used in High Performance Computing for parallel stochastic simulations.
GPU-Accelerated Batch-Dynamic Subgraph Matching	Linshan Qiu, Lu Chen, Hailiang Jie, Xiangyu Ke, Yunjun Gao, Yang Liu, Zetao Zhang	2024-01-30	Download	Subgraph matching has garnered increasing attention for its diverse real-world applications. Given the dynamic nature of real-world graphs, addressing evolving scenarios without incurring prohibitive ...
Method for determining the acceleration of a parallel specialised computer system based on Amdahl's law	Aleksandr S. Filipchenko	2024-01-30	Download	The modification of Amdahl's law for the case of increment of processor elements in a computer system is considered. The coefficient $k$ linking accelerations of parallel and parallel specialized comp...
Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations	Francieli Boito, Jim Brandt, Valeria Cardellini, Philip Carns, Florina M. Ciorba, Hilary Egan, Ahmed Eleliemy, Ann Gentile, Thomas Gruber, Jeff Hanson, Utz-Uwe Haus, Kevin Huck, Thomas Ilsche, Thomas Jakobsche, Terry Jones, Sven Karlsson, Abdullah Mueen, Michael Ott, Tapasya Patki, Ivy Peng, Krishnan Raghavan, Stephen Simms, Kathleen Shoga, Michael Showerman, Devesh Tiwari, Torsten Wilde, Keiji Yamamoto	2024-01-30	Download	Many High Performance Computing (HPC) facilities have developed and deployed frameworks in support of continuous monitoring and operational data analytics (MODA) to help improve efficiency and through...
BCM-Broadcast: A Byzantine-Tolerant Causal Broadcast Algorithm for Distributed Mobile Systems	Leila NamvariTazehkand, Saied Pashazadeh, Ali Ebnenasir	2024-01-30	Download	This paper presents an algorithm, called BCM-Broadcast, for the implementation of causal broadcast in distributed mobile systems in the presence of Byzantine failures.
Interactive Byzantine-Resilient Gradient Coding for General Data Assignments	Shreyas Jain, Luis Maßny, Christoph Hofmeister, Eitan Yaakobi, Rawad Bitar	2024-01-30	Download	We tackle the problem of Byzantine errors in distributed gradient descent within the Byzantine-resilient gradient coding framework. Our proposed solution can recover the exact full gradient in the pre...
Quantum-Secure Hybrid Blockchain System for DID-based Verifiable Random Function with NTRU Linkable Ring Signature	Bong Gon Kim, Dennis Wong, Yoon Seok Yang	2024-01-30	Download	In this study, we present a secure smart contract-based Verifiable Random Function (VRF) model, addressing the shortcomings of existing systems.
Computational Power of Opaque Robots	Caterina Feletti, Lucia Mambretti, Carlo Mereghetti, Beatrice Palano	2024-01-30	Download	In the field of distributed computing by robot swarms, the research comprehends manifold models where robots operate in the Euclidean plane through a sequence of look-compute-move cycles.
Using Sequential Runtime Distributions for the Parallel Speedup Prediction of SAT Local Search	Alejandro Arbelaez, Charlotte Truchet, Philippe Codognet	2024-01-30	Download	This paper presents a detailed analysis of the scalability and parallelization of local search algorithms for the Satisfiability problem. We propose a framework to estimate the parallel performance of...
SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget	Kun Wang, Jiani Cao, Zimu Zhou, Zhenjiang Li	2024-01-30	Download	Executing deep neural networks (DNNs) on edge artificial intelligence (AI) devices enables various autonomous mobile computing applications. However, the memory budget of edge AI devices restricts the...
EdgeOL: Efficient in-situ Online Learning on Edge Devices	Sheng Li, Geng Yuan, Yue Dai, Tianyu Wang, Yawen Wu, Alex K. Jones, Jingtong Hu, Tony Geng, Yanzhi Wang, Bo Yuan, Yufei Ding, Xulong Tang	2024-01-30	Download	Emerging applications, such as robot-assisted eldercare and object recognition, generally employ deep learning neural networks (DNNs) and naturally require: i) handling streaming-in inference requests...
Communication-Efficient Multimodal Federated Learning: Joint Modality and Client Selection	Liangqi Yuan, Dong-Jun Han, Su Wang, Devesh Upadhyay, Christopher G. Brinton	2024-01-30	Download	Multimodal federated learning (MFL) aims to enrich model training in FL settings where clients are collecting measurements across multiple modalities.
T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives	Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Nuwan Jayasena, Matthew D. Sinclair	2024-01-30	Download	Large Language Models increasingly rely on distributed techniques for their training and inference. These techniques require communication across devices which can reduce scaling efficiency as the num...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Differentiated Service Entanglement Routing for Quantum Networks	Hui Han, Bo Liu, Bangying Tang, Siyu Xiong, Jinquan Huang, Wanrong Yu, Shuhui Chen	2024-01-30	Download	The entanglement distribution networks with various topologies are mainly implemented by active wavelength multiplexing routing strategies. However, designing an entanglement routing scheme, which ach...
Socially Aware V2X Localized QoS	Rafael Kaliski, Yue-hua Han	2024-01-30	Download	Vehicle-to-everything (V2X) is a core 5G technology. V2X and its enabler, Device-to-Device (D2D), are essential for the Internet of Things (IoT) and the Internet of Vehicles (IoV).
URLLC-Aware Proactive UAV Placement in Internet of Vehicles	Chen-Feng Liu, Nirmal D. Wickramasinghe, Himal A. Suraweera, Mehdi Bennis, Merouane Debbah	2024-01-30	Download	Unmanned aerial vehicles (UAVs) are envisioned to provide diverse services from the air. The service quality may rely on the wireless performance which is affected by the UAV's position.
Quantum $X$-Secure $B$-Byzantine $T$-Colluding Private Information Retrieval	Mohamed Nomeir, Alptug Aytekin, Sennur Ulukus	2024-01-30	Download	We consider the problems arising from the presence of Byzantine servers in a quantum private information retrieval (QPIR) setting. This is the first work to precisely define what the capabilities of B...
Utilizing Large Language Models to Translate RFC Protocol Specifications to CPSA Definitions	Martin Duclos, Ivan A. Fernandez, Kaneesha Moore, Sudip Mittal, Edward Zieglar	2024-01-30	Download	This paper proposes the use of Large Language Models (LLMs) for translating Request for Comments (RFC) protocol specifications into a format compatible with the Cryptographic Protocol Shapes Analyzer ...
Evaluating ML-Based Anomaly Detection Across Datasets of Varied Integrity: A Case Study	Adrian Pekar, Richard Jozsa	2024-01-30	Download	Cybersecurity remains a critical challenge in the digital age, with network traffic flow anomaly detection being a key pivotal instrument in the fight against cyber threats.
Dynamic Human Digital Twin Deployment at the Edge for Task Execution: A Two-Timescale Accuracy-Aware Online Optimization	Yuye Yang, You Shi, Changyan Yi, Jun Cai, Jiawen Kang, Dusit Niyato, Xuemin Shen	2024-01-30	Download	Human digital twin (HDT) is an emerging paradigm that bridges physical twins (PTs) with powerful virtual twins (VTs) for assisting complex task executions in human-centric services.
Large Multi-Modal Models (LMMs) as Universal Foundation Models for AI-Native Wireless Systems	Shengzhe Xu, Christo Kurisummoottil Thomas, Omar Hashash, Nikhil Muralidhar, Walid Saad, Naren Ramakrishnan	2024-01-30	Download	Large language models (LLMs) and foundation models have been recently touted as a game-changer for 6G systems. However, recent efforts on LLMs for wireless networks are limited to a direct application...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Realtime Facial Expression Recognition: Neuromorphic Hardware vs. Edge AI Accelerators	Heath Smith, James Seekings, Mohammadreza Mohammadi, Ramtin Zand	2024-01-30	Download	The paper focuses on real-time facial expression recognition (FER) systems as an important component in various real-world applications such as social robotics.