2024-03-27

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
SMOF: Streaming Modern CNNs on FPGAs with Smart Off-Chip Eviction	Petros Toupas, Zhewen Yu, Christos-Savvas Bouganis, Dimitrios Tzovaras	2024-03-27	Download	Convolutional Neural Networks (CNNs) have demonstrated their effectiveness in numerous vision tasks. However, their high processing requirements necessitate efficient hardware acceleration to meet the...
Four Formal Models of IEEE 1394 Link Layer	Hubert Garavel, Bas Luttik	2024-03-27	Download	We revisit the IEEE 1394 high-performance serial bus ("FireWire"), which became a success story in formal methods after three PhD students, by using process algebra and model checking, detected a dead...
Testing Resource Isolation for System-on-Chip Architectures	Philippe Ledent, Radu Mateescu, Wendelin Serwe	2024-03-27	Download	Ensuring resource isolation at the hardware level is a crucial step towards more security inside the Internet of Things. Even though there is still no generally accepted technique to generate appropr...
NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering	Zhe Zhou, Yiqi Chen, Tao Zhang, Yang Wang, Ran Shu, Shuotao Xu, Peng Cheng, Lei Qu, Yongqiang Xiong, Jie Zhang, Guangyu Sun	2024-03-27	Download	The Compute Express Link (CXL) interconnect makes it feasible to integrate diverse types of memory into servers via its byte-addressable SerDes links.
Annotating Slack Directly on Your Verilog: Fine-Grained RTL Timing Evaluation for Early Optimization	Wenji Fang, Shang Liu, Hongce Zhang, Zhiyao Xie	2024-03-27	Download	In digital IC design, compared with post-synthesis netlists or layouts, the early register-transfer level (RTL) stage offers greater optimization flexibility for both designers and EDA tools.
Optimizing Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL	Marius Meyer, Tobias Kenter, Lucian Petrica, Kenneth O'Brien, Michaela Blott, Christian Plessl	2024-03-27	Download	Most FPGA boards in the HPC domain are well-suited for parallel scaling because of the direct integration of versatile and high-throughput network ports.
Merits of Time-Domain Computing for VMM -- A Quantitative Comparison	Florian Freye, Jie Lou, Christian Lanius, Tobias Gemmeke	2024-03-27	Download	Vector-matrix-multiplication (VMM) accelerators have gained a lot of traction, especially due to the rise of convolutional neural networks (CNNs) and the desire to compute them on the edge.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Orchestrating Mixed-Criticality Cloud Workloads in Reconfigurable Manufacturing Systems	Marco Barletta, Marcello Cinque, Davide De Vita	2024-03-27	Download	The adoption of cloud computing technologies in the industry is paving the way to new manufacturing paradigms. In this paper we propose a model to optimize the orchestration of workloads with differen...
Resource Allocation in Large Language Model Integrated 6G Vehicular Networks	Chang Liu, Jun Zhao	2024-03-27	Download	In the upcoming 6G era, vehicular networks are shifting from simple Vehicle-to-Vehicle (V2V) communication to the more complex Vehicle-to-Everything (V2X) connectivity.
Modelling the Raft Distributed Consensus Protocol in mCRL2	Parth Bora, Pham Duc Minh, Tim A. C. Willemse	2024-03-27	Download	The consensus problem is a fundamental problem in distributed systems. It involves a set of actors, or entities, that need to agree on some values or decisions.
Superior Parallel Big Data Clustering through Competitive Stochastic Sample Size Optimization in Big-means	Rustam Mussabayev, Ravil Mussabayev	2024-03-27	Download	This paper introduces a novel K-means clustering algorithm, an advancement on the conventional Big-means methodology. The proposed method efficiently integrates parallel processing, stochastic samplin...
JumpBackHash: Say Goodbye to the Modulo Operation to Distribute Keys Uniformly to Buckets	Otmar Ertl	2024-03-27	Download	Introduction. Distributed data processing and storage systems require efficient methods to distribute keys across buckets. While simple and fast, the traditional modulo-based mapping is unstable when ...
Improving Efficiency of Parallel Across the Method Spectral Deferred Corrections	Gayatri Čaklović, Thibaut Lunet, Sebastian Götschel, Daniel Ruprecht	2024-03-27	Download	Parallel-across-the-method time integration can provide small scale parallelism when solving initial value problems. Spectral deferred corrections (SDC) with a diagonal sweeper, which is closely relat...
Enhanced OpenMP Algorithm to Compute All-Pairs Shortest Path on x86 Architectures	Sergio Calderón, Enzo Rucci, Franco Chichizola	2024-03-27	Download	Graphs have become a key tool when modeling and solving problems in different areas. The Floyd-Warshall (FW) algorithm computes the shortest path between all pairs of vertices in a graph and is employ...
Optimal Resource Efficiency with Fairness in Heterogeneous GPU Clusters	Zizhao Mo, Huanle Xu, Wing Cheong Lau	2024-03-27	Download	Ensuring the highest training throughput to maximize resource efficiency, while maintaining fairness among users, is critical for deep learning (DL) training in heterogeneous GPU clusters.
Distributed Maximum Consensus over Noisy Links	Ehsan Lari, Reza Arablouei, Naveen K. D. Venkategowda, Stefan Werner	2024-03-27	Download	We introduce a distributed algorithm, termed noise-robust distributed maximum consensus (RD-MC), for estimating the maximum value within a multi-agent network in the presence of noisy communication li...
Optimizing Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL	Marius Meyer, Tobias Kenter, Lucian Petrica, Kenneth O'Brien, Michaela Blott, Christian Plessl	2024-03-27	Download	Most FPGA boards in the HPC domain are well-suited for parallel scaling because of the direct integration of versatile and high-throughput network ports.
Privacy-Preserving Distributed Nonnegative Matrix Factorization	Ehsan Lari, Reza Arablouei, Stefan Werner	2024-03-27	Download	Nonnegative matrix factorization (NMF) is an effective data representation tool with numerous applications in signal processing and machine learning.
HotStuff-2 vs. HotStuff: The Difference and Advantage	Siyuan Zhao, Yanqi Wu, Zheng Wang	2024-03-27	Download	Byzantine consensus protocols are essential in blockchain technology. The widely recognized HotStuff protocol uses cryptographic measures for efficient view changes and reduced communication complexit...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Peregrine: ML-based Malicious Traffic Detection for Terabit Networks	João Romeiras Amado, Francisco Pereira, David Pissarra, Salvatore Signorello, Miguel Correia, Fernando M. V. Ramos	2024-03-27	Download	Malicious traffic detectors leveraging machine learning (ML), namely those incorporating deep learning techniques, exhibit impressive detection capabilities across multiple attacks.
Fast Decision Algorithms for Efficient Access Point Assignment in SDN-Controlled Wireless Access Networks	Pablo Fondo-Ferreiro, Saber Mhiri, Cristina López-Bravo, Francisco Javier González-Castaño, Felipe Gil-Castiñeira	2024-03-27	Download	Global optimization of access point (AP) assignment to user terminals requires efficient monitoring of user behavior, fast decision algorithms, efficient control signaling, and fast AP reassignment me...
qIoV: A Quantum-Driven Internet-of-Vehicles-Based Approach for Environmental Monitoring and Rapid Response Systems	Ankur Nahar, Koustav Kumar Mondal, Debasis Das, Rajkumar Buyya	2024-03-27	Download	This research addresses the critical necessity for advanced rapid response operations in managing a spectrum of environmental hazards. We propose a novel framework, qIoV that integrates quantum comput...
How to Cache Important Contents for Multi-modal Service in Dynamic Networks: A DRL-based Caching Scheme	Zhe Zhang, Marc St-Hilaire, Xin Wei, Haiwei Dong, Abdulmotaleb El Saddik	2024-03-27	Download	With the continuous evolution of networking technologies, multi-modal services that involve video, audio, and haptic contents are expected to become the dominant multimedia service in the near future.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Decision-Epoch Matters: Unveiling its Impact on the Stability of Scheduling with Randomly Varying Connectivity	Nahuel Soprano-Loto, Urtzi Ayesta, Matthieu Jonckheere, Ina Maria Verloop	2024-03-27	Download	A classical queuing theory result states that in a parallel-queue single-server model, the maximum stability region does not depend on the scheduling decision epochs, and in particular is the same for...