2024-03-27

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| SMOF: Streaming Modern CNNs on FPGAs with Smart Off-Chip Eviction | Petros Toupas, Zhewen Yu, Christos-Savvas Bouganis, Dimitrios Tzovaras | 2024-03-27 | Download | Convolutional Neural Networks (CNNs) have demonstrated their effectiveness in numerous vision tasks. However, their high processing requirements necessitate efficient hardware acceleration to meet the... |
| Four Formal Models of IEEE 1394 Link Layer | Hubert Garavel, Bas Luttik | 2024-03-27 | Download | We revisit the IEEE 1394 high-performance serial bus ("FireWire"), which became a success story in formal methods after three PhD students, by using process algebra and model checking, detected a dead... |
| Testing Resource Isolation for System-on-Chip Architectures | Philippe Ledent, Radu Mateescu, Wendelin Serwe | 2024-03-27 | Download | Ensuring resource isolation at the hardware level is a crucial step towards more security inside the Internet of Things. Even though there is still no generally accepted technique to generate appropr... |
| NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering | Zhe Zhou, Yiqi Chen, Tao Zhang, Yang Wang, Ran Shu, Shuotao Xu, Peng Cheng, Lei Qu, Yongqiang Xiong, Jie Zhang, Guangyu Sun | 2024-03-27 | Download | The Compute Express Link (CXL) interconnect makes it feasible to integrate diverse types of memory into servers via its byte-addressable SerDes links. |
| Annotating Slack Directly on Your Verilog: Fine-Grained RTL Timing Evaluation for Early Optimization | Wenji Fang, Shang Liu, Hongce Zhang, Zhiyao Xie | 2024-03-27 | Download | In digital IC design, compared with post-synthesis netlists or layouts, the early register-transfer level (RTL) stage offers greater optimization flexibility for both designers and EDA tools. |
| Optimizing Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL | Marius Meyer, Tobias Kenter, Lucian Petrica, Kenneth O'Brien, Michaela Blott, Christian Plessl | 2024-03-27 | Download | Most FPGA boards in the HPC domain are well-suited for parallel scaling because of the direct integration of versatile and high-throughput network ports. |
| Merits of Time-Domain Computing for VMM -- A Quantitative Comparison | Florian Freye, Jie Lou, Christian Lanius, Tobias Gemmeke | 2024-03-27 | Download | Vector-matrix-multiplication (VMM) accelerators have gained a lot of traction, especially due to the rise of convolutional neural networks (CNNs) and the desire to compute them on the edge. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Orchestrating Mixed-Criticality Cloud Workloads in Reconfigurable Manufacturing Systems | Marco Barletta, Marcello Cinque, Davide De Vita | 2024-03-27 | Download | The adoption of cloud computing technologies in the industry is paving the way to new manufacturing paradigms. In this paper we propose a model to optimize the orchestration of workloads with differen... |
| Resource Allocation in Large Language Model Integrated 6G Vehicular Networks | Chang Liu, Jun Zhao | 2024-03-27 | Download | In the upcoming 6G era, vehicular networks are shifting from simple Vehicle-to-Vehicle (V2V) communication to the more complex Vehicle-to-Everything (V2X) connectivity. |
| Modelling the Raft Distributed Consensus Protocol in mCRL2 | Parth Bora, Pham Duc Minh, Tim A. C. Willemse | 2024-03-27 | Download | The consensus problem is a fundamental problem in distributed systems. It involves a set of actors, or entities, that need to agree on some values or decisions. |
| Superior Parallel Big Data Clustering through Competitive Stochastic Sample Size Optimization in Big-means | Rustam Mussabayev, Ravil Mussabayev | 2024-03-27 | Download | This paper introduces a novel K-means clustering algorithm, an advancement on the conventional Big-means methodology. The proposed method efficiently integrates parallel processing, stochastic samplin... |
| JumpBackHash: Say Goodbye to the Modulo Operation to Distribute Keys Uniformly to Buckets | Otmar Ertl | 2024-03-27 | Download | Introduction. Distributed data processing and storage systems require efficient methods to distribute keys across buckets. While simple and fast, the traditional modulo-based mapping is unstable when ... |
| Improving Efficiency of Parallel Across the Method Spectral Deferred Corrections | Gayatri Čaklović, Thibaut Lunet, Sebastian Götschel, Daniel Ruprecht | 2024-03-27 | Download | Parallel-across-the method time integration can provide small scale parallelism when solving initial value problems. Spectral deferred corrections (SDC) with a diagonal sweeper, which is closely relat... |
| Enhanced OpenMP Algorithm to Compute All-Pairs Shortest Path on x86 Architectures | Sergio Calderón, Enzo Rucci, Franco Chichizola | 2024-03-27 | Download | Graphs have become a key tool when modeling and solving problems in different areas. The Floyd-Warshall (FW) algorithm computes the shortest path between all pairs of vertices in a graph and is employ... |
| Optimal Resource Efficiency with Fairness in Heterogeneous GPU Clusters | Zizhao Mo, Huanle Xu, Wing Cheong Lau | 2024-03-27 | Download | Ensuring the highest training throughput to maximize resource efficiency, while maintaining fairness among users, is critical for deep learning (DL) training in heterogeneous GPU clusters. |
| Distributed Maximum Consensus over Noisy Links | Ehsan Lari, Reza Arablouei, Naveen K. D. Venkategowda, Stefan Werner | 2024-03-27 | Download | We introduce a distributed algorithm, termed noise-robust distributed maximum consensus (RD-MC), for estimating the maximum value within a multi-agent network in the presence of noisy communication li... |
| Optimizing Communication for Latency Sensitive HPC Applications on up to 48 FPGAs Using ACCL | Marius Meyer, Tobias Kenter, Lucian Petrica, Kenneth O'Brien, Michaela Blott, Christian Plessl | 2024-03-27 | Download | Most FPGA boards in the HPC domain are well-suited for parallel scaling because of the direct integration of versatile and high-throughput network ports. |
| Privacy-Preserving Distributed Nonnegative Matrix Factorization | Ehsan Lari, Reza Arablouei, Stefan Werner | 2024-03-27 | Download | Nonnegative matrix factorization (NMF) is an effective data representation tool with numerous applications in signal processing and machine learning. |
| HotStuff-2 vs. HotStuff: The Difference and Advantage | Siyuan Zhao, Yanqi Wu, Zheng Wang | 2024-03-27 | Download | Byzantine consensus protocols are essential in blockchain technology. The widely recognized HotStuff protocol uses cryptographic measures for efficient view changes and reduced communication complexit... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Peregrine: ML-based Malicious Traffic Detection for Terabit Networks | João Romeiras Amado, Francisco Pereira, David Pissarra, Salvatore Signorello, Miguel Correia, Fernando M. V. Ramos | 2024-03-27 | Download | Malicious traffic detectors leveraging machine learning (ML), namely those incorporating deep learning techniques, exhibit impressive detection capabilities across multiple attacks. |
| Fast Decision Algorithms for Efficient Access Point Assignment in SDN-Controlled Wireless Access Networks | Pablo Fondo-Ferreiro, Saber Mhiri, Cristina López-Bravo, Francisco Javier González-Castaño, Felipe Gil-Castiñeira | 2024-03-27 | Download | Global optimization of access point (AP) assignment to user terminals requires efficient monitoring of user behavior, fast decision algorithms, efficient control signaling, and fast AP reassignment me... |
| qIoV: A Quantum-Driven Internet-of-Vehicles-Based Approach for Environmental Monitoring and Rapid Response Systems | Ankur Nahar, Koustav Kumar Mondal, Debasis Das, Rajkumar Buyya | 2024-03-27 | Download | This research addresses the critical necessity for advanced rapid response operations in managing a spectrum of environmental hazards. We propose a novel framework, qIoV that integrates quantum comput... |
| How to Cache Important Contents for Multi-modal Service in Dynamic Networks: A DRL-based Caching Scheme | Zhe Zhang, Marc St-Hilaire, Xin Wei, Haiwei Dong, Abdulmotaleb El Saddik | 2024-03-27 | Download | With the continuous evolution of networking technologies, multi-modal services that involve video, audio, and haptic contents are expected to become the dominant multimedia service in the near future. |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Decision-Epoch Matters: Unveiling its Impact on the Stability of Scheduling with Randomly Varying Connectivity | Nahuel Soprano-Loto, Urtzi Ayesta, Matthieu Jonckheere, Ina Maria Verloop | 2024-03-27 | Download | A classical queuing theory result states that in a parallel-queue single-server model, the maximum stability region does not depend on the scheduling decision epochs, and in particular is the same for... |
