2024-02-21

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
ModSRAM: Algorithm-Hardware Co-Design for Large Number Modular Multiplication in SRAM	Jonathan Ku, Junyao Zhang, Haoxuan Shan, Saichand Samudrala, Jiawen Wu, Qilin Zheng, Ziru Li, JV Rajendran, Yiran Chen	2024-02-21	Download	Elliptic curve cryptography (ECC) is widely used in security applications such as public key cryptography (PKC) and zero-knowledge proofs (ZKP).
Estimation of Energy-dissipation Lower-bounds for Neuromorphic Learning-in-memory	Zihao Chen, Faiek Ahsan, Johannes Leugering, Gert Cauwenberghs, Shantanu Chakrabartty	2024-02-21	Download	Neuromorphic or neurally-inspired optimizers rely on local but parallel parameter updates to solve problems that range from quadratic programming to Ising machines.
Identifying Unnecessary 3D Gaussians using Clustering for Fast Rendering of 3D Gaussian Splatting	Joongho Jo, Hyeongwon Kim, Jongsun Park	2024-02-21	Download	3D Gaussian splatting (3D-GS) is a new rendering approach that outperforms the neural radiance field (NeRF) in terms of both speed and image quality.
Guac: Energy-Aware and SSA-Based Generation of Coarse-Grained Merged Accelerators from LLVM-IR	Iulian Brumar, Rodrigo Rocha, Alex Bernat, Devashree Tripathy, David Brooks, Gu-Yeon Wei	2024-02-21	Download	Designing accelerators for resource- and power-constrained applications is a daunting task. High-level Synthesis (HLS) addresses these constraints through resource sharing, an optimization at the HLS ...
Benchmarking and Dissecting the Nvidia Hopper GPU Architecture	Weile Luo, Ruibo Fan, Zeyu Li, Dayou Du, Qiang Wang, Xiaowen Chu	2024-02-21	Download	Graphics processing units (GPUs) are continually evolving to cater to the computational demands of contemporary general-purpose workloads, particularly those driven by artificial intelligence (AI) uti...
PaCKD: Pattern-Clustered Knowledge Distillation for Compressing Memory Access Prediction Models	Neelesh Gupta, Pengmiao Zhang, Rajgopal Kannan, Viktor Prasanna	2024-02-21	Download	Deep neural networks (DNNs) have proven to be effective models for accurate Memory Access Prediction (MAP), a critical task in mitigating memory latency through data prefetching.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Efficient Wait-Free Linearizable Implementations of Approximate Bounded Counters Using Read-Write Registers	Colette Johnen, Adnane Khattabi, Alessia Milani, Jennifer L. Welch	2024-02-21	Download	Relaxing the sequential specification of a shared object is a way to obtain an implementation with better performance compared to implementing the original specification.
Unveiling Crowdfunding Futures: Analyzing Campaign Outcomes through Distributed Models and Big Data Perspectives	Giuseppe Pipitò, Emanuele Macca	2024-02-21	Download	Crowdfunding has emerged as a widespread strategy for startups seeking financing, particularly through reward-based methods. However, understanding its economic impact at both micro and macro levels r...
Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems	Chen Wang, Kathryn Mohror, Marc Snir	2024-02-21	Download	The semantics of HPC storage systems are defined by the consistency models to which they abide. Storage consistency models have been less studied than their counterparts in memory systems, with the ex...
FedADMM-InSa: An Inexact and Self-Adaptive ADMM for Federated Learning	Yongcun Song, Ziqi Wang, Enrique Zuazua	2024-02-21	Download	Federated learning (FL) is a promising framework for learning from distributed data while maintaining privacy. The development of efficient FL algorithms encounters various challenges, including heter...
On Distributed Computation of the Minimum Triangle Edge Transversal	Keren Censor-Hillel, Majd Khoury	2024-02-21	Download	The distance of a graph from being triangle-free is a fundamental graph parameter, counting the number of edges that need to be removed from a graph in order for it to become triangle-free.
Multi-Agent Online Graph Exploration on Cycles and Tadpole Graphs	Erik van den Akker, Kevin Buchin, Klaus-Tycho Foerster	2024-02-21	Download	We study the problem of multi-agent online graph exploration, in which a team of k agents has to explore a given graph, starting and ending on the same node. The graph is initially unknown.
Preserving Near-Optimal Gradient Sparsification Cost for Scalable Distributed Deep Learning	Daegun Yoon, Sangyoon Oh	2024-02-21	Download	Communication overhead is a major obstacle to scaling distributed training systems. Gradient sparsification is a potential optimization approach to reduce the communication volume without significant ...
Adaptive Massively Parallel Coloring in Sparse Graphs	Rustam Latypov, Yannic Maus, Shreyas Pai, Jara Uitto	2024-02-21	Download	Classic symmetry-breaking problems on graphs have gained a lot of attention in models of modern parallel computation. The Adaptive Massively Parallel Computation (AMPC) is a model that captures the ce...
Strong Linearizability using Primitives with Consensus Number 2	Hagit Attiya, Armando Castañeda, Constantin Enea	2024-02-21	Download	A powerful tool for designing complex concurrent programs is through composition with object implementations from lower-level primitives. Strongly-linearizable implementations allow to preserve hyper-...
FinGPT-HPC: Efficient Pretraining and Finetuning Large Language Models for Financial Applications with High-Performance Computing	Xiao-Yang Liu, Jie Zhang, Guoxuan Wang, Weiqing Tong, Anwar Walid	2024-02-21	Download	Large language models (LLMs) are computationally intensive. The computation workload and the memory footprint grow quadratically with the dimension (layer width).
Multitier Service Migration Framework Based on Mobility Prediction in Mobile Edge Computing	Run Yang, Hui He, Weizhe Zhang	2024-02-21	Download	Mobile edge computing (MEC) pushes computing resources to the edge of the network and distributes them at the edge of the mobile network. Offloading computing tasks to the edge instead of the cloud ca...
MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via Automating Deep Neural Network Porting for Mobile Deployment	Hongtao Huang, Xiaojun Chang, Wen Hu, Lina Yao	2024-02-21	Download	Recent years have seen the explosion of edge intelligence with powerful Deep Neural Networks (DNNs). One popular scheme is training DNNs on powerful cloud servers and subsequently porting them to mobi...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
Using Harmonics for Low-Cost Jamming	Vasilis Ieropoulos, Eirini Anthi	2024-02-21	Download	The digitalisation of the modern schooling system has led to multiple schools and organisations buying similar hardware. Electronic equipment like wireless microphones, projectors, touchscreen display...
Multitier Service Migration Framework Based on Mobility Prediction in Mobile Edge Computing	Run Yang, Hui He, Weizhe Zhang	2024-02-21	Download	Mobile edge computing (MEC) pushes computing resources to the edge of the network and distributes them at the edge of the mobile network. Offloading computing tasks to the edge instead of the cloud ca...
Computation Offloading for Multi-server Multi-access Edge Vehicular Networks: A DDQN-based Method	Siyu Wang, Bo Yang, Zhiwen Yu, Xuelin Cao, Yan Zhang, Chau Yuen	2024-02-21	Download	In this paper, we investigate a multi-user offloading problem in the overlapping domain of a multi-server mobile edge computing system. We divide the original problem into two stages: the offloading d...
SISSA: Real-time Monitoring of Hardware Functional Safety and Cybersecurity with In-vehicle SOME/IP Ethernet Traffic	Qi Liu, Xingyu Li, Ke Sun, Yufeng Li, Yanchen Liu	2024-02-21	Download	Scalable service-Oriented Middleware over IP (SOME/IP) is an Ethernet communication standard protocol in the Automotive Open System Architecture (AUTOSAR), promoting ECU-to-ECU communication over the ...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems	Chen Wang, Kathryn Mohror, Marc Snir	2024-02-21	Download	The semantics of HPC storage systems are defined by the consistency models to which they abide. Storage consistency models have been less studied than their counterparts in memory systems, with the ex...