2024-02-21

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| ModSRAM: Algorithm-Hardware Co-Design for Large Number Modular Multiplication in SRAM | Jonathan Ku, Junyao Zhang, Haoxuan Shan, Saichand Samudrala, Jiawen Wu, Qilin Zheng, Ziru Li, JV Rajendran, Yiran Chen | 2024-02-21 | Download | Elliptic curve cryptography (ECC) is widely used in security applications such as public key cryptography (PKC) and zero-knowledge proofs (ZKP). |
| Estimation of Energy-dissipation Lower-bounds for Neuromorphic Learning-in-memory | Zihao Chen, Faiek Ahsan, Johannes Leugering, Gert Cauwenberghs, Shantanu Chakrabartty | 2024-02-21 | Download | Neuromorphic or neurally-inspired optimizers rely on local but parallel parameter updates to solve problems that range from quadratic programming to Ising machines. |
| Identifying Unnecessary 3D Gaussians using Clustering for Fast Rendering of 3D Gaussian Splatting | Joongho Jo, Hyeongwon Kim, Jongsun Park | 2024-02-21 | Download | 3D Gaussian splatting (3D-GS) is a new rendering approach that outperforms the neural radiance field (NeRF) in terms of both speed and image quality. |
| Guac: Energy-Aware and SSA-Based Generation of Coarse-Grained Merged Accelerators from LLVM-IR | Iulian Brumar, Rodrigo Rocha, Alex Bernat, Devashree Tripathy, David Brooks, Gu-Yeon Wei | 2024-02-21 | Download | Designing accelerators for resource- and power-constrained applications is a daunting task. High-level Synthesis (HLS) addresses these constraints through resource sharing, an optimization at the HLS ... |
| Benchmarking and Dissecting the Nvidia Hopper GPU Architecture | Weile Luo, Ruibo Fan, Zeyu Li, Dayou Du, Qiang Wang, Xiaowen Chu | 2024-02-21 | Download | Graphics processing units (GPUs) are continually evolving to cater to the computational demands of contemporary general-purpose workloads, particularly those driven by artificial intelligence (AI) uti... |
| PaCKD: Pattern-Clustered Knowledge Distillation for Compressing Memory Access Prediction Models | Neelesh Gupta, Pengmiao Zhang, Rajgopal Kannan, Viktor Prasanna | 2024-02-21 | Download | Deep neural networks (DNNs) have proven to be effective models for accurate Memory Access Prediction (MAP), a critical task in mitigating memory latency through data prefetching. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Efficient Wait-Free Linearizable Implementations of Approximate Bounded Counters Using Read-Write Registers | Colette Johnen, Adnane Khattabi, Alessia Milani, Jennifer L. Welch | 2024-02-21 | Download | Relaxing the sequential specification of a shared object is a way to obtain an implementation with better performance compared to implementing the original specification. |
| Unveiling Crowdfunding Futures: Analyzing Campaign Outcomes through Distributed Models and Big Data Perspectives | Giuseppe Pipitò, Emanuele Macca | 2024-02-21 | Download | Crowdfunding has emerged as a widespread strategy for startups seeking financing, particularly through reward-based methods. However, understanding its economic impact at both micro and macro levels r... |
| Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems | Chen Wang, Kathryn Mohror, Marc Snir | 2024-02-21 | Download | The semantics of HPC storage systems are defined by the consistency models to which they abide. Storage consistency models have been less studied than their counterparts in memory systems, with the ex... |
| FedADMM-InSa: An Inexact and Self-Adaptive ADMM for Federated Learning | Yongcun Song, Ziqi Wang, Enrique Zuazua | 2024-02-21 | Download | Federated learning (FL) is a promising framework for learning from distributed data while maintaining privacy. The development of efficient FL algorithms encounters various challenges, including heter... |
| On Distributed Computation of the Minimum Triangle Edge Transversal | Keren Censor-Hillel, Majd Khoury | 2024-02-21 | Download | The distance of a graph from being triangle-free is a fundamental graph parameter, counting the number of edges that need to be removed from a graph in order for it to become triangle-free. |
| Multi-Agent Online Graph Exploration on Cycles and Tadpole Graphs | Erik van den Akker, Kevin Buchin, Klaus-Tycho Foerster | 2024-02-21 | Download | We study the problem of multi-agent online graph exploration, in which a team of k agents has to explore a given graph, starting and ending on the same node. The graph is initially unknown. |
| Preserving Near-Optimal Gradient Sparsification Cost for Scalable Distributed Deep Learning | Daegun Yoon, Sangyoon Oh | 2024-02-21 | Download | Communication overhead is a major obstacle to scaling distributed training systems. Gradient sparsification is a potential optimization approach to reduce the communication volume without significant ... |
| Adaptive Massively Parallel Coloring in Sparse Graphs | Rustam Latypov, Yannic Maus, Shreyas Pai, Jara Uitto | 2024-02-21 | Download | Classic symmetry-breaking problems on graphs have gained a lot of attention in models of modern parallel computation. The Adaptive Massively Parallel Computation (AMPC) is a model that captures the ce... |
| Strong Linearizability using Primitives with Consensus Number 2 | Hagit Attiya, Armando Castañeda, Constantin Enea | 2024-02-21 | Download | A powerful tool for designing complex concurrent programs is through composition with object implementations from lower-level primitives. Strongly-linearizable implementations allow to preserve hyper-... |
| FinGPT-HPC: Efficient Pretraining and Finetuning Large Language Models for Financial Applications with High-Performance Computing | Xiao-Yang Liu, Jie Zhang, Guoxuan Wang, Weiqing Tong, Anwar Walid | 2024-02-21 | Download | Large language models (LLMs) are computationally intensive. The computation workload and the memory footprint grow quadratically with the dimension (layer width). |
| Multitier Service Migration Framework Based on Mobility Prediction in Mobile Edge Computing | Run Yang, Hui He, Weizhe Zhang | 2024-02-21 | Download | Mobile edge computing (MEC) pushes computing resources to the edge of the network and distributes them at the edge of the mobile network. Offloading computing tasks to the edge instead of the cloud ca... |
| MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via Automating Deep Neural Network Porting for Mobile Deployment | Hongtao Huang, Xiaojun Chang, Wen Hu, Lina Yao | 2024-02-21 | Download | Recent years have seen the explosion of edge intelligence with powerful Deep Neural Networks (DNNs). One popular scheme is training DNNs on powerful cloud servers and subsequently porting them to mobi... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Using Harmonics for Low-Cost Jamming | Vasilis Ieropoulos, Eirini Anthi | 2024-02-21 | Download | The digitalisation of the modern schooling system has led to multiple schools and organisations buying similar hardware. Electronic equipment like wireless microphones, projectors, touchscreen display... |
| Multitier Service Migration Framework Based on Mobility Prediction in Mobile Edge Computing | Run Yang, Hui He, Weizhe Zhang | 2024-02-21 | Download | Mobile edge computing (MEC) pushes computing resources to the edge of the network and distributes them at the edge of the mobile network. Offloading computing tasks to the edge instead of the cloud ca... |
| Computation Offloading for Multi-server Multi-access Edge Vehicular Networks: A DDQN-based Method | Siyu Wang, Bo Yang, Zhiwen Yu, Xuelin Cao, Yan Zhang, Chau Yuen | 2024-02-21 | Download | In this paper, we investigate a multi-user offloading problem in the overlapping domain of a multi-server mobile edge computing system. We divide the original problem into two stages: the offloading d... |
| SISSA: Real-time Monitoring of Hardware Functional Safety and Cybersecurity with In-vehicle SOME/IP Ethernet Traffic | Qi Liu, Xingyu Li, Ke Sun, Yufeng Li, Yanchen Liu | 2024-02-21 | Download | Scalable service-Oriented Middleware over IP (SOME/IP) is an Ethernet communication standard protocol in the Automotive Open System Architecture (AUTOSAR), promoting ECU-to-ECU communication over the ... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Formal Definitions and Performance Comparison of Consistency Models for Parallel File Systems | Chen Wang, Kathryn Mohror, Marc Snir | 2024-02-21 | Download | The semantics of HPC storage systems are defined by the consistency models to which they abide. Storage consistency models have been less studied than their counterparts in memory systems, with the ex... |
