
2024-03-21

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Beehive: A Flexible Network Stack for Direct-Attached Accelerators | Katie Lim, Matthew Giordano, Theano Stavrinos, Irene Zhang, Jacob Nelson, Baris Kasikci, Tom Anderson | 2024-03-21 | Download | Direct-attached accelerators, where application accelerators are directly connected to the datacenter network via a hardware network stack, offer substantial benefits in terms of reduced latency, CPU ... |
| DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics | Yoonsung Kim, Changhun Oh, Jinwoo Hwang, Wonung Kim, Seongryong Oh, Yubin Lee, Hardik Sharma, Amir Yazdanbakhsh, Jongse Park | 2024-03-21 | Download | Deep neural network (DNN) video analytics is crucial for autonomous systems such as self-driving vehicles, unmanned aerial vehicles (UAVs), and security robots. |
| E-Syn: E-Graph Rewriting with Technology-Aware Cost Functions for Logic Synthesis | Chen Chen, Guangyu Hu, Dongsheng Zuo, Cunxi Yu, Yuzhe Ma, Hongce Zhang | 2024-03-21 | Download | Logic synthesis plays a crucial role in the digital design flow. It has a decisive influence on the final Quality of Results (QoR) of the circuit implementations. |
| AI and Memory Wall | Amir Gholami, Zhewei Yao, Sehoon Kim, Coleman Hooper, Michael W. Mahoney, Kurt Keutzer | 2024-03-21 | Download | The availability of unprecedented unsupervised training data, along with neural scaling laws, has resulted in an unprecedented surge in model size and compute requirements for serving/training LLMs. |
| Accelerating ViT Inference on FPGA through Static and Dynamic Pruning | Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna | 2024-03-21 | Download | Vision Transformers (ViTs) have achieved state-of-the-art accuracy on various computer vision tasks. However, their high computational complexity prevents them from being applied to many real-world ap... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Blockchain e Sistemas Distribuídos: conceitos básicos e implicações | M. Witter, A. Rodrigo De Vit | 2024-03-21 | Download | Blockchain technology has emerged as a necessity for the decentralization of payment methods and transactions, but it has brought with it many properties of distributed systems that have made it a cru... |
| iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations | Md Saidul Hoque Anik, Pranav Badhe, Rohit Gampa, Ariful Azad | 2024-03-21 | Download | Core computations in Graph Neural Network (GNN) training and inference are often mapped to sparse matrix operations such as sparse-dense matrix multiplication (SpMM). |
| CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services | Abel Souza, Shruti Jasoria, Basundhara Chakrabarty, Alexander Bridgwater, Axel Lundberg, Filip Skogh, Ahmed Ali-Eldin, David Irwin, Prashant Shenoy | 2024-03-21 | Download | There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web-services exhibit various spatiotemporal and p... |
| History-Independent Concurrent Objects | Hagit Attiya, Michael A. Bender, Martin Farach-Colton, Rotem Oshman, Noa Schiller | 2024-03-21 | Download | A data structure is called history independent if its internal memory representation does not reveal the history of operations applied to it, only its current state. |
| FedMef: Towards Memory-efficient Federated Dynamic Pruning | Hong Huang, Weiming Zhuang, Chen Chen, Lingjuan Lyu | 2024-03-21 | Download | Federated learning (FL) promotes decentralized training while prioritizing data confidentiality. However, its application on resource-constrained devices is challenging due to the high demand for comp... |
| Loop Improvement: An Efficient Approach for Extracting Shared Features from Heterogeneous Data without Central Server | Fei Li, Chu Kiong Loo, Wei Shiung Liew, Xiaofeng Liu | 2024-03-21 | Download | In federated learning, data heterogeneity significantly impacts performance. A typical solution involves segregating these parameters into shared and personalized components, a concept also relevant i... |
| Accelerating Time-to-Science by Streaming Detector Data Directly into Perlmutter Compute Nodes | Samuel S. Welborn, Bjoern Enders, Chris Harris, Peter Ercius, Deborah J. Bard | 2024-03-21 | Download | Recent advancements in detector technology have significantly increased the size and complexity of experimental data, and high-performance computing (HPC) provides a path towards more efficient and ti... |
| Adversary-Augmented Simulation to evaluate fairness on HyperLedger Fabric | Erwan Mahe, Rouwaida Abdallah, Sara Tucci-Piergiovanni, Pierre-Yves Piriou | 2024-03-21 | Download | This paper presents a novel adversary model specifically tailored to distributed systems, aiming to assess the security of blockchain networks. |
| Towards Single Slot Finality: Evaluating Consensus Mechanisms and Methods for Faster Ethereum Finality | Lincoln Murr | 2024-03-21 | Download | Ethereum's current Gasper consensus mechanism, which combines the Latest Message Driven Greediest Heaviest Observed SubTree (LMD-GHOST) fork choice rule with the probabilistic Casper the Friendly Fina... |
| AI and Memory Wall | Amir Gholami, Zhewei Yao, Sehoon Kim, Coleman Hooper, Michael W. Mahoney, Kurt Keutzer | 2024-03-21 | Download | The availability of unprecedented unsupervised training data, along with neural scaling laws, has resulted in an unprecedented surge in model size and compute requirements for serving/training LLMs. |
| On the Power of Quantum Distributed Proofs | Atsuya Hasegawa, Srijita Kundu, Harumichi Nishimura | 2024-03-21 | Download | Quantum nondeterministic distributed computing was recently introduced as dQMA (distributed quantum Merlin-Arthur) protocols by Fraigniaud, Le Gall, Nishimura and Paz (ITCS 2021). |
| Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances | Jiangfei Duan, Ziang Song, Xupeng Miao, Xiaoli Xi, Dahua Lin, Harry Xu, Minjia Zhang, Zhihao Jia | 2024-03-21 | Download | Deep neural networks (DNNs) are becoming progressively large and costly to train. This paper aims to reduce DNN training costs by leveraging preemptible instances on modern clouds, which can be alloca... |
| Accelerating ViT Inference on FPGA through Static and Dynamic Pruning | Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna | 2024-03-21 | Download | Vision Transformers (ViTs) have achieved state-of-the-art accuracy on various computer vision tasks. However, their high computational complexity prevents them from being applied to many real-world ap... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services | Abel Souza, Shruti Jasoria, Basundhara Chakrabarty, Alexander Bridgwater, Axel Lundberg, Filip Skogh, Ahmed Ali-Eldin, David Irwin, Prashant Shenoy | 2024-03-21 | Download | There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web-services exhibit various spatiotemporal and p... |
| Optimizing queues with deadlines under infrequent monitoring | Faraz Farahvash, Ao Tang | 2024-03-21 | Download | In this paper, we aim to improve the percentage of packets meeting their deadline in discrete-time M/M/1 queues with infrequent monitoring. More specifically, we look into policies that only monitor t... |
| A Mathematical Introduction to Deep Reinforcement Learning for 5G/6G Applications | Farhad Rezazadeh | 2024-03-21 | Download | Algorithmic innovation can unleash the potential of the beyond 5G (B5G)/6G communication systems. Artificial intelligence (AI)-driven zero-touch network slicing is envisaged as a promising cutting-edg... |
| Accelerating Time-to-Science by Streaming Detector Data Directly into Perlmutter Compute Nodes | Samuel S. Welborn, Bjoern Enders, Chris Harris, Peter Ercius, Deborah J. Bard | 2024-03-21 | Download | Recent advancements in detector technology have significantly increased the size and complexity of experimental data, and high-performance computing (HPC) provides a path towards more efficient and ti... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations | Md Saidul Hoque Anik, Pranav Badhe, Rohit Gampa, Ariful Azad | 2024-03-21 | Download | Core computations in Graph Neural Network (GNN) training and inference are often mapped to sparse matrix operations such as sparse-dense matrix multiplication (SpMM). |
| CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services | Abel Souza, Shruti Jasoria, Basundhara Chakrabarty, Alexander Bridgwater, Axel Lundberg, Filip Skogh, Ahmed Ali-Eldin, David Irwin, Prashant Shenoy | 2024-03-21 | Download | There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web-services exhibit various spatiotemporal and p... |
