2024-03-21

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Beehive: A Flexible Network Stack for Direct-Attached Accelerators	Katie Lim, Matthew Giordano, Theano Stavrinos, Irene Zhang, Jacob Nelson, Baris Kasikci, Tom Anderson	2024-03-21	Download	Direct-attached accelerators, where application accelerators are directly connected to the datacenter network via a hardware network stack, offer substantial benefits in terms of reduced latency, CPU ...
DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics	Yoonsung Kim, Changhun Oh, Jinwoo Hwang, Wonung Kim, Seongryong Oh, Yubin Lee, Hardik Sharma, Amir Yazdanbakhsh, Jongse Park	2024-03-21	Download	Deep neural network (DNN) video analytics is crucial for autonomous systems such as self-driving vehicles, unmanned aerial vehicles (UAVs), and security robots.
E-Syn: E-Graph Rewriting with Technology-Aware Cost Functions for Logic Synthesis	Chen Chen, Guangyu Hu, Dongsheng Zuo, Cunxi Yu, Yuzhe Ma, Hongce Zhang	2024-03-21	Download	Logic synthesis plays a crucial role in the digital design flow. It has a decisive influence on the final Quality of Results (QoR) of the circuit implementations.
AI and Memory Wall	Amir Gholami, Zhewei Yao, Sehoon Kim, Coleman Hooper, Michael W. Mahoney, Kurt Keutzer	2024-03-21	Download	The availability of unprecedented unsupervised training data, along with neural scaling laws, has resulted in an unprecedented surge in model size and compute requirements for serving/training LLMs.
Accelerating ViT Inference on FPGA through Static and Dynamic Pruning	Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna	2024-03-21	Download	Vision Transformers (ViTs) have achieved state-of-the-art accuracy on various computer vision tasks. However, their high computational complexity prevents them from being applied to many real-world ap...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Blockchain e Sistemas Distribuídos: conceitos básicos e implicações	M. Witter, A. Rodrigo De Vit	2024-03-21	Download	Blockchain technology has emerged as a necessity for the decentralization of payment methods and transactions, but it has brought with it many properties of distributed systems that have made it a cru...
iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations	Md Saidul Hoque Anik, Pranav Badhe, Rohit Gampa, Ariful Azad	2024-03-21	Download	Core computations in Graph Neural Network (GNN) training and inference are often mapped to sparse matrix operations such as sparse-dense matrix multiplication (SpMM).
CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services	Abel Souza, Shruti Jasoria, Basundhara Chakrabarty, Alexander Bridgwater, Axel Lundberg, Filip Skogh, Ahmed Ali-Eldin, David Irwin, Prashant Shenoy	2024-03-21	Download	There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web-services exhibit various spatiotemporal and p...
History-Independent Concurrent Objects	Hagit Attiya, Michael A. Bender, Martin Farach-Colton, Rotem Oshman, Noa Schiller	2024-03-21	Download	A data structure is called history independent if its internal memory representation does not reveal the history of operations applied to it, only its current state.
FedMef: Towards Memory-efficient Federated Dynamic Pruning	Hong Huang, Weiming Zhuang, Chen Chen, Lingjuan Lyu	2024-03-21	Download	Federated learning (FL) promotes decentralized training while prioritizing data confidentiality. However, its application on resource-constrained devices is challenging due to the high demand for comp...
Loop Improvement: An Efficient Approach for Extracting Shared Features from Heterogeneous Data without Central Server	Fei Li, Chu Kiong Loo, Wei Shiung Liew, Xiaofeng Liu	2024-03-21	Download	In federated learning, data heterogeneity significantly impacts performance. A typical solution involves segregating these parameters into shared and personalized components, a concept also relevant i...
Accelerating Time-to-Science by Streaming Detector Data Directly into Perlmutter Compute Nodes	Samuel S. Welborn, Bjoern Enders, Chris Harris, Peter Ercius, Deborah J. Bard	2024-03-21	Download	Recent advancements in detector technology have significantly increased the size and complexity of experimental data, and high-performance computing (HPC) provides a path towards more efficient and ti...
Adversary-Augmented Simulation to evaluate fairness on HyperLedger Fabric	Erwan Mahe, Rouwaida Abdallah, Sara Tucci-Piergiovanni, Pierre-Yves Piriou	2024-03-21	Download	This paper presents a novel adversary model specifically tailored to distributed systems, aiming to assess the security of blockchain networks.
Towards Single Slot Finality: Evaluating Consensus Mechanisms and Methods for Faster Ethereum Finality	Lincoln Murr	2024-03-21	Download	Ethereum's current Gasper consensus mechanism, which combines the Latest Message Driven Greediest Heaviest Observed SubTree (LMD-GHOST) fork choice rule with the probabilistic Casper the Friendly Fina...
AI and Memory Wall	Amir Gholami, Zhewei Yao, Sehoon Kim, Coleman Hooper, Michael W. Mahoney, Kurt Keutzer	2024-03-21	Download	The availability of unprecedented unsupervised training data, along with neural scaling laws, has resulted in an unprecedented surge in model size and compute requirements for serving/training LLMs.
On the Power of Quantum Distributed Proofs	Atsuya Hasegawa, Srijita Kundu, Harumichi Nishimura	2024-03-21	Download	Quantum nondeterministic distributed computing was recently introduced as dQMA (distributed quantum Merlin-Arthur) protocols by Fraigniaud, Le Gall, Nishimura and Paz (ITCS 2021).
Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances	Jiangfei Duan, Ziang Song, Xupeng Miao, Xiaoli Xi, Dahua Lin, Harry Xu, Minjia Zhang, Zhihao Jia	2024-03-21	Download	Deep neural networks (DNNs) are becoming progressively large and costly to train. This paper aims to reduce DNN training costs by leveraging preemptible instances on modern clouds, which can be alloca...
Accelerating ViT Inference on FPGA through Static and Dynamic Pruning	Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna	2024-03-21	Download	Vision Transformers (ViTs) have achieved state-of-the-art accuracy on various computer vision tasks. However, their high computational complexity prevents them from being applied to many real-world ap...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services	Abel Souza, Shruti Jasoria, Basundhara Chakrabarty, Alexander Bridgwater, Axel Lundberg, Filip Skogh, Ahmed Ali-Eldin, David Irwin, Prashant Shenoy	2024-03-21	Download	There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web-services exhibit various spatiotemporal and p...
Optimizing queues with deadlines under infrequent monitoring	Faraz Farahvash, Ao Tang	2024-03-21	Download	In this paper, we aim to improve the percentage of packets meeting their deadline in discrete-time M/M/1 queues with infrequent monitoring. More specifically, we look into policies that only monitor t...
A Mathematical Introduction to Deep Reinforcement Learning for 5G/6G Applications	Farhad Rezazadeh	2024-03-21	Download	Algorithmic innovation can unleash the potential of the beyond 5G (B5G)/6G communication systems. Artificial intelligence (AI)-driven zero-touch network slicing is envisaged as a promising cutting-edg...
Accelerating Time-to-Science by Streaming Detector Data Directly into Perlmutter Compute Nodes	Samuel S. Welborn, Bjoern Enders, Chris Harris, Peter Ercius, Deborah J. Bard	2024-03-21	Download	Recent advancements in detector technology have significantly increased the size and complexity of experimental data, and high-performance computing (HPC) provides a path towards more efficient and ti...

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations	Md Saidul Hoque Anik, Pranav Badhe, Rohit Gampa, Ariful Azad	2024-03-21	Download	Core computations in Graph Neural Network (GNN) training and inference are often mapped to sparse matrix operations such as sparse-dense matrix multiplication (SpMM).
CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services	Abel Souza, Shruti Jasoria, Basundhara Chakrabarty, Alexander Bridgwater, Axel Lundberg, Filip Skogh, Ahmed Ali-Eldin, David Irwin, Prashant Shenoy	2024-03-21	Download	There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web-services exhibit various spatiotemporal and p...