2024-03-21
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Beehive: A Flexible Network Stack for Direct-Attached Accelerators | Katie Lim, Matthew Giordano, Theano Stavrinos, Irene Zhang, Jacob Nelson, Baris Kasikci, Tom Anderson | 2024-03-21 | Download | Direct-attached accelerators, where application accelerators are directly connected to the datacenter network via a hardware network stack, offer substantial benefits in terms of reduced latency, CPU ... |
| DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics | Yoonsung Kim, Changhun Oh, Jinwoo Hwang, Wonung Kim, Seongryong Oh, Yubin Lee, Hardik Sharma, Amir Yazdanbakhsh, Jongse Park | 2024-03-21 | Download | Deep neural network (DNN) video analytics is crucial for autonomous systems such as self-driving vehicles, unmanned aerial vehicles (UAVs), and security robots. |
| E-Syn: E-Graph Rewriting with Technology-Aware Cost Functions for Logic Synthesis | Chen Chen, Guangyu Hu, Dongsheng Zuo, Cunxi Yu, Yuzhe Ma, Hongce Zhang | 2024-03-21 | Download | Logic synthesis plays a crucial role in the digital design flow. It has a decisive influence on the final Quality of Results (QoR) of the circuit implementations. |
| AI and Memory Wall | Amir Gholami, Zhewei Yao, Sehoon Kim, Coleman Hooper, Michael W. Mahoney, Kurt Keutzer | 2024-03-21 | Download | The availability of unprecedented unsupervised training data, along with neural scaling laws, has resulted in an unprecedented surge in model size and compute requirements for serving/training LLMs. |
| Accelerating ViT Inference on FPGA through Static and Dynamic Pruning | Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna | 2024-03-21 | Download | Vision Transformers (ViTs) have achieved state-of-the-art accuracy on various computer vision tasks. However, their high computational complexity prevents them from being applied to many real-world ap... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Blockchain e Sistemas Distribuídos: conceitos básicos e implicações | M. Witter, A. Rodrigo De Vit | 2024-03-21 | Download | Blockchain technology has emerged as a necessity for the decentralization of payment methods and transactions, but it has brought with it many properties of distributed systems that have made it a cru... |
| iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations | Md Saidul Hoque Anik, Pranav Badhe, Rohit Gampa, Ariful Azad | 2024-03-21 | Download | Core computations in Graph Neural Network (GNN) training and inference are often mapped to sparse matrix operations such as sparse-dense matrix multiplication (SpMM). |
| CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services | Abel Souza, Shruti Jasoria, Basundhara Chakrabarty, Alexander Bridgwater, Axel Lundberg, Filip Skogh, Ahmed Ali-Eldin, David Irwin, Prashant Shenoy | 2024-03-21 | Download | There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web-services exhibit various spatiotemporal and p... |
| History-Independent Concurrent Objects | Hagit Attiya, Michael A. Bender, Martin Farach-Colton, Rotem Oshman, Noa Schiller | 2024-03-21 | Download | A data structure is called history independent if its internal memory representation does not reveal the history of operations applied to it, only its current state. |
| FedMef: Towards Memory-efficient Federated Dynamic Pruning | Hong Huang, Weiming Zhuang, Chen Chen, Lingjuan Lyu | 2024-03-21 | Download | Federated learning (FL) promotes decentralized training while prioritizing data confidentiality. However, its application on resource-constrained devices is challenging due to the high demand for comp... |
| Loop Improvement: An Efficient Approach for Extracting Shared Features from Heterogeneous Data without Central Server | Fei Li, Chu Kiong Loo, Wei Shiung Liew, Xiaofeng Liu | 2024-03-21 | Download | In federated learning, data heterogeneity significantly impacts performance. A typical solution involves segregating these parameters into shared and personalized components, a concept also relevant i... |
| Accelerating Time-to-Science by Streaming Detector Data Directly into Perlmutter Compute Nodes | Samuel S. Welborn, Bjoern Enders, Chris Harris, Peter Ercius, Deborah J. Bard | 2024-03-21 | Download | Recent advancements in detector technology have significantly increased the size and complexity of experimental data, and high-performance computing (HPC) provides a path towards more efficient and ti... |
| Adversary-Augmented Simulation to evaluate fairness on HyperLedger Fabric | Erwan Mahe, Rouwaida Abdallah, Sara Tucci-Piergiovanni, Pierre-Yves Piriou | 2024-03-21 | Download | This paper presents a novel adversary model specifically tailored to distributed systems, aiming to assess the security of blockchain networks. |
| Towards Single Slot Finality: Evaluating Consensus Mechanisms and Methods for Faster Ethereum Finality | Lincoln Murr | 2024-03-21 | Download | Ethereum's current Gasper consensus mechanism, which combines the Latest Message Driven Greediest Heaviest Observed SubTree (LMD-GHOST) fork choice rule with the probabilistic Casper the Friendly Fina... |
| AI and Memory Wall | Amir Gholami, Zhewei Yao, Sehoon Kim, Coleman Hooper, Michael W. Mahoney, Kurt Keutzer | 2024-03-21 | Download | The availability of unprecedented unsupervised training data, along with neural scaling laws, has resulted in an unprecedented surge in model size and compute requirements for serving/training LLMs. |
| On the Power of Quantum Distributed Proofs | Atsuya Hasegawa, Srijita Kundu, Harumichi Nishimura | 2024-03-21 | Download | Quantum nondeterministic distributed computing was recently introduced as dQMA (distributed quantum Merlin-Arthur) protocols by Fraigniaud, Le Gall, Nishimura and Paz (ITCS 2021). |
| Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances | Jiangfei Duan, Ziang Song, Xupeng Miao, Xiaoli Xi, Dahua Lin, Harry Xu, Minjia Zhang, Zhihao Jia | 2024-03-21 | Download | Deep neural networks (DNNs) are becoming progressively large and costly to train. This paper aims to reduce DNN training costs by leveraging preemptible instances on modern clouds, which can be alloca... |
| Accelerating ViT Inference on FPGA through Static and Dynamic Pruning | Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna | 2024-03-21 | Download | Vision Transformers (ViTs) have achieved state-of-the-art accuracy on various computer vision tasks. However, their high computational complexity prevents them from being applied to many real-world ap... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services | Abel Souza, Shruti Jasoria, Basundhara Chakrabarty, Alexander Bridgwater, Axel Lundberg, Filip Skogh, Ahmed Ali-Eldin, David Irwin, Prashant Shenoy | 2024-03-21 | Download | There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web-services exhibit various spatiotemporal and p... |
| Optimizing queues with deadlines under infrequent monitoring | Faraz Farahvash, Ao Tang | 2024-03-21 | Download | In this paper, we aim to improve the percentage of packets meeting their deadline in discrete-time M/M/1 queues with infrequent monitoring. More specifically, we look into policies that only monitor t... |
| A Mathematical Introduction to Deep Reinforcement Learning for 5G/6G Applications | Farhad Rezazadeh | 2024-03-21 | Download | Algorithmic innovation can unleash the potential of the beyond 5G (B5G)/6G communication systems. Artificial intelligence (AI)-driven zero-touch network slicing is envisaged as a promising cutting-edg... |
| Accelerating Time-to-Science by Streaming Detector Data Directly into Perlmutter Compute Nodes | Samuel S. Welborn, Bjoern Enders, Chris Harris, Peter Ercius, Deborah J. Bard | 2024-03-21 | Download | Recent advancements in detector technology have significantly increased the size and complexity of experimental data, and high-performance computing (HPC) provides a path towards more efficient and ti... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations | Md Saidul Hoque Anik, Pranav Badhe, Rohit Gampa, Ariful Azad | 2024-03-21 | Download | Core computations in Graph Neural Network (GNN) training and inference are often mapped to sparse matrix operations such as sparse-dense matrix multiplication (SpMM). |
| CASPER: Carbon-Aware Scheduling and Provisioning for Distributed Web Services | Abel Souza, Shruti Jasoria, Basundhara Chakrabarty, Alexander Bridgwater, Axel Lundberg, Filip Skogh, Ahmed Ali-Eldin, David Irwin, Prashant Shenoy | 2024-03-21 | Download | There has been a significant societal push towards sustainable practices, including in computing. Modern interactive workloads such as geo-distributed web-services exhibit various spatiotemporal and p... |