2024-01-09

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Testing Spintronics Implemented Monte Carlo Dropout-Based Bayesian Neural Networks	Soyed Tuhin Ahmed, Michael Hefenbrock, Guillaume Prenat, Lorena Anghel, Mehdi B. Tahoori	2024-01-09	Download	Bayesian Neural Networks (BayNNs) can inherently estimate predictive uncertainty, facilitating informed decision-making. Dropout-based BayNNs are increasingly implemented in spintronics-based computat...
WebGPU-SPY: Finding Fingerprints in the Sandbox through GPU Cache Attacks	Ethan Ferguson, Adam Wilson, Hoda Naghibijouybari	2024-01-09	Download	Microarchitectural attacks on CPU structures have been studied in native applications, as well as in web browsers. These attacks continue to be a substantial threat to computing systems at all scales.

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
CoNST: Code Generator for Sparse Tensor Networks	Saurabh Raje, Yufan Xu, Atanas Rountev, Edward F. Valeev, Saday Sadayappan	2024-01-09	Download	Sparse tensor networks are commonly used to represent contractions over sparse tensors. Tensor contractions are higher-order analogs of matrix multiplication.
HiRace: Accurate and Fast Source-Level Race Checking of GPU Programs	John Jacobson, Martin Burtscher, Ganesh Gopalakrishnan	2024-01-09	Download	Data races are egregious parallel programming bugs on CPUs. They are even worse on GPUs due to the hierarchical thread and memory structure, which makes it possible to write code that is correctly syn...
XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing	Torsten Hoefler, Marcin Copik, Pete Beckman, Andrew Jones, Ian Foster, Manish Parashar, Daniel Reed, Matthias Troyer, Thomas Schulthess, Dan Ernst, Jack Dongarra	2024-01-09	Download	HPC and Cloud have evolved independently, specializing their innovations into performance or productivity. Acceleration as a Service (XaaS) is a recipe to empower both fields with a shared execution p...
Adaptive Asynchronous Work-Stealing for distributed load-balancing in heterogeneous systems	João B. Fernandes, Ítalo A. S. de Assis, Idalmis M. S. Martins, Tiago Barros, Samuel Xavier-de-Souza	2024-01-09	Download	Supercomputers have revolutionized how industries and scientific fields process large amounts of data. These machines group hundreds or thousands of computing nodes working together to execute time-co...
A Survey on Efficient Federated Learning Methods for Foundation Model Training	Herbert Woisetschläger, Alexander Isenko, Shiqiang Wang, Ruben Mayer, Hans-Arno Jacobsen	2024-01-09	Download	Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients.
G-Meta: Distributed Meta Learning in GPU Clusters for Large-Scale Recommender Systems	Youshao Xiao, Shangchun Zhao, Zhenglei Zhou, Zhaoxin Huan, Lin Ju, Xiaolu Zhang, Lin Wang, Jun Zhou	2024-01-09	Download	Recently, a new paradigm, meta learning, has been widely applied to Deep Learning Recommendation Models (DLRM) and significantly improves statistical performance, especially in cold-start scenarios.
Expiring Assets in Automated Market Makers	Kenan Wood, Maurice Herlihy, Hammurabi Mendes, Jonad Pulaj	2024-01-09	Download	An automated market maker (AMM) is a state machine that manages pools of assets, allowing parties to buy and sell those assets according to a fixed mathematical formula.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
FairQ: Fair and Fast Rate Allocation in Data Centers	Ahmed M. Abdelmoniem, Brahim Bensaou	2024-01-09	Download	The peculiar congestion patterns in data centers are caused by the bursty and composite nature of traffic, the small bandwidth-delay product, and the tiny switch buffers.
T-PRIME: Transformer-based Protocol Identification for Machine-learning at the Edge	Mauro Belgiovine, Joshua Groen, Miquel Sirera, Chinenye Tassie, Ayberk Yarkın Yıldız, Sage Trudeau, Stratis Ioannidis, Kaushik Chowdhury	2024-01-09	Download	Spectrum sharing allows different protocols of the same standard (e.g., 802.11 family) or different standards (e.g., LTE and DVB) to coexist in overlapping frequency bands.
DeepSweep: Parallel and Scalable Spectrum Sensing via Convolutional Neural Networks	Clifton Paul Robinson, Daniel Uvaydov, Salvatore D'Oro, Tommaso Melodia	2024-01-09	Download	Spectrum sensing is an essential component of modern wireless networks as it offers a tool to characterize spectrum usage and better utilize it.
A Novel OMNeT++-based Simulation Tool for Vehicular Cloud Computing in ETSI MEC-compliant 5G Environments	Angelo Feraudo, Alessandro Calvio, Paolo Bellavista	2024-01-09	Download	Vehicular cloud computing is gaining popularity thanks to the rapid advancements in next generation wireless communication networks. Similarly, Edge Computing, along with its standard proposals such a...
A Novel Framework of K-repetition Grant-free Access via Diversity Slotted Aloha (DSA)	Haoran Mei, Limei Peng, Pin-Han Ho	2024-01-09	Download	This article introduces a novel framework of multi-user detection (MUD) for K-repetition grant-free non-orthogonal multiple access (K-GF-NOMA), called α iterative interference cancellation diversity...
Reconfigurable Intelligent Surface Constructing 6G Near-Field Networks	Yajun Zhao	2024-01-09	Download	Near-field propagation, particularly that enabled by reconfigurable intelligent surfaces (RIS), has emerged as a promising research topic in recent years.
eSIM Technology in IoT Architecture	Hang Yuan, Artiom Baloian, Jan Janak, Henning Schulzrinne	2024-01-09	Download	eSIM (embedded SIM) is an advanced alternative to traditional physical SIM cards initially developed by the GSM Association (GSMA) in 2013 [1][2].

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
CoNST: Code Generator for Sparse Tensor Networks	Saurabh Raje, Yufan Xu, Atanas Rountev, Edward F. Valeev, Saday Sadayappan	2024-01-09	Download	Sparse tensor networks are commonly used to represent contractions over sparse tensors. Tensor contractions are higher-order analogs of matrix multiplication.
Approximation Algorithms for Minimizing Congestion in Demand-Aware Networks	Wenkai Dai, Michael Dinitz, Klaus-Tycho Foerster, Long Luo, Stefan Schmid	2024-01-09	Download	Emerging reconfigurable optical communication technologies allow to enhance datacenter topologies with demand-aware links optimized towards traffic patterns.
Identifying Best Practice Melting Patterns in Induction Furnaces: A Data-Driven Approach Using Time Series KMeans Clustering and Multi-Criteria Decision Making	Daniel Anthony Howard, Bo Nørregaard Jørgensen, Zheng Ma	2024-01-09	Download	Improving energy efficiency in industrial production processes is crucial for competitiveness, and compliance with climate policies. This paper introduces a data-driven approach to identify optimal me...
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference	Connor Holmes, Masahiro Tanaka, Michael Wyatt, Ammar Ahmad Awan, Jeff Rasley, Samyam Rajbhandari, Reza Yazdani Aminabadi, Heyang Qin, Arash Bakhtiari, Lev Kurilenko, Yuxiong He	2024-01-09	Download	The deployment and scaling of large language models (LLMs) have become critical as they permeate various applications, demanding high-throughput and low-latency serving systems.
Online Allocation with Replenishable Budgets: Worst Case and Beyond	Jianyi Yang, Pengfei Li, Mohammad Jaminur Islam, Shaolei Ren	2024-01-09	Download	This paper studies online resource allocation with replenishable budgets, where budgets can be replenished on top of the initial budget and an agent sequentially chooses online allocation decisions wi...