2024-01-09

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Testing Spintronics Implemented Monte Carlo Dropout-Based Bayesian Neural Networks | Soyed Tuhin Ahmed, Michael Hefenbrock, Guillaume Prenat, Lorena Anghel, Mehdi B. Tahoori | 2024-01-09 | Download | Bayesian Neural Networks (BayNNs) can inherently estimate predictive uncertainty, facilitating informed decision-making. Dropout-based BayNNs are increasingly implemented in spintronics-based computat... |
| WebGPU-SPY: Finding Fingerprints in the Sandbox through GPU Cache Attacks | Ethan Ferguson, Adam Wilson, Hoda Naghibijouybari | 2024-01-09 | Download | Microarchitectural attacks on CPU structures have been studied in native applications, as well as in web browsers. These attacks continue to be a substantial threat to computing systems at all scales. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CoNST: Code Generator for Sparse Tensor Networks | Saurabh Raje, Yufan Xu, Atanas Rountev, Edward F. Valeev, Saday Sadayappan | 2024-01-09 | Download | Sparse tensor networks are commonly used to represent contractions over sparse tensors. Tensor contractions are higher-order analogs of matrix multiplication. |
| HiRace: Accurate and Fast Source-Level Race Checking of GPU Programs | John Jacobson, Martin Burtscher, Ganesh Gopalakrishnan | 2024-01-09 | Download | Data races are egregious parallel programming bugs on CPUs. They are even worse on GPUs due to the hierarchical thread and memory structure, which makes it possible to write code that is correctly syn... |
| XaaS: Acceleration as a Service to Enable Productive High-Performance Cloud Computing | Torsten Hoefler, Marcin Copik, Pete Beckman, Andrew Jones, Ian Foster, Manish Parashar, Daniel Reed, Matthias Troyer, Thomas Schulthess, Dan Ernst, Jack Dongarra | 2024-01-09 | Download | HPC and Cloud have evolved independently, specializing their innovations into performance or productivity. Acceleration as a Service (XaaS) is a recipe to empower both fields with a shared execution p... |
| Adaptive Asynchronous Work-Stealing for distributed load-balancing in heterogeneous systems | João B. Fernandes, Ítalo A. S. de Assis, Idalmis M. S. Martins, Tiago Barros, Samuel Xavier-de-Souza | 2024-01-09 | Download | Supercomputers have revolutionized how industries and scientific fields process large amounts of data. These machines group hundreds or thousands of computing nodes working together to execute time-co... |
| A Survey on Efficient Federated Learning Methods for Foundation Model Training | Herbert Woisetschläger, Alexander Isenko, Shiqiang Wang, Ruben Mayer, Hans-Arno Jacobsen | 2024-01-09 | Download | Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients. |
| G-Meta: Distributed Meta Learning in GPU Clusters for Large-Scale Recommender Systems | Youshao Xiao, Shangchun Zhao, Zhenglei Zhou, Zhaoxin Huan, Lin Ju, Xiaolu Zhang, Lin Wang, Jun Zhou | 2024-01-09 | Download | Recently, a new paradigm, meta learning, has been widely applied to Deep Learning Recommendation Models (DLRM) and significantly improves statistical performance, especially in cold-start scenarios. |
| Expiring Assets in Automated Market Makers | Kenan Wood, Maurice Herlihy, Hammurabi Mendes, Jonad Pulaj | 2024-01-09 | Download | An automated market maker (AMM) is a state machine that manages pools of assets, allowing parties to buy and sell those assets according to a fixed mathematical formula. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| FairQ: Fair and Fast Rate Allocation in Data Centers | Ahmed M. Abdelmoniem, Brahim Bensaou | 2024-01-09 | Download | The peculiar congestion patterns in data centers are caused by the bursty and composite nature of traffic, the small bandwidth-delay product, and the tiny switch buffers. |
| T-PRIME: Transformer-based Protocol Identification for Machine-learning at the Edge | Mauro Belgiovine, Joshua Groen, Miquel Sirera, Chinenye Tassie, Ayberk Yarkın Yıldız, Sage Trudeau, Stratis Ioannidis, Kaushik Chowdhury | 2024-01-09 | Download | Spectrum sharing allows different protocols of the same standard (e.g., 802.11 family) or different standards (e.g., LTE and DVB) to coexist in overlapping frequency bands. |
| DeepSweep: Parallel and Scalable Spectrum Sensing via Convolutional Neural Networks | Clifton Paul Robinson, Daniel Uvaydov, Salvatore D'Oro, Tommaso Melodia | 2024-01-09 | Download | Spectrum sensing is an essential component of modern wireless networks as it offers a tool to characterize spectrum usage and better utilize it. |
| A Novel OMNeT++-based Simulation Tool for Vehicular Cloud Computing in ETSI MEC-compliant 5G Environments | Angelo Feraudo, Alessandro Calvio, Paolo Bellavista | 2024-01-09 | Download | Vehicular cloud computing is gaining popularity thanks to the rapid advancements in next generation wireless communication networks. Similarly, Edge Computing, along with its standard proposals such a... |
| A Novel Framework of K-repetition Grant-free Access via Diversity Slotted Aloha (DSA) | Haoran Mei, Limei Peng, Pin-Han Ho | 2024-01-09 | Download | This article introduces a novel framework of multi-user detection (MUD) for K-repetition grant-free non-orthogonal multiple access (K-GF-NOMA), called α iterative interference cancellation diversity... |
| Reconfigurable Intelligent Surface Constructing 6G Near-Field Networks | Yajun Zhao | 2024-01-09 | Download | Near-field propagation, particularly that enabled by reconfigurable intelligent surfaces (RIS), has emerged as a promising research topic in recent years. |
| eSIM Technology in IoT Architecture | Hang Yuan, Artiom Baloian, Jan Janak, Henning Schulzrinne | 2024-01-09 | Download | eSIM (embedded SIM) is an advanced alternative to traditional physical SIM cards initially developed by the GSM Association (GSMA) in 2013 [1][2]. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CoNST: Code Generator for Sparse Tensor Networks | Saurabh Raje, Yufan Xu, Atanas Rountev, Edward F. Valeev, Saday Sadayappan | 2024-01-09 | Download | Sparse tensor networks are commonly used to represent contractions over sparse tensors. Tensor contractions are higher-order analogs of matrix multiplication. |
| Approximation Algorithms for Minimizing Congestion in Demand-Aware Networks | Wenkai Dai, Michael Dinitz, Klaus-Tycho Foerster, Long Luo, Stefan Schmid | 2024-01-09 | Download | Emerging reconfigurable optical communication technologies allow to enhance datacenter topologies with demand-aware links optimized towards traffic patterns. |
| Identifying Best Practice Melting Patterns in Induction Furnaces: A Data-Driven Approach Using Time Series KMeans Clustering and Multi-Criteria Decision Making | Daniel Anthony Howard, Bo Nørregaard Jørgensen, Zheng Ma | 2024-01-09 | Download | Improving energy efficiency in industrial production processes is crucial for competitiveness, and compliance with climate policies. This paper introduces a data-driven approach to identify optimal me... |
| DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference | Connor Holmes, Masahiro Tanaka, Michael Wyatt, Ammar Ahmad Awan, Jeff Rasley, Samyam Rajbhandari, Reza Yazdani Aminabadi, Heyang Qin, Arash Bakhtiari, Lev Kurilenko, Yuxiong He | 2024-01-09 | Download | The deployment and scaling of large language models (LLMs) have become critical as they permeate various applications, demanding high-throughput and low-latency serving systems. |
| Online Allocation with Replenishable Budgets: Worst Case and Beyond | Jianyi Yang, Pengfei Li, Mohammad Jaminur Islam, Shaolei Ren | 2024-01-09 | Download | This paper studies online resource allocation with replenishable budgets, where budgets can be replenished on top of the initial budget and an agent sequentially chooses online allocation decisions wi... |
