2024-03-01

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing | Guseul Heo, Sangyeop Lee, Jaehong Cho, Hyunmin Choi, Sanghyeon Lee, Hyungkyu Ham, Gwangsun Kim, Divya Mahajan, Jongse Park | 2024-03-01 | Download | Modern transformer-based Large Language Models (LLMs) are constructed with a series of decoder blocks. Each block comprises three key components: (1) QKV generation, (2) multi-head attention, and (3) ... |
| Attacking Delay-based PUFs with Minimal Adversary Model | Hongming Fei, Owen Millwood, Prosanta Gope, Jack Miskelly, Biplab Sikdar | 2024-03-01 | Download | Physically Unclonable Functions (PUFs) provide a streamlined solution for lightweight device authentication. Delay-based Arbiter PUFs, with their ease of implementation and vast challenge space, have ... |
| SFQ counter-based precomputation for large-scale cryogenic VQE machines | Yosuke Ueno, Satoshi Imamura, Yuna Tomida, Teruo Tanimoto, Masamitsu Tanaka, Yutaka Tabuchi, Koji Inoue, Hiroshi Nakamura | 2024-03-01 | Download | The variational quantum eigensolver (VQE) is a promising candidate that brings practical benefits from quantum computing. However, the required bandwidth in/out of a cryostat is a limiting factor to s... |
| FTTN: Feature-Targeted Testing for Numerical Properties of NVIDIA & AMD Matrix Accelerators | Xinyi Li, Ang Li, Bo Fang, Katarzyna Swirydowicz, Ignacio Laguna, Ganesh Gopalakrishnan | 2024-03-01 | Download | NVIDIA Tensor Cores and AMD Matrix Cores (together called Matrix Accelerators) are of growing interest in high-performance computing and machine learning owing to their high performance. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A Sufficient Epistemic Condition for Solving Stabilizing Agreement | Giorgio Cignarale, Stephan Felber, Hugo Rincon Galeana | 2024-03-01 | Download | In this paper we provide a first-ever epistemic formulation of stabilizing agreement, defined as the non-terminating variant of the well established consensus problem. |
| A Spark Optimizer for Adaptive, Fine-Grained Parameter Tuning | Chenghao Lyu, Qi Fan, Philippe Guyard, Yanlei Diao | 2024-03-01 | Download | As Spark becomes a common big data analytics platform, its growing complexity makes automatic tuning of numerous parameters critical for performance. |
| Neural Acceleration of Incomplete Cholesky Preconditioners | Joshua Dennis Booth, Hongyang Sun, Trevor Garnett | 2024-03-01 | Download | The solution of a sparse system of linear equations is ubiquitous in scientific applications. Iterative methods, such as the Preconditioned Conjugate Gradient method (PCG), are normally chosen over di... |
| Containerization in Multi-Cloud Environment: Roles, Strategies, Challenges, and Solutions for Effective Implementation | Muhammad Waseem, Aakash Ahmad, Peng Liang, Muhammad Azeem Akbar, Arif Ali Khan, Iftikhar Ahmad, Manu Setälä, Tommi Mikkonen | 2024-03-01 | Download | Containerization in multi-cloud environments has received significant attention in recent years both from academic research and industrial development perspectives. |
| Are Unikernels Ready for Serverless on the Edge? | Felix Moebius, Tobias Pfandzelter, David Bermbach | 2024-03-01 | Download | Function-as-a-Service (FaaS) is a promising edge computing execution model but requires secure sandboxing mechanisms to isolate workloads from multiple tenants on constrained infrastructure. |
| Jiagu: Optimizing Serverless Computing Resource Utilization with Harmonized Efficiency and Practicability | Qingyuan Liu, Yanning Yang, Dong Du, Yubin Xia, Ping Zhang, Jia Feng, James Larus, Haibo Chen | 2024-03-01 | Download | Current serverless platforms struggle to optimize resource utilization due to their dynamic and fine-grained nature. Conventional techniques like overcommitment and autoscaling fall short, often sacri... |
| Training Computer Scientists for the Challenges of Hybrid Quantum-Classical Computing | Vincenzo De Maio, Meerzhan Kanatbekova, Felix Zilk, Nicolai Friis, Tobias Guggemos, Ivona Brandic | 2024-03-01 | Download | As we enter the post-Moore era, we experience the rise of various non-von-Neumann-architectures to address the increasing computational demand for modern applications, with quantum computing being amo... |
| FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission | Zeling Zhang, Dongqi Cai, Yiran Zhang, Mengwei Xu, Shangguang Wang, Ao Zhou | 2024-03-01 | Download | Communication overhead is a significant bottleneck in federated learning (FL), which has been exaggerated with the increasing size of AI models. |
| Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large-Scale Recommendation | Liang Luo, Buyun Zhang, Michael Tsang, Yinbin Ma, Ching-Hsiang Chu, Yuxin Chen, Shen Li, Yuchen Hao, Yanli Zhao, Guna Lakshminarayanan, Ellie Dingqiao Wen, Jongsoo Park, Dheevatsa Mudigere, Maxim Naumov | 2024-03-01 | Download | We study a mismatch between the deep learning recommendation models' flat architecture, common distributed training paradigm and hierarchical data center topology. |
| WindGP: Efficient Graph Partitioning on Heterogenous Machines | Li Zeng, Haohan Huang, Binfan Zheng, Kang Yang, Shengcheng Shao, Jinhua Zhou, Jun Xie, Rongqian Zhao, Xin Chen | 2024-03-01 | Download | Graph Partitioning is widely used in many real-world applications such as fraud detection and social network analysis, in order to enable the distributed graph computing on large graphs. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Design and Performance Evaluation of SEANet, a Software-defined Networking Platform for the Internet of Underwater Things | Deniz Unal, Sara Falleni, Kerem Enhos, Emrecan Demirors, Stefano Basagni, Tommaso Melodia | 2024-03-01 | Download | This paper presents the design and performance evaluation of the SEANet platform, a software-defined acoustic modem designed for enhancing underwater networking and Internet of Underwater Things (IoUT... |
| Toward Autonomous Cooperation in Heterogeneous Nanosatellite Constellations Using Dynamic Graph Neural Networks | Guillem Casadesus-Vila, Joan-Adria Ruiz-de-Azua, Eduard Alarcon | 2024-03-01 | Download | The upcoming landscape of Earth Observation missions will be defined by networked heterogeneous nanosatellite constellations required to meet strict mission requirements, such as revisit times and spatia... |
| Exploring Upper-6GHz and mmWave in Real-World 5G Networks: A Direct on-Field Comparison | Marcello Morini, Eugenio Moro, Ilario Filippini, Antonio Capone, Danilo De Donno | 2024-03-01 | Download | The spectrum crunch challenge poses a vital threat to the progress of cellular networks and recently prompted the inclusion of millimeter wave (mmWave) and Upper 6GHz (U6G) in the 3GPP standards. |
| Hercules: Heterogeneous Requirements Congestion Control Protocol | Neta Rozen-Schiff, Itzcak Pechtalt, Amit Navon, Leon Bruckman | 2024-03-01 | Download | Future network services present a significant challenge for network providers due to high number and high variety of co-existing requirements. |
| Comparative Study of Simulators for Vehicular Networks | Rida Saghir, Thenuka Karunathilake, Anna Förster | 2024-03-01 | Download | Vehicular Adhoc networks (VANETs) are composed of vehicles connected with wireless links to exchange data. VANETs have become the backbone of the Intelligent Transportation Systems (ITS) in smart citi... |
| FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission | Zeling Zhang, Dongqi Cai, Yiran Zhang, Mengwei Xu, Shangguang Wang, Ao Zhou | 2024-03-01 | Download | Communication overhead is a significant bottleneck in federated learning (FL), which has been exaggerated with the increasing size of AI models. |
| Yodel: A Layer 3.5 Name-Based Multicast Network Architecture For The Future Internet | Morteza Moghaddassian, Alberto Leon-Garcia | 2024-03-01 | Download | Multicasting refers to the ability of transmitting data to multiple recipients without data sources needing to provide more than one copy of the data to the network. |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| An Experimental Study of Low-Latency Video Streaming over 5G | Imran Khan, Tuyen X. Tran, Matti Hiltunen, Theodore Karagioules, Dimitrios Koutsonikolas | 2024-03-01 | Download | Low-latency video streaming over 5G has become rapidly popular over the last few years due to its increased usage in hosting virtual events, online education, webinars, and all-hands meetings. |