2024-12-18

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Optimizing ML Concurrent Computation and Communication with GPU DMA Engines | Anirudha Agrawal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam | 2024-12-18 | Download | Concurrent computation and communication (C3) is a pervasive paradigm in ML and other domains, making its performance optimization crucial. In this paper, we carefully characterize C3 in ML on GPUs, w... |
| Temperature-Resilient Analog Neuromorphic Chip in Single-Polysilicon CMOS Technology | Tommaso Rizzo, Sebastiano Strangio, Alessandro Catania, Giuseppe Iannaccone | 2024-12-18 | Download | In analog neuromorphic chips, designers can embed computing primitives in the intrinsic physical properties of devices and circuits, heavily reducing device count and energy consumption, and enabling ... |
| USEFUSE: Uniform Stride for Enhanced Performance in Fused Layer Architecture of Deep Neural Networks | Muhammad Sohail Ibrahim, Muhammad Usman, Jeong-A Lee | 2024-12-18 | Download | Convolutional Neural Networks (CNNs) are crucial in various applications, but their deployment on resource-constrained edge devices poses challenges. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Scaling Deep Learning Training with MPMD Pipeline Parallelism | Anxhelo Xhebraj, Sean Lee, Hanfeng Chen, Vinod Grover | 2024-12-18 | Download | We present JaxPP, a system for efficiently scaling the training of large deep learning models with flexible pipeline parallelism. We introduce a seamless programming model that allows implementing use... |
| Optimizing ML Concurrent Computation and Communication with GPU DMA Engines | Anirudha Agrawal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam | 2024-12-18 | Download | Concurrent computation and communication (C3) is a pervasive paradigm in ML and other domains, making its performance optimization crucial. In this paper, we carefully characterize C3 in ML on GPUs, w... |
| Split Learning in Computer Vision for Semantic Segmentation Delay Minimization | Nikos G. Evgenidis, Nikos A. Mitsiou, Sotiris A. Tegos, Panagiotis D. Diamantoulakis, George K. Karagiannidis | 2024-12-18 | Download | In this paper, we propose a novel approach to minimize the inference delay in semantic segmentation using split learning (SL), tailored to the needs of real-time computer vision (CV) applications for ... |
| Exploring User Acceptance of Blockchain-Based Student Certificate Sharing System: A Study on Non Fungible Token (NFT) Utilization | Prakhyat Khati, Ajay Kumar Shrestha, Julita Vassileva | 2024-12-18 | Download | Blockchain technology has emerged as a transformative tool for data management in a variety of industries, including fintech, research and healthcare. |
| Fast Leaderless Byzantine Total Order Broadcast | Matteo Monti, Martina Camaioni, Pierre-Louis Roman | 2024-12-18 | Download | This paper presents the Byzantine fault-tolerant agreement protocols Flutter and Blink. Both algorithms are deterministic, leaderless and signature-free; both assume partial synchrony and at least $(5... |
| A Survey on Inference Optimization Techniques for Mixture of Experts Models | Jiacheng Liu, Peng Tang, Wenfeng Wang, Yuhang Ren, Xiaofeng Hou, Pheng-Ann Heng, Minyi Guo, Chao Li | 2024-12-18 | Download | The emergence of large-scale Mixture of Experts (MoE) models represents a significant advancement in artificial intelligence, offering enhanced model capacity and computational efficiency through cond... |
| Unleashing the Power of Continual Learning on Non-Centralized Devices: A Survey | Yichen Li, Haozhao Wang, Wenchao Xu, Tianzhe Xiao, Hong Liu, Minzhu Tu, Yuying Wang, Xin Yang, Rui Zhang, Shui Yu, Song Guo, Ruixuan Li | 2024-12-18 | Download | Non-Centralized Continual Learning (NCCL) has become an emerging paradigm for enabling distributed devices such as vehicles and servers to handle streaming data from a joint non-stationary environment... |
| Rehearsal-Free Continual Federated Learning with Synergistic Synaptic Intelligence | Yichen Li, Yuying Wang, Haozhao Wang, Yining Qi, Tianzhe Xiao, Ruixuan Li | 2024-12-18 | Download | Continual Federated Learning (CFL) allows distributed devices to collaboratively learn novel concepts from continuously shifting training data while avoiding knowledge forgetting of previously seen ta... |
| SemiDFL: A Semi-Supervised Paradigm for Decentralized Federated Learning | Xinyang Liu, Pengchao Han, Xuan Li, Bo Liu | 2024-12-18 | Download | Decentralized federated learning (DFL) realizes cooperative model training among connected clients without relying on a central server, thereby mitigating communication bottlenecks and eliminating the... |
| Communication-Efficient Personalized Federal Graph Learning via Low-Rank Decomposition | Ruyue Liu, Rong Yin, Xiangzhen Bo, Xiaoshuai Hao, Xingrui Zhou, Yong Liu, Can Ma, Weiping Wang | 2024-12-18 | Download | Federated graph learning (FGL) has gained significant attention for enabling heterogeneous clients to process their private graph data locally while interacting with a centralized server, thus maintai... |
| Deploying Foundation Model Powered Agent Services: A Survey | Wenchao Xu, Jinyu Chen, Peirong Zheng, Xiaoquan Yi, Tianyi Tian, Wenhui Zhu, Quan Wan, Haozhao Wang, Yunfeng Fan, Qinliang Su, Xuemin Shen | 2024-12-18 | Download | Foundation model (FM) powered agent services are regarded as a promising solution to develop intelligent and personalized applications for advancing toward Artificial General Intelligence (AGI). |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| High-Accuracy and Efficient DV-Hop Localization for IoT Using Hop Loss | Zhengdi Shen, Qiran Wang | 2024-12-18 | Download | Accurate localization is critical for Internet of Things (IoT) applications. Using hop loss in DV-Hop-based algorithms is a promising approach. |
| Learning and Reconstructing Conflicts in O-RAN: A Graph Neural Network Approach | Arshia Zolghadr, Joao F. Santos, Luiz A. DaSilva, Jacek Kibiłda | 2024-12-18 | Download | The Open Radio Access Network (O-RAN) architecture enables the deployment of third-party applications on the RAN Intelligent Controllers (RICs). |
| Towards Applying Deep Learning to The Internet of Things: A Model and A Framework | Samaa Elnagar, Kweku-Muata Osei-Bryson | 2024-12-18 | Download | Deep Learning (DL) modeling has been a recent topic of interest. With the accelerating need to embed Deep Learning Networks (DLNs) to the Internet of Things (IoT) applications, many DL optimization te... |
| CoRa: A Collision-Resistant LoRa Symbol Detector of Low Complexity | José Álamos, Thomas C. Schmidt, Matthias Wählisch | 2024-12-18 | Download | Long range communication with LoRa has become popular as it avoids the complexity of multi-hop communication at low cost and low energy consumption. |
| Resilience of Networks to Spreading Computer Viruses: Optimal Anti-Virus Deployment (Extended Version) | Jhonatan Tavori, Hanoch Levy | 2024-12-18 | Download | Deployment of anti-virus software is a common strategy for preventing and controlling the propagation of computer viruses and worms over a computer network. |
| Heterogeneous Multi-Agent Reinforcement Learning for Distributed Channel Access in WLANs | Jiaming Yu, Le Liang, Chongtao Guo, Ziyang Guo, Shi Jin, Geoffrey Ye Li | 2024-12-18 | Download | This paper investigates the use of multi-agent reinforcement learning (MARL) to address distributed channel access in wireless local area networks. |
| Transmit What You Need: Task-Adaptive Semantic Communications for Visual Information | Jeonghun Park, Sung Whan Yoon | 2024-12-18 | Download | Recently, semantic communications have drawn great attention as the groundbreaking concept surpasses the limited capacity of Shannon's theory. |
| Magnifier: Detecting Network Access via Lightweight Traffic-based Fingerprints | Wenhao Li, Qiang Wang, Huaifeng Bao, Xiao-Yu Zhang, Lingyun Ying, Zhaoxuan Li | 2024-12-18 | Download | Network access detection plays a crucial role in global network management, enabling efficient network monitoring and topology measurement by identifying unauthorized network access and gathering deta... |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| On the Compression of Language Models for Code: An Empirical Study on CodeBERT | Giordano d'Aloisio, Luca Traini, Federica Sarro, Antinisca Di Marco | 2024-12-18 | Download | Language models have proven successful across a wide range of software engineering tasks, but their significant computational costs often hinder their practical adoption. |
| USEFUSE: Uniform Stride for Enhanced Performance in Fused Layer Architecture of Deep Neural Networks | Muhammad Sohail Ibrahim, Muhammad Usman, Jeong-A Lee | 2024-12-18 | Download | Convolutional Neural Networks (CNNs) are crucial in various applications, but their deployment on resource-constrained edge devices poses challenges. |