2024-12-18
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Optimizing ML Concurrent Computation and Communication with GPU DMA Engines | Anirudha Agrawal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam | 2024-12-18 | Download | Concurrent computation and communication (C3) is a pervasive paradigm in ML and other domains, making its performance optimization crucial. In this paper, we carefully characterize C3 in ML on GPUs, w... |
| Temperature-Resilient Analog Neuromorphic Chip in Single-Polysilicon CMOS Technology | Tommaso Rizzo, Sebastiano Strangio, Alessandro Catania, Giuseppe Iannaccone | 2024-12-18 | Download | In analog neuromorphic chips, designers can embed computing primitives in the intrinsic physical properties of devices and circuits, heavily reducing device count and energy consumption, and enabling ... |
| USEFUSE: Uniform Stride for Enhanced Performance in Fused Layer Architecture of Deep Neural Networks | Muhammad Sohail Ibrahim, Muhammad Usman, Jeong-A Lee | 2024-12-18 | Download | Convolutional Neural Networks (CNNs) are crucial in various applications, but their deployment on resource-constrained edge devices poses challenges. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Scaling Deep Learning Training with MPMD Pipeline Parallelism | Anxhelo Xhebraj, Sean Lee, Hanfeng Chen, Vinod Grover | 2024-12-18 | Download | We present JaxPP, a system for efficiently scaling the training of large deep learning models with flexible pipeline parallelism. We introduce a seamless programming model that allows implementing use... |
| Optimizing ML Concurrent Computation and Communication with GPU DMA Engines | Anirudha Agrawal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam | 2024-12-18 | Download | Concurrent computation and communication (C3) is a pervasive paradigm in ML and other domains, making its performance optimization crucial. In this paper, we carefully characterize C3 in ML on GPUs, w... |
| Split Learning in Computer Vision for Semantic Segmentation Delay Minimization | Nikos G. Evgenidis, Nikos A. Mitsiou, Sotiris A. Tegos, Panagiotis D. Diamantoulakis, George K. Karagiannidis | 2024-12-18 | Download | In this paper, we propose a novel approach to minimize the inference delay in semantic segmentation using split learning (SL), tailored to the needs of real-time computer vision (CV) applications for ... |
| Exploring User Acceptance of Blockchain-Based Student Certificate Sharing System: A Study on Non Fungible Token (NFT) Utilization | Prakhyat Khati, Ajay Kumar Shrestha, Julita Vassileva | 2024-12-18 | Download | Blockchain technology has emerged as a transformative tool for data management in a variety of industries, including fintech, research and healthcare. |
| Fast Leaderless Byzantine Total Order Broadcast | Matteo Monti, Martina Camaioni, Pierre-Louis Roman | 2024-12-18 | Download | This paper presents the Byzantine fault-tolerant agreement protocols Flutter and Blink. Both algorithms are deterministic, leaderless and signature-free; both assume partial synchrony and at least $(5... |
| A Survey on Inference Optimization Techniques for Mixture of Experts Models | Jiacheng Liu, Peng Tang, Wenfeng Wang, Yuhang Ren, Xiaofeng Hou, Pheng-Ann Heng, Minyi Guo, Chao Li | 2024-12-18 | Download | The emergence of large-scale Mixture of Experts (MoE) models represents a significant advancement in artificial intelligence, offering enhanced model capacity and computational efficiency through cond... |
| Unleashing the Power of Continual Learning on Non-Centralized Devices: A Survey | Yichen Li, Haozhao Wang, Wenchao Xu, Tianzhe Xiao, Hong Liu, Minzhu Tu, Yuying Wang, Xin Yang, Rui Zhang, Shui Yu, Song Guo, Ruixuan Li | 2024-12-18 | Download | Non-Centralized Continual Learning (NCCL) has become an emerging paradigm for enabling distributed devices such as vehicles and servers to handle streaming data from a joint non-stationary environment... |
| Rehearsal-Free Continual Federated Learning with Synergistic Synaptic Intelligence | Yichen Li, Yuying Wang, Haozhao Wang, Yining Qi, Tianzhe Xiao, Ruixuan Li | 2024-12-18 | Download | Continual Federated Learning (CFL) allows distributed devices to collaboratively learn novel concepts from continuously shifting training data while avoiding knowledge forgetting of previously seen ta... |
| SemiDFL: A Semi-Supervised Paradigm for Decentralized Federated Learning | Xinyang Liu, Pengchao Han, Xuan Li, Bo Liu | 2024-12-18 | Download | Decentralized federated learning (DFL) realizes cooperative model training among connected clients without relying on a central server, thereby mitigating communication bottlenecks and eliminating the... |
| Communication-Efficient Personalized Federal Graph Learning via Low-Rank Decomposition | Ruyue Liu, Rong Yin, Xiangzhen Bo, Xiaoshuai Hao, Xingrui Zhou, Yong Liu, Can Ma, Weiping Wang | 2024-12-18 | Download | Federated graph learning (FGL) has gained significant attention for enabling heterogeneous clients to process their private graph data locally while interacting with a centralized server, thus maintai... |
| Deploying Foundation Model Powered Agent Services: A Survey | Wenchao Xu, Jinyu Chen, Peirong Zheng, Xiaoquan Yi, Tianyi Tian, Wenhui Zhu, Quan Wan, Haozhao Wang, Yunfeng Fan, Qinliang Su, Xuemin Shen | 2024-12-18 | Download | Foundation model (FM) powered agent services are regarded as a promising solution to develop intelligent and personalized applications for advancing toward Artificial General Intelligence (AGI). |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| High-Accuracy and Efficient DV-Hop Localization for IoT Using Hop Loss | Zhengdi Shen, Qiran Wang | 2024-12-18 | Download | Accurate localization is critical for Internet of Things (IoT) applications. Using hop loss in DV-Hop-based algorithms is a promising approach. |
| Learning and Reconstructing Conflicts in O-RAN: A Graph Neural Network Approach | Arshia Zolghadr, Joao F. Santos, Luiz A. DaSilva, Jacek Kibiłda | 2024-12-18 | Download | The Open Radio Access Network (O-RAN) architecture enables the deployment of third-party applications on the RAN Intelligent Controllers (RICs). |
| Towards Applying Deep Learning to The Internet of Things: A Model and A Framework | Samaa Elnagar, Kweku-Muata Osei-Bryson | 2024-12-18 | Download | Deep Learning (DL) modeling has been a recent topic of interest. With the accelerating need to embed Deep Learning Networks (DLNs) to the Internet of Things (IoT) applications, many DL optimization te... |
| CoRa: A Collision-Resistant LoRa Symbol Detector of Low Complexity | José Álamos, Thomas C. Schmidt, Matthias Wählisch | 2024-12-18 | Download | Long range communication with LoRa has become popular as it avoids the complexity of multi-hop communication at low cost and low energy consumption. |
| Resilience of Networks to Spreading Computer Viruses: Optimal Anti-Virus Deployment (Extended Version) | Jhonatan Tavori, Hanoch Levy | 2024-12-18 | Download | Deployment of anti-virus software is a common strategy for preventing and controlling the propagation of computer viruses and worms over a computer network. |
| Heterogeneous Multi-Agent Reinforcement Learning for Distributed Channel Access in WLANs | Jiaming Yu, Le Liang, Chongtao Guo, Ziyang Guo, Shi Jin, Geoffrey Ye Li | 2024-12-18 | Download | This paper investigates the use of multi-agent reinforcement learning (MARL) to address distributed channel access in wireless local area networks. |
| Transmit What You Need: Task-Adaptive Semantic Communications for Visual Information | Jeonghun Park, Sung Whan Yoon | 2024-12-18 | Download | Recently, semantic communications have drawn great attention as the groundbreaking concept surpasses the limited capacity of Shannon's theory. |
| Magnifier: Detecting Network Access via Lightweight Traffic-based Fingerprints | Wenhao Li, Qiang Wang, Huaifeng Bao, Xiao-Yu Zhang, Lingyun Ying, Zhaoxuan Li | 2024-12-18 | Download | Network access detection plays a crucial role in global network management, enabling efficient network monitoring and topology measurement by identifying unauthorized network access and gathering deta... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| On the Compression of Language Models for Code: An Empirical Study on CodeBERT | Giordano d'Aloisio, Luca Traini, Federica Sarro, Antinisca Di Marco | 2024-12-18 | Download | Language models have proven successful across a wide range of software engineering tasks, but their significant computational costs often hinder their practical adoption. |
| USEFUSE: Uniform Stride for Enhanced Performance in Fused Layer Architecture of Deep Neural Networks | Muhammad Sohail Ibrahim, Muhammad Usman, Jeong-A Lee | 2024-12-18 | Download | Convolutional Neural Networks (CNNs) are crucial in various applications, but their deployment on resource-constrained edge devices poses challenges. |