2024-10-16
cs.AR - Hardware Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| RapidStream IR: Infrastructure for FPGA High-Level Physical Synthesis | Jason Lau, Yuanlong Xiao, Yutong Xie, Yuze Chi, Linghao Song, Shaojie Xiang, Michael Lo, Zhiru Zhang, Jason Cong, Licheng Guo | 2024-10-16 | Download | The increasing complexity of large-scale FPGA accelerators poses significant challenges in achieving high performance while maintaining design productivity. |
| Low-Power Encoding for PAM-3 DRAM Bus | Jonghyeon Nam, Jaeduk Han, Hokeun Kim | 2024-10-16 | Download | The 3-level pulse amplitude modulation (PAM-3) signaling is expected to be widely used in memory interfaces for its greater voltage margins compared to PAM-4. |
| Toleo: Scaling Freshness to Tera-scale Memory using CXL and PIM | Juechu Dong, Jonah Rosenblum, Satish Narayanasamy | 2024-10-16 | Download | Trusted hardware's freshness guarantee ensures that an adversary cannot replay an old value in response to a memory read request. They rely on maintaining a version number for each cache block and ens... |
| Mixed-precision finite element kernels and assembly: Rounding error analysis and hardware acceleration | M. Croci, G. N. Wells | 2024-10-16 | Download | In this paper we develop the first fine-grained rounding error analysis of finite element (FE) cell kernels and assembly. The theory includes mixed-precision implementations and accounts for hardware-... |
| An O(m+n)-Space Spatiotemporal Denoising Filter with Cache-Like Memories for Dynamic Vision Sensors | Qinghang Zhao, Jiaqi Wang, Yixi Ji, Jinjian Wu, Guangming Shi | 2024-10-16 | Download | Dynamic vision sensor (DVS) is novel neuromorphic imaging device that generates asynchronous events. Despite the high temporal resolution and high dynamic range features, DVS is faced with background ... |
| COMET: Towards Partical W4A4KV4 LLMs Serving | Lian Liu, Haimeng Ren, Long Cheng, Zhaohui Xu, Yudong Pan, Mengdi Wang, Xiaowei Li, Yinhe Han, Ying Wang | 2024-10-16 | Download | Quantization is a widely-used compression technology to reduce the overhead of serving large language models (LLMs) on terminal devices and in cloud data centers. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| RapidStream IR: Infrastructure for FPGA High-Level Physical Synthesis | Jason Lau, Yuanlong Xiao, Yutong Xie, Yuze Chi, Linghao Song, Shaojie Xiang, Michael Lo, Zhiru Zhang, Jason Cong, Licheng Guo | 2024-10-16 | Download | The increasing complexity of large-scale FPGA accelerators poses significant challenges in achieving high performance while maintaining design productivity. |
| Boosting Asynchronous Decentralized Learning with Model Fragmentation | Sayan Biswas, Anne-Marie Kermarrec, Alexis Marouani, Rafael Pires, Rishi Sharma, Martijn de Vos | 2024-10-16 | Download | Decentralized learning (DL) is an emerging technique that allows nodes on the web to collaboratively train machine learning models without sharing raw data. Dealing with stragglers, i.e. |
| Vaccinating Federated Learning for Robust Modulation Classification in Distributed Wireless Networks | Hunmin Lee, Hongju Seong, Wonbin Kim, Hyeokchan Kwon, Daehee Seo | 2024-10-16 | Download | Automatic modulation classification (AMC) serves a vital role in ensuring efficient and reliable communication services within distributed wireless networks. |
| Toward Optimal-Complexity Hash-Based Asynchronous MVBA with Optimal Resilience | Jovan Komatovic, Joachim Neu, Tim Roughgarden | 2024-10-16 | Download | Multi-valued validated Byzantine agreement (MVBA), a fundamental primitive of distributed computing, allows $n$ processes to agree on a valid $\ell$-bit value, despite $t$ faulty processes behaving ma... |
| HEnRY: A Multi-Agent System Framework for Multi-Domain Contexts | Emmanuele Lacavalla, Shuyi Yang, Riccardo Crupi, Joseph E. Gonzalez | 2024-10-16 | Download | This project, named HEnRY, aims to introduce a Multi-Agent System (MAS) into Intesa Sanpaolo. The name HEnRY summarizes the project's core principles: the Hierarchical organization of agents in a laye... |
| FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression | Zhenheng Tang, Xueze Kang, Yiming Yin, Xinglin Pan, Yuxin Wang, Xin He, Qiang Wang, Rongfei Zeng, Kaiyong Zhao, Shaohuai Shi, Amelie Chi Zhou, Bo Li, Bingsheng He, Xiaowen Chu | 2024-10-16 | Download | To alleviate hardware scarcity in training large deep neural networks (DNNs), particularly large language models (LLMs), we present FusionLLM, a decentralized training system designed and implemented ... |
| Optimization and Application of Cloud-based Deep Learning Architecture for Multi-Source Data Prediction | Yang Zhang, Fa Wang, Xin Huang, Xintao Li, Sibei Liu, Hansong Zhang | 2024-10-16 | Download | This study develops a cloud-based deep learning system for early prediction of diabetes, leveraging the distributed computing capabilities of the AWS cloud platform and deep learning technologies to a... |
| FALCON: Pinpointing and Mitigating Stragglers for Large-Scale Hybrid-Parallel Training | Tianyuan Wu, Wei Wang, Yinghao Yu, Siran Yang, Wenchao Wu, Qinkai Duan, Guodong Yang, Jiamang Wang, Lin Qu, Liping Zhang | 2024-10-16 | Download | Fail-slows, or stragglers, are common but largely unheeded problems in large-scale hybrid-parallel training that spans thousands of GPU servers and runs for weeks to months. |
| SEMSO: A Secure and Efficient Multi-Data Source Blockchain Oracle | Youquan Xian, Xueying Zeng, Chunpei Li, Peng Wang, Dongcheng Li, Peng Liu, Xianxian Li | 2024-10-16 | Download | In recent years, blockchain oracle, as the key link between blockchain and real-world data interaction, has greatly expanded the application scope of blockchain. |
| Disentangling data distribution for Federated Learning | Xinyuan Zhao, Hanlin Gu, Lixin Fan, Yuxing Han, Qiang Yang | 2024-10-16 | Download | Federated Learning (FL) facilitates collaborative training of a global model whose performance is boosted by private data owned by distributed clients, without compromising data privacy. |
| CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment | Qinfeng Li, Tianyue Luo, Xuhong Zhang, Yangfan Xie, Zhiqiang Shen, Lijun Zhang, Yier Jin, Hao Peng, Xinkui Zhao, Xianwei Zhu, Jianwei Yin | 2024-10-16 | Download | Proprietary large language models (LLMs) exhibit strong generalization capabilities across diverse tasks and are increasingly deployed on edge devices for efficiency and privacy reasons. |
| Federated Temporal Graph Clustering | Zihao Zhou, Yang Liu, Xianghong Xu, Qian Li | 2024-10-16 | Download | Temporal graph clustering is a complex task that involves discovering meaningful structures in dynamic graphs where relationships and entities change over time. |
| Towards Edge General Intelligence via Large Language Models: Opportunities and Challenges | Handi Chen, Weipeng Deng, Shuo Yang, Jinfeng Xu, Zhihan Jiang, Edith C. H. Ngai, Jiangchuan Liu, Xue Liu | 2024-10-16 | Download | Edge Intelligence (EI) has been instrumental in delivering real-time, localized services by leveraging the computational capabilities of edge networks. |
| TPFL: A Trustworthy Personalized Federated Learning Framework via Subjective Logic | Jinqian Chen, Jihua Zhu | 2024-10-16 | Download | Federated learning (FL) enables collaborative model training across distributed clients while preserving data privacy. Despite its widespread adoption, most FL approaches focusing solely on privacy pr... |
| EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference | Yulei Qian, Fengcun Li, Xiangyang Ji, Xiaoyu Zhao, Jianchao Tan, Kefeng Zhang, Xunliang Cai | 2024-10-16 | Download | The Mixture-of-Experts (MoE) model has emerged as a prominent architecture in the field of Large Language Models (LLMs), providing a better balance between model performance and computational efficien... |
| EdgeRL: Reinforcement Learning-driven Deep Learning Model Inference Optimization at Edge | Motahare Mounesan, Xiaojie Zhang, Saptarshi Debroy | 2024-10-16 | Download | Balancing mutually diverging performance metrics, such as, processing latency, outcome accuracy, and end device energy consumption is a challenging undertaking for deep learning model inference in ad-... |
| SANSee: A Physical-layer Semantic-aware Networking Framework for Distributed Wireless Sensing | Huixiang Zhu, Yong Xiao, Yingyu Li, Guangming Shi, Marwan Krunz | 2024-10-16 | Download | Contactless device-free wireless sensing has recently attracted significant interest due to its potential to support a wide range of immersive human-machine interactive applications using ubiquitously... |
| Accelerating high-order continuum kinetic plasma simulations using multiple GPUs | Andrew Ho, Genia Vogman | 2024-10-16 | Download | Kinetic plasma simulations solve the Vlasov-Poisson or Vlasov-Maxwell equations to evolve scalar-variable distribution functions in position-velocity phase space and vector-variable electromagnetic fi... |
| Proof of Team Sprint: A Collaborative Consensus Algorithm for Reducing Energy Consumption in Blockchain Systems | Naoki Yonezawa | 2024-10-16 | Download | This paper introduces Proof of Team Sprint (PoTS), a novel consensus algorithm designed to address the significant energy inefficiencies inherent in traditional Proof of Work (PoW) systems. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Latency-Aware Inter-domain Routing | Shihan Lin, Yi Zhou, Xiao Zhang, Todd Arnold, Ramesh Govindan, Xiaowei Yang | 2024-10-16 | Download | Despite efforts from cloud and content providers to lower latency to acceptable levels for current and future services (e.g., augmented reality or cloud gaming), there are still opportunities for impr... |
| ORANSlice: An Open-Source 5G Network Slicing Platform for O-RAN | Hai Cheng, Salvatore D'Oro, Rajeev Gangula, Sakthivel Velumani, Davide Villa, Leonardo Bonati, Michele Polese, Gabriel Arrobo, Christian Maciocco, Tommaso Melodia | 2024-10-16 | Download | Network slicing allows Telecom Operators (TOs) to support service provisioning with diverse Service Level Agreements (SLAs). The combination of network slicing and Open Radio Access Network (RAN) enab... |
| IRS-aided Near-field Communication: Prospects and Challenges with Codebook Approach | Ryuhei Hibi, Hiroaki Hashida, Yuichi Kawamoto, Nei Kato | 2024-10-16 | Download | Intelligent reflecting surfaces (IRSs) are gaining attention as a low-cost solution to the coverage reduction in high-frequency bands used in next-generation communications. |
| Towards Edge General Intelligence via Large Language Models: Opportunities and Challenges | Handi Chen, Weipeng Deng, Shuo Yang, Jinfeng Xu, Zhihan Jiang, Edith C. H. Ngai, Jiangchuan Liu, Xue Liu | 2024-10-16 | Download | Edge Intelligence (EI) has been instrumental in delivering real-time, localized services by leveraging the computational capabilities of edge networks. |
| LPUF-AuthNet: A Lightweight PUF-Based IoT Authentication via Tandem Neural Networks and Split Learning | Brahim Mefgouda, Raviha Khan, Omar Alhussein, Hani Saleh, Hossien B. Eldeeb, Anshul Pandey, Sami Muhaidat | 2024-10-16 | Download | By 2025, the internet of things (IoT) is projected to connect over 75 billion devices globally, fundamentally altering how we interact with our environments in both urban and rural settings. |
| SANSee: A Physical-layer Semantic-aware Networking Framework for Distributed Wireless Sensing | Huixiang Zhu, Yong Xiao, Yingyu Li, Guangming Shi, Marwan Krunz | 2024-10-16 | Download | Contactless device-free wireless sensing has recently attracted significant interest due to its potential to support a wide range of immersive human-machine interactive applications using ubiquitously... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| FALCON: Pinpointing and Mitigating Stragglers for Large-Scale Hybrid-Parallel Training | Tianyuan Wu, Wei Wang, Yinghao Yu, Siran Yang, Wenchao Wu, Qinkai Duan, Guodong Yang, Jiamang Wang, Lin Qu, Liping Zhang | 2024-10-16 | Download | Fail-slows, or stragglers, are common but largely unheeded problems in large-scale hybrid-parallel training that spans thousands of GPU servers and runs for weeks to months. |