
2024-11-04

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation | Mufei Li, Viraj Shitole, Eli Chien, Changhai Man, Zhaodong Wang, Srinivas Sridharan, Ying Zhang, Tushar Krishna, Pan Li | 2024-11-04 | Download | Directed acyclic graphs (DAGs) serve as crucial data representations in domains such as hardware synthesis and compiler/program optimization for computing systems. |
| CXL-DMSim: A Full-System CXL Disaggregated Memory Simulator With Comprehensive Silicon Validation | Yanjing Wang, Lizhou Wu, Wentao Hong, Yang Ou, Zicong Wang, Sunfeng Gao, Jie Zhang, Sheng Ma, Dezun Dong, Xingyun Qi, Mingche Lai, Nong Xiao | 2024-11-04 | Download | Compute eXpress Link (CXL) has emerged as a key enabler of memory disaggregation for future heterogeneous computing systems to expand memory on-demand and improve resource utilization. |
| AssertLLM: Generating Hardware Verification Assertions from Design Specifications via Multi-LLMs | Zhiyuan Yan, Wenji Fang, Mengming Li, Min Li, Shang Liu, Zhiyao Xie, Hongce Zhang | 2024-11-04 | Download | Assertion-based verification (ABV) is a critical method to ensure logic designs comply with their architectural specifications. ABV requires assertions, which are generally converted from specificatio... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Taming the Beast of User-Programmed Transactions on Blockchains: A Declarative Transaction Approach | Nodirbek Korchiev, Akash Pateria, Vodelina Samatova, Sogolsadat Mansouri, Kemafor Anyanwu | 2024-11-04 | Download | Blockchains are being positioned as the "technology of trust" that can be used to mediate transactions between non-trusting parties without the need for a central authority. |
| Configurable Non-uniform All-to-all Algorithms | Ke Fan, Jens Domke, Seydou Ba, Sidharth Kumar | 2024-11-04 | Download | MPI_Alltoallv generalizes the uniform all-to-all communication (MPI_Alltoall) by enabling the exchange of data blocks of varied sizes among processes. |
| PipeLLM: Fast and Confidential Large Language Model Services with Speculative Pipelined Encryption | Yifan Tan, Cheng Tan, Zeyu Mi, Haibo Chen | 2024-11-04 | Download | Confidential computing on GPUs, like NVIDIA H100, mitigates the security risks of outsourced Large Language Models (LLMs) by implementing strong isolation and data encryption. |
| Fast and Robust Information Spreading in the Noisy PULL Model | Niccolò D'Archivio, Amos Korman, Emanuele Natale, Robin Vacus | 2024-11-04 | Download | Understanding how information can efficiently spread in distributed systems under noisy communications is a fundamental question in both biological research and artificial system design. |
| Benchmarking Accuracy in an Emulated Memory Experiment | Tim Chan | 2024-11-04 | Download | This note proposes a simpler method to extract the logical error rate from an emulated surface code memory experiment. |
| Discrete the solving model of time-variant standard Sylvester-conjugate matrix equations using Euler-forward formula | Jiakuang He, Dongqing Wu | 2024-11-04 | Download | Time-variant standard Sylvester-conjugate matrix equations are presented as early time-variant versions of the complex conjugate matrix equations. |
| LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation | Mufei Li, Viraj Shitole, Eli Chien, Changhai Man, Zhaodong Wang, Srinivas Sridharan, Ying Zhang, Tushar Krishna, Pan Li | 2024-11-04 | Download | Directed acyclic graphs (DAGs) serve as crucial data representations in domains such as hardware synthesis and compiler/program optimization for computing systems. |
| Memory-Efficient Community Detection on Large Graphs Using Weighted Sketches | Subhajit Sahu | 2024-11-04 | Download | Community detection in graphs identifies groups of nodes with denser connections within the groups than between them, and while existing studies often focus on optimizing detection performance, memory... |
| FedMoE-DA: Federated Mixture of Experts via Domain Aware Fine-grained Aggregation | Ziwei Zhan, Wenkuan Zhao, Yuanqing Li, Weijie Liu, Xiaoxi Zhang, Chee Wei Tan, Chuan Wu, Deke Guo, Xu Chen | 2024-11-04 | Download | Federated learning (FL) is a collaborative machine learning approach that enables multiple clients to train models without sharing their private data. |
| Real-time and Downtime-tolerant Fault Diagnosis for Railway Turnout Machines (RTMs) Empowered with Cloud-Edge Pipeline Parallelism | Fan Wu, Muhammad Bilal, Haolong Xiang, Heng Wang, Jinjun Yu, Xiaolong Xu | 2024-11-04 | Download | Railway Turnout Machines (RTMs) are mission-critical components of the railway transportation infrastructure, responsible for directing trains onto desired tracks. |
| Against Multifaceted Graph Heterogeneity via Asymmetric Federated Prompt Learning | Zhuoning Guo, Ruiqian Han, Hao Liu | 2024-11-04 | Download | Federated Graph Learning (FGL) aims to collaboratively and privately optimize graph models on divergent data for different tasks. A critical challenge in FGL is to enable effective yet efficient feder... |
| FaaSTube: Optimizing GPU-oriented Data Transfer for Serverless Computing | Hao Wu, Junxiao Deng, Minchen Yu, Yue Yu, Yaochen Liu, Hao Fan, Song Wu, Wei Wang | 2024-11-04 | Download | Serverless computing has gained significant traction for machine learning inference applications, which are often deployed as serverless workflows consisting of multiple CPU and GPU functions with dat... |
| FedReMa: Improving Personalized Federated Learning via Leveraging the Most Relevant Clients | Han Liang, Ziwei Zhan, Weijie Liu, Xiaoxi Zhang, Chee Wei Tan, Xu Chen | 2024-11-04 | Download | Federated Learning (FL) is a distributed machine learning paradigm that achieves a globally robust model through decentralized computation and periodic model synthesis, primarily focusing on the globa... |
| Minder: Faulty Machine Detection for Large-scale Distributed Model Training | Yangtao Deng, Xiang Shi, Zhuo Jiang, Xingjian Zhang, Lei Zhang, Zhang Zhang, Bo Li, Zuquan Song, Hang Zhu, Gaohong Liu, Fuliang Li, Shuguang Wang, Haibin Lin, Jianxi Ye, Minlan Yu | 2024-11-04 | Download | Large-scale distributed model training requires simultaneous training on up to thousands of machines. Faulty machine detection is critical when an unexpected fault occurs in a machine. |
| Context Parallelism for Scalable Million-Token Inference | Amy Yang, Jingyi Yang, Aya Ibrahim, Xinfeng Xie, Bangsheng Tang, Grigory Sizov, Jeremy Reizenstein, Jongsoo Park, Jianyu Huang | 2024-11-04 | Download | We present context parallelism for long-context large language model inference, which achieves near-linear scaling for long-context prefill latency with up to 128 H100 GPUs across 16 nodes. |
| xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism | Jiarui Fang, Jinzhe Pan, Xibo Sun, Aoyu Li, Jiannan Wang | 2024-11-04 | Download | Diffusion models are pivotal for generating high-quality images and videos. Inspired by the success of OpenAI's Sora, the backbone of diffusion models is evolving from U-Net to Transformer, known as D... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| TeleOracle: Fine-Tuned Retrieval-Augmented Generation with Long-Context Support for Network | Nouf Alabbasi, Omar Erak, Omar Alhussein, Ismail Lotfi, Sami Muhaidat, Merouane Debbah | 2024-11-04 | Download | The telecommunications industry's rapid evolution demands intelligent systems capable of managing complex networks and adapting to emerging technologies. |
| LLM-based Continuous Intrusion Detection Framework for Next-Gen Networks | Frederic Adjewa, Moez Esseghir, Leila Merghem-Boulahia | 2024-11-04 | Download | In this paper, we present an adaptive framework designed for the continuous detection, identification and classification of emerging attacks in network traffic. |
| Technical Report: Performance Comparison of Service Mesh Frameworks: the MTLS Test Case | Anat Bremler Barr, Ofek Lavi, Yaniv Naor, Sanjeev Rampal, Jhonatan Tavori | 2024-11-04 | Download | Service Mesh has become essential for modern cloud-native applications by abstracting communication between microservices and providing zero-trust security, observability, and advanced traffic control... |
| A Survey on AI-driven Energy Optimisation in Terrestrial Next Generation Radio Access Networks | Kishan Sthankiya, Nagham Saeed, Greg McSorley, Mona Jaber, Richard G. Clegg | 2024-11-04 | Download | This survey uncovers the tension between AI techniques designed for energy saving in mobile networks and the energy demands those same techniques create. |
| Optimizing AoI at Query in Multiuser Wireless Uplink Networks: A Whittle Index Approach | Jingwei Liu, He Chen | 2024-11-04 | Download | In this paper, we explore how to schedule multiple users to optimize information freshness in a pull-based wireless network, where the status updates from users are requested by randomly arriving quer... |
| Real-time and Downtime-tolerant Fault Diagnosis for Railway Turnout Machines (RTMs) Empowered with Cloud-Edge Pipeline Parallelism | Fan Wu, Muhammad Bilal, Haolong Xiang, Heng Wang, Jinjun Yu, Xiaolong Xu | 2024-11-04 | Download | Railway Turnout Machines (RTMs) are mission-critical components of the railway transportation infrastructure, responsible for directing trains onto desired tracks. |
| Adaptive Optimization of TLS Overhead for Wireless Communication in Critical Infrastructure | Jörn Bodenhausen, Laurenz Grote, Michael Rademacher, Martin Henze | 2024-11-04 | Download | With critical infrastructure increasingly relying on wireless communication, using end-to-end security such as TLS becomes imperative. However, TLS introduces significant overhead for resource-constra... |
| A new control- and management architecture for SDN-enabled quantum key distribution networks | Peter Horoschenkoff, Jasper Rödiger, Martin Wilske | 2024-11-04 | Download | This paper aims to address the challenge of designing secure and high performance Quantum Key Distribution Networks (QKDN), which are essential for encrypted communication in the era of quantum comput... |
| Fairness-Utilization Trade-off in Wireless Networks with Explainable Kolmogorov-Arnold Networks | Masoud Shokrnezhad, Hamidreza Mazandarani, Tarik Taleb | 2024-11-04 | Download | The effective distribution of user transmit powers is essential for the significant advancements that the emergence of 6G wireless networks brings. |
| Connection Performance Modeling and Analysis of a Radiosonde Network in a Typhoon | Hanyi Liu, Xianbin Cao, Peng Yang, Zehui Xiong, Tony Q. S. Quek, Dapeng Oliver Wu | 2024-11-04 | Download | This paper is concerned with the theoretical modeling and analysis of uplink connection performance of a radiosonde network deployed in a typhoon. |
| Efficient Conflict Graph Creation for Time-Sensitive Networks with Dynamically Changing Communication Demands | Heiko Geppert, Frank Dürr, Kurt Rothermel | 2024-11-04 | Download | Many applications of cyber-physical systems require real-time communication: manufacturing, automotive, etc. Recent Ethernet standards for Time Sensitive Networking (TSN) offer time-triggered scheduli... |
