
2024-06-24

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| LLM-Aided Testbench Generation and Bug Detection for Finite-State Machines | Jitendra Bhandari, Johann Knechtel, Ramesh Narayanaswamy, Siddharth Garg, Ramesh Karri | 2024-06-24 | Download | This work investigates the potential of tailoring Large Language Models (LLMs), specifically GPT3.5 and GPT4, for the domain of chip testing. A key aspect of chip design is functional testing, which r... |
| Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving | Ruoyu Qin, Zheming Li, Weiran He, Mingxing Zhang, Yongwei Wu, Weimin Zheng, Xinran Xu | 2024-06-24 | Download | Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI. It features a KVCache-centric disaggregated architecture that separates the prefill and decoding clusters. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Robust Zero Trust Architecture: Joint Blockchain based Federated learning and Anomaly Detection based Framework | Shiva Raj Pokhrel, Luxing Yang, Sutharshan Rajasegarar, Gang Li | 2024-06-24 | Download | This paper introduces a robust zero-trust architecture (ZTA) tailored for the decentralized system that empowers efficient remote work and collaboration within IoT networks. |
| GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism | Byungsoo Jeon, Mengdi Wu, Shiyi Cao, Sunghyun Kim, Sunghyun Park, Neeraj Aggarwal, Colin Unger, Daiyaan Arfeen, Peiyuan Liao, Xupeng Miao, Mohammad Alizadeh, Gregory R. Ganger, Tianqi Chen, Zhihao Jia | 2024-06-24 | Download | Deep neural networks (DNNs) continue to grow rapidly in size, making them infeasible to train on a single device. Pipeline parallelism is commonly used in existing DNN systems to support large-scale D... |
| Scalable Artificial Intelligence for Science: Perspectives, Methods and Exemplars | Wesley Brewer, Aditya Kashi, Sajal Dash, Aristeidis Tsaris, Junqi Yin, Mallikarjun Shankar, Feiyi Wang | 2024-06-24 | Download | In a post-ChatGPT world, this paper explores the potential of leveraging scalable artificial intelligence for scientific discovery. We propose that scaling up artificial intelligence on high-performan... |
| Fast Switching Serial and Parallel Paradigms of SNN Inference on Multi-core Heterogeneous Neuromorphic Platform SpiNNaker2 | Jiaxin Huang, Bernhard Vogginger, Florian Kelber, Hector Gonzalez, Klaus Knobloch, Christian Georg Mayr | 2024-06-24 | Download | With serial and parallel processors introduced into Spiking Neural Networks (SNNs) execution, more and more researchers are dedicated to improving the performance of the computing paradigms by taking ... |
| A Multi-Party, Multi-Blockchain Atomic Swap Protocol with Universal Adaptor Secret | Shengewei You, Aditya Joshi, Andrey Kuehlkamp, Jarek Nabrzyski | 2024-06-24 | Download | The increasing complexity of digital asset transactions across multiple blockchains necessitates a robust atomic swap protocol that can securely handle more than two participants. |
| Bisimulation for Impure Simplicial Complexes | Marta Bílková, Hans van Ditmarsch, Roman Kuznets, Rojo Randrianomentsoa | 2024-06-24 | Download | As an alternative to Kripke models, simplicial complexes are a versatile semantic primitive on which to interpret epistemic logic. Given a set of vertices, a simplicial complex is a downward closed se... |
| Towards Communication-Efficient Peer-to-Peer Networks | Khalid Hourani, William K. Moses, Gopal Pandurangan | 2024-06-24 | Download | We focus on designing Peer-to-Peer (P2P) networks that enable efficient communication. Over the last two decades, there has been substantial algorithmic research on distributed protocols for building ... |
| Digital Twinning of a Pressurized Water Reactor Startup Operation and Partial Computational Offloading in In-network Computing-Assisted Multiaccess Edge Computing | Ibrahim Aliyu, Awwal M. Arigi, Tai-Won Um, Jinsul Kim | 2024-06-24 | Download | This paper addresses the challenge of representing complex human action (HA) in a nuclear power plant (NPP) digital twin (DT) and minimizing latency in partial computation offloading (PCO) in sixth-ge... |
| Semantic Revolution from Communications to Orchestration for 6G: Challenges, Enablers, and Research Directions | Masoud Shokrnezhad, Hamidreza Mazandarani, Tarik Taleb, Jaeseung Song, Richard Li | 2024-06-24 | Download | In the context of emerging 6G services, the realization of everything-to-everything interactions involving a myriad of physical and digital entities presents a crucial challenge. |
| Decentralized Task Offloading and Load-Balancing for Mobile Edge Computing in Dense Networks | Mariam Yahya, Alexander Conzelmann, Setareh Maghsudi | 2024-06-24 | Download | We study the problem of decentralized task offloading and load-balancing in a dense network with numerous devices and a set of edge servers. Solving this problem optimally is complicated due to the un... |
| Placing Timely Refreshing Services at the Network Edge | Xishuo Li, Shan Zhang, Hongbin Luo, Xiao Ma, Junyi He | 2024-06-24 | Download | Accommodating services at the network edge is favorable for time-sensitive applications. However, maintaining service usability is resource-consuming in terms of pulling service images to the edge, sy... |
| Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving | Ruoyu Qin, Zheming Li, Weiran He, Mingxing Zhang, Yongwei Wu, Weimin Zheng, Xinran Xu | 2024-06-24 | Download | Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI. It features a KVCache-centric disaggregated architecture that separates the prefill and decoding clusters. |
| Evaluating Serverless Machine Learning Performance on Google Cloud Run | Prerana Khatiwada, Pranjal Dhakal | 2024-06-24 | Download | End-users can get functions-as-a-service from serverless platforms, which promise lower hosting costs, high availability, fault tolerance, and dynamic flexibility for hosting individual functions know... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Integrating Generative AI with Network Digital Twins for Enhanced Network Operations | Kassi Muhammad, Teef David, Giulia Nassisid, Tina Farus | 2024-06-24 | Download | As telecommunications networks become increasingly complex, the integration of advanced technologies such as network digital twins and generative artificial intelligence (AI) emerges as a pivotal solu... |
| Digital Twinning of a Pressurized Water Reactor Startup Operation and Partial Computational Offloading in In-network Computing-Assisted Multiaccess Edge Computing | Ibrahim Aliyu, Awwal M. Arigi, Tai-Won Um, Jinsul Kim | 2024-06-24 | Download | This paper addresses the challenge of representing complex human action (HA) in a nuclear power plant (NPP) digital twin (DT) and minimizing latency in partial computation offloading (PCO) in sixth-ge... |
| Semantic Revolution from Communications to Orchestration for 6G: Challenges, Enablers, and Research Directions | Masoud Shokrnezhad, Hamidreza Mazandarani, Tarik Taleb, Jaeseung Song, Richard Li | 2024-06-24 | Download | In the context of emerging 6G services, the realization of everything-to-everything interactions involving a myriad of physical and digital entities presents a crucial challenge. |
| A Queuing Envelope Model for Estimating Latency Guarantees in Deterministic Networking Scenarios | Nataliia Koneva, Alfonso Sánchez-Macián, José Alberto Hernández, Farhad Arpanaei, Óscar González de Dios | 2024-06-24 | Download | Accurate estimation of queuing delays is crucial for designing and optimizing communication networks, particularly in the context of Deterministic Networking (DetNet) scenarios. |
| Placing Timely Refreshing Services at the Network Edge | Xishuo Li, Shan Zhang, Hongbin Luo, Xiao Ma, Junyi He | 2024-06-24 | Download | Accommodating services at the network edge is favorable for time-sensitive applications. However, maintaining service usability is resource-consuming in terms of pulling service images to the edge, sy... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Evaluating Serverless Machine Learning Performance on Google Cloud Run | Prerana Khatiwada, Pranjal Dhakal | 2024-06-24 | Download | End-users can get functions-as-a-service from serverless platforms, which promise lower hosting costs, high availability, fault tolerance, and dynamic flexibility for hosting individual functions know... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Enabling more efficient and cost-effective AI/ML systems with Collective Mind, virtualized MLOps, MLPerf, Collective Knowledge Playground and reproducible optimization tournaments | Grigori Fursin | 2024-06-24 | Download | This white paper introduces my educational community initiative to learn how to run AI, ML and other emerging workloads in the most efficient and cost-effective way across diverse models, data sets, s... |
