2024-10-31

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| UpANNS: Enhancing Billion-Scale ANNS Efficiency with Real-World PIM Architecture | Sitian Chen, Amelie Chi Zhou, Yucheng Shi, Yusen Li, Xin Yao | 2024-10-31 | Download | Approximate Nearest Neighbor Search (ANNS) is a critical component of modern AI systems, such as recommendation engines and retrieval-augmented large language models (RAG-LLMs). |
| Kernel Looping: Eliminating Synchronization Boundaries for Peak Inference Performance | David Koeplinger, Darshan Gandhi, Pushkar Nandkar, Nathan Sheeley, Matheen Musaddiq, Leon Zhang, Reid Goodbar, Matthew Shaffer, Han Wang, Angela Wang, Mingran Wang, Raghu Prabhakar | 2024-10-31 | Download | Token generation speed is critical to power the next wave of AI inference applications. GPUs significantly underperform during token generation due to synchronization overheads at kernel boundaries, u... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Novel Architecture for Distributed Travel Data Integration and Service Provision Using Microservices | Biman Barua, M. Shamim Kaiser | 2024-10-31 | Download | This paper introduces a microservices architecture for the purpose of enhancing the flexibility and performance of an airline reservation system. |
| Revisiting the Schedule Graph Generation for the Exact and Sustainable Analysis of Non-preemptive Scheduling | Marek Vlk, Marek Jaros, Zdenek Hanzalek | 2024-10-31 | Download | This paper addresses the problem of scheduling non-preemptive tasks with release jitter and execution time variation on a uniprocessor. We show that the schedulability analysis based on schedule graph... |
| DynaSplit: A Hardware-Software Co-Design Framework for Energy-Aware Inference on Edge | Daniel May, Alessandro Tundo, Shashikant Ilager, Ivona Brandic | 2024-10-31 | Download | The deployment of ML models on edge devices is challenged by limited computational resources and energy availability. While split computing enables the decomposition of large neural networks (NNs) and... |
| ECDQC: Efficient Compilation for Distributed Quantum Computing with Linear Layout | Kecheng Liu, Yidong Zhou, Haochen Luo, Lingjun Xiong, Yuchen Zhu, Eilis Casey, Jinglei Cheng, Samuel Yen-Chi Chen, Zhiding Liang | 2024-10-31 | Download | In this paper, we propose an efficient compilation method for distributed quantum computing (DQC) using the Linear Nearest Neighbor (LNN) architecture. |
| Abstract Continuation Semantics for Multiparty Interactions in Process Calculi based on CCS | Eneia Nicolae Todoran, Gabriel Ciobanu | 2024-10-31 | Download | We develop denotational and operational semantics designed with continuations for process calculi based on CCS extended with mechanisms offering support for multiparty interactions. |
| Syno: Structured Synthesis for Neural Operators | Yongqi Zhuo, Zhengyuan Su, Chenggang Zhao, Mingyu Gao | 2024-10-31 | Download | The desires for better prediction accuracy and higher execution performance in neural networks never end. Neural architecture search (NAS) and tensor compilers are two popular techniques to optimize t... |
| Microsecond-scale Dynamic Validation of Idempotency for GPU Kernels | Mingcong Han, Weihang Shen, Guanwen Peng, Rong Chen, Haibo Chen | 2024-10-31 | Download | We discovered that a GPU kernel can have both idempotent and non-idempotent instances depending on the input. These kernels, called conditionally-idempotent, are prevalent in real-world GPU applicatio... |
| EVeCA: Efficient and Verifiable On-Chain Data Query Framework Using Challenge-Based Authentication | Meng Shen, Yuzhi Liu, Qinglin Zhao, Wei Wang, Wei Ou, Wenbao Han, Liehuang Zhu | 2024-10-31 | Download | As blockchain applications become increasingly widespread, there is a rising demand for on-chain data queries. However, existing schemes for on-chain data queries face a challenge between verifiabilit... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Globalping: A Community-Driven, Open-Source Platform for Scalable, Real-Time Network Measurements | Berkay Kaplan | 2024-10-31 | Download | We present Globalping, an open-source, community-driven platform for scalable, real-time global network measurements. It democratizes access to network diagnostics by offering every user, including no... |
| DECT-2020 NR Link Distance Performance in Varying Environments: Models and Measurements | Md Mohaiminul Haque, Joonas Sae, Juho Pirskanen, Mikko Valkama | 2024-10-31 | Download | Digital Enhanced Cordless Telecommunications 2020 New Radio (DECT-2020 NR) has garnered recognition as an alternative for cellular 5G technology in the internet of things industry. |
| Efficient Satellite-Ground Interconnection Design for Low-orbit Mega-Constellation Topology | Wenhao Liu, Jiazhi Wu, Quanwei Lin, Handong Luo, Qi Zhang, Kun Qiu, Zhe Chen, Yue Gao | 2024-10-31 | Download | The low-orbit mega-constellation network (LMCN) is an important part of the space-air-ground integrated network system. An effective satellite-ground interconnection design can result in a stable cons... |
| Distributing Intelligence in 6G Programmable Data Planes for Effective In-Network Intrusion Prevention | Mattia G. Spina, Floriano De Rango, Edoardo Scalzo, Francesca Guerriero, Antonio Iera | 2024-10-31 | Download | The problem of attacks on new generation network infrastructures is becoming increasingly relevant, given the widening of the attack surface of these networks resulting from the greater number of devi... |
| Deep Learning Frameworks for Cognitive Radio Networks: Review and Open Research Challenges | Senthil Kumar Jagatheesaperumal, Ijaz Ahmad, Marko Höyhtyä, Suleman Khan, Andrei Gurtov | 2024-10-31 | Download | Deep learning has been proven to be a powerful tool for addressing the most significant issues in cognitive radio networks, such as spectrum sensing, spectrum sharing, resource allocation, and securit... |
| Tracer: A Tool for Race Detection in Software Defined Network Models | Georgiana Caltais, Mahboobeh Zangiabady, Ervin Zvirbulis | 2024-10-31 | Download | Software Defined Networking (SDN) has become a new paradigm in computer networking, introducing a decoupled architecture that separates the network into the data plane and the control plane. |
| Integrating Brain-Computer Interface and Neuromorphic Computing for Human Digital Twins | Chen Shang, Jiadong Yu, Dinh Thai Hoang | 2024-10-31 | Download | The integration of immersive communication into a human-centric ecosystem has intensified the demand for sophisticated Human Digital Twins (HDTs) driven by multifaceted human data. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Microsecond-scale Dynamic Validation of Idempotency for GPU Kernels | Mingcong Han, Weihang Shen, Guanwen Peng, Rong Chen, Haibo Chen | 2024-10-31 | Download | We discovered that a GPU kernel can have both idempotent and non-idempotent instances depending on the input. These kernels, called conditionally-idempotent, are prevalent in real-world GPU applicatio... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Efficient Performance Analysis of Modular Rewritable Petri Nets | Lorenzo Capra, Marco Gribaudo | 2024-10-31 | Download | Petri Nets (PN) are extensively used as a robust formalism to model concurrent and distributed systems; however, they encounter difficulties in accurately modeling adaptive systems. |
| Syno: Structured Synthesis for Neural Operators | Yongqi Zhuo, Zhengyuan Su, Chenggang Zhao, Mingyu Gao | 2024-10-31 | Download | The desires for better prediction accuracy and higher execution performance in neural networks never end. Neural architecture search (NAS) and tensor compilers are two popular techniques to optimize t... |
| ALISE: Accelerating Large Language Model Serving with Speculative Scheduling | Youpeng Zhao, Jun Wang | 2024-10-31 | Download | Large Language Models (LLMs) represent a revolutionary advancement in the contemporary landscape of artificial general intelligence (AGI). As exemplified by ChatGPT, LLM-based applications necessitate... |
