
2024-09-23

cs.AR - Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| Investigating Robot Dogs for Construction Monitoring: A Comparative Analysis of Specifications and On-site Requirements | Miguel Arturo Vega Torres, Fabian Pfitzner | 2024-09-23 | Robot dogs are receiving increasing attention in various fields of research. However, the number of studies investigating their potential usability on construction sites is scarce. |
| Location is Key: Leveraging Large Language Model for Functional Bug Localization in Verilog | Bingkun Yao, Ning Wang, Jie Zhou, Xi Wang, Hong Gao, Zhe Jiang, Nan Guan | 2024-09-23 | Bug localization in Verilog code is a crucial and time-consuming task during the verification of hardware design. Since introduction, Large Language Models (LLMs) have showed their strong programming ... |
| Chattronics: using GPTs to assist in the design of data acquisition systems | Jonathan Paul Driemeyer Brown, Tiago Oliveira Weber | 2024-09-23 | The usefulness of Large Language Models (LLM) is being continuously tested in various fields. However, their intrinsic linguistic characteristic is still one of the limiting factors when applying thes... |
| Fast Virtual Gate Extraction For Silicon Quantum Dot Devices | Shize Che, Seong W Oh, Haoyun Qin, Yuhao Liu, Anthony Sigillito, Gushu Li | 2024-09-23 | Silicon quantum dot devices stand as promising candidates for large-scale quantum computing due to their extended coherence times, compact size, and recent experimental demonstrations of sizable qubit... |
| Google Quantum AI's Quest for Error-Corrected Quantum Computers | M. AbuGhanem | 2024-09-23 | Quantum computers stand at the forefront of technological innovation, offering exponential computational speed-ups that challenge classical computing capabilities. |
| Exploring and Exploiting Runtime Reconfigurable Floating Point Precision in Scientific Computing: a Case Study for Solving PDEs | Cong "Callie" Hao | 2024-09-23 | Scientific computing applications, such as computational fluid dynamics and climate modeling, typically rely on 64-bit double-precision floating-point operations, which are extremely costly in terms o... |
| Analogous Alignments: Digital "Formally" meets Analog | Hansa Mohanty, Deepak Narayan Gadde | 2024-09-23 | The complexity of modern-day System-on-Chips (SoCs) is continually increasing, and it becomes increasingly challenging to deliver dependable and credible chips in a short time-to-market. |
| FastGL: A GPU-Efficient Framework for Accelerating Sampling-Based GNN Training at Large Scale | Zeyu Zhu, Peisong Wang, Qinghao Hu, Gang Li, Xiaoyao Liang, Jian Cheng | 2024-09-23 | Graph Neural Networks (GNNs) have shown great superiority on non-Euclidean graph data, achieving ground-breaking performance on various graph-related tasks. |
| A Realistic Simulation Framework for Analog/Digital Neuromorphic Architectures | Fernando M. Quintana, Maryada, Pedro L. Galindo, Elisa Donati, Giacomo Indiveri, Fernando Perez-Peña | 2024-09-23 | Developing dedicated mixed-signal neuromorphic computing systems optimized for real-time sensory-processing in extreme edge-computing applications requires time-consuming design, fabrication, and depl... |
| Efficient Tabular Data Preprocessing of ML Pipelines | Yu Zhu, Wenqi Jiang, Gustavo Alonso | 2024-09-23 | Data preprocessing pipelines, which includes data decoding, cleaning, and transforming, are a crucial component of Machine Learning (ML) training. |
| MICSim: A Modular Simulator for Mixed-signal Compute-in-Memory based AI Accelerator | Cong Wang, Zeming Chen, Shanshi Huang | 2024-09-23 | This work introduces MICSim, an open-source, pre-circuit simulator designed for early-stage evaluation of chip-level software performance and hardware overhead of mixed-signal compute-in-memory (CIM) ... |
| MESC: Re-thinking Algorithmic Priority and/or Criticality Inversions for Heterogeneous MCSs | Jiapeng Guan, Ran Wei, Dean You, Yingquan Wang, Ruizhe Yang, Hui Wang, Zhe Jiang | 2024-09-23 | Modern Mixed-Criticality Systems (MCSs) rely on hardware heterogeneity to satisfy ever-increasing computational demands. However, most of the heterogeneous co-processors are designed to achieve high t... |
| Hardware/Algorithm Co-design for Real-Time I/O Control with Improved Timing Accuracy and Robustness | Zhe Jiang, Shuai Zhao, Ran Wei, Xin Si, Gang Chen, Nan Guan | 2024-09-23 | In safety-critical systems, timing accuracy is the key to achieving precise I/O control. To meet such strict timing requirements, dedicated hardware assistance has recently been investigated and devel... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| PipeFill: Using GPUs During Bubbles in Pipeline-parallel LLM Training | Daiyaan Arfeen, Zhen Zhang, Xinwei Fu, Gregory R. Ganger, Yida Wang | 2024-09-23 | Training Deep Neural Networks (DNNs) with billions of parameters generally involves pipeline-parallel (PP) execution. Unfortunately, PP model training can use GPUs inefficiently, especially at large s... |
| Stalactite: Toolbox for Fast Prototyping of Vertical Federated Learning Systems | Anastasiia Zakharova, Dmitriy Alexandrov, Maria Khodorchenko, Nikolay Butakov, Alexey Vasilev, Maxim Savchenko, Alexander Grigorievskiy | 2024-09-23 | Machine learning (ML) models trained on datasets owned by different organizations and physically located in remote databases offer benefits in many real-world use cases. |
| MobiZO: Enabling Efficient LLM Fine-Tuning at the Edge via Inference Engines | Lei Gao, Amir Ziashahabi, Yue Niu, Salman Avestimehr, Murali Annavaram | 2024-09-23 | Large Language Models (LLMs) are currently pre-trained and fine-tuned on large cloud servers. The next frontier is LLM personalization, where a foundation model can be fine-tuned with user/task-specif... |
| Parallel Dynamic Maximal Matching | Mohsen Ghaffari, Anton Trygub | 2024-09-23 | We present the first (randomized) parallel dynamic algorithm for maximal matching, which can process an arbitrary number of updates simultaneously. |
| A Near-Optimal Low-Energy Deterministic Distributed SSSP with Ramifications on Congestion and APSP | Mohsen Ghaffari, Anton Trygub | 2024-09-23 | We present a low-energy deterministic distributed algorithm that computes exact Single-Source Shortest Paths (SSSP) in near-optimal time: it runs in $\tilde{O}(n)$ rounds and each node is awake during... |
| Renaming in distributed certification | Nicolas Bousquet, Louis Esperet, Laurent Feuilloley, Sébastien Zeitoun | 2024-09-23 | Local certification is the area of distributed network computing asking the following question: How to certify to the nodes of a network that a global property holds, if they are limited to a local ve... |
| Domino: Eliminating Communication in LLM Training via Generic Tensor Slicing and Overlapping | Guanhua Wang, Chengming Zhang, Zheyu Shen, Ang Li, Olatunji Ruwase | 2024-09-23 | Given the popularity of generative AI, Large Language Models (LLMs) often consume hundreds or thousands of GPUs for parallelizing and accelerating the training process. |
| FLeNS: Federated Learning with Enhanced Nesterov-Newton Sketch | Sunny Gupta, Mohit Jindal, Pankhi Kashyap, Pranav Jeevan, Amit Sethi | 2024-09-23 | Federated learning faces a critical challenge in balancing communication efficiency with rapid convergence, especially for second-order methods. |
| PecSched: Preemptive and Efficient Cluster Scheduling for LLM Inference | Zeyu Zhang, Haiying Shen | 2024-09-23 | The scaling of transformer-based Large Language Models (LLMs) has significantly expanded their context lengths, enabling applications where inputs exceed 100K tokens. |
| Cucheb: A GPU implementation of the filtered Lanczos procedure | Jared L. Aurentz, Vassilis Kalantzis, Yousef Saad | 2024-09-23 | This paper describes the software package Cucheb, a GPU implementation of the filtered Lanczos procedure for the solution of large sparse symmetric eigenvalue problems. |
| UELLM: A Unified and Efficient Approach for LLM Inference Serving | Yiyuan He, Minxian Xu, Jingfeng Wu, Wanyi Zheng, Kejiang Ye, Chengzhong Xu | 2024-09-23 | In the context of Machine Learning as a Service (MLaaS) clouds, the extensive use of Large Language Models (LLMs) often requires efficient management of significant query loads. |
| MSARS: A Meta-Learning and Reinforcement Learning Framework for SLO Resource Allocation and Adaptive Scaling for Microservices | Kan Hu, Linfeng Wen, Minxian Xu, Kejiang Ye | 2024-09-23 | Service Level Objectives (SLOs) aim to set threshold for service time in cloud services to ensure acceptable quality of service (QoS) and user satisfaction. |
| FastGL: A GPU-Efficient Framework for Accelerating Sampling-Based GNN Training at Large Scale | Zeyu Zhu, Peisong Wang, Qinghao Hu, Gang Li, Xiaoyao Liang, Jian Cheng | 2024-09-23 | Graph Neural Networks (GNNs) have shown great superiority on non-Euclidean graph data, achieving ground-breaking performance on various graph-related tasks. |
| Energy-Aware Federated Learning in Satellite Constellations | Nasrin Razmi, Bho Matthiesen, Armin Dekorsy, Petar Popovski | 2024-09-23 | Federated learning in satellite constellations, where the satellites collaboratively train a machine learning model, is a promising technology towards enabling globally connected intelligence and the ... |
| Machine Learning Methods as Robust Quantum Noise Estimators | Jon Gardeazabal-Gutierrez, Erik B. Terres-Escudero, Pablo García Bringas | 2024-09-23 | Access to quantum computing is steadily increasing each year as the speed advantage of quantum computers solidifies with the growing number of usable qubits. |
| Federated Graph Learning with Adaptive Importance-based Sampling | Anran Li, Yuanyuan Chen, Chao Ren, Wenhan Wang, Ming Hu, Tianlin Li, Han Yu, Qingyu Chen | 2024-09-23 | For privacy-preserving graph learning tasks involving distributed graph datasets, federated learning (FL)-based GCN (FedGCN) training is required. |
| MECURY: Practical Cross-Chain Exchange via Trusted Hardware | Xiaoqing Wen, Quanbi Feng, Jianyu Niu, Yinqian Zhang, Chen Feng | 2024-09-23 | The proliferation of blockchain-backed cryptocurrencies has sparked the need for cross-chain exchanges of diverse digital assets. Unfortunately, current exchanges suffer from high on-chain verificatio... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| On-Air Deep Learning Integrated Semantic Inference Models for Enhanced Earth Observation Satellite Networks | Hong-fu Chou, Vu Nguyen Ha, Prabhu Thiruvasagam, Thanh-Dung Le, Geoffrey Eappen, Ti Ti Nguyen, Luis M. Garces-Socarras, Jorge L. Gonzalez-Rios, Juan Carlos Merlano-Duncan, Symeon Chatzinotas | 2024-09-23 | Earth Observation (EO) systems are crucial for cartography, disaster surveillance, and resource administration. Nonetheless, they encounter considerable obstacles in the processing and transmission of... |
| Intelligent Routing Algorithm over SDN: Reusable Reinforcement Learning Approach | Wang Wumian, Sajal Saha, Anwar Haque, Greg Sidebottom | 2024-09-23 | Traffic routing is vital for the proper functioning of the Internet. As users and network traffic increase, researchers try to develop adaptive and intelligent routing algorithms that can fulfill vari... |
| Architectural Challenges of Nomadic Networks in 6G | Daniel Lindenschmitt, Benedikt Veith, Khurshid Alam, Ainur Daurembekova, Michael Gundall, Mohammad Asif Habibi, Bin Han, Dennis Krummacker, Philipp Rosemann, Hans D. Schotten | 2024-09-23 | This paper examines architectural challenges and opportunities arising from Nomadic Networks in the context of emerging 6G research. Nomadic networks are proposed as a solution to the limitations of s... |
| Improved Routing of Multiparty Entanglement over Quantum Networks | Nirupam Basak, Goutam Paul | 2024-09-23 | Effective routing of entanglements over a quantum network is a fundamental problem in quantum communication. Due to the fragility of quantum states, it is difficult to route entanglements at long dist... |

cs.PF - Performance

| Title | Authors | Published | Abstract |
| --- | --- | --- | --- |
| FRSZ2 for In-Register Block Compression Inside GMRES on GPUs | Thomas Grützmacher, Robert Underwood, Sheng Di, Franck Cappello, Hartwig Anzt | 2024-09-23 | The performance of the GMRES iterative solver on GPUs is limited by the GPU main memory bandwidth. Compressed Basis GMRES outperforms GMRES by storing the Krylov basis in low precision, thereby reduci... |
| Deploying Open-Source Large Language Models: A performance Analysis | Yannis Bendi-Ouis, Dan Dutartre, Xavier Hinaut | 2024-09-23 | Since the release of ChatGPT in November 2022, large language models (LLMs) have seen considerable success, including in the open-source community, with many open-weight models available. |
