2025-03-04
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Enabling Fast, Accurate, and Efficient Real-Time Genome Analysis via New Algorithms and Techniques | Can Firtina | 2025-03-04 | Download | The advent of high-throughput sequencing technologies has revolutionized genome analysis by enabling the rapid and cost-effective sequencing of large genomes. |
| SWAPPER: Dynamic Operand Swapping in Non-commutative Approximate Circuits for Online Error Reduction | Marcello Traiola, Nazar Misyats, Silviu-Ioan Filip, Remi Garcia, Angeliki Kritikakou | 2025-03-04 | Download | Error-tolerant applications, such as multimedia processing, machine learning, signal processing, and scientific computing, can produce satisfactory outputs even when approximate computations are perfo... |
| CORDIC Is All You Need | Omkar Kokane, Adam Teman, Anushka Jha, Guru Prasath SL, Gopal Raut, Mukul Lokhande, S. V. Jaya Chand, Tanushree Dewangan, Santosh Kumar Vishvakarma | 2025-03-04 | Download | Artificial intelligence necessitates adaptable hardware accelerators for efficient high-throughput million operations. We present pipelined architecture with CORDIC block for linear MAC computations a... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A HPX Communication Benchmark: Distributed FFT using Collectives | Alexander Strack, Dirk Pflüger | 2025-03-04 | Download | Due to increasing core counts in modern processors, several task-based runtimes emerged, including the C++ Standard Library for Concurrency and Parallelism (HPX). |
| Comparative Analysis of Lightweight Kubernetes Distributions for Edge Computing: Performance and Resource Efficiency | Diyaz Yakubov, David Hästbacka | 2025-03-04 | Download | Edge computing environments increasingly rely on lightweight container orchestration platforms to manage resource-constrained devices. This paper provides an empirical analysis of five lightweight kub... |
| Deal: Distributed End-to-End GNN Inference for All Nodes | Shiyang Chen, Xiang Song, Vasiloudis Theodore, Hang Liu | 2025-03-04 | Download | Graph Neural Networks (GNNs) are a new research frontier with various applications and successes. The end-to-end inference for all nodes, is common for GNN embedding models, which are widely adopted i... |
| ESSPI: ECDSA/Schnorr Signed Program Input for BitVMX | Sergio Demian Lerner, Martin Jonas, Ariel Futoransky | 2025-03-04 | Download | The BitVM and BitVMX protocols have long relied on inefficient one-time signature (OTS) schemes like Lamport and Winternitz for signing program inputs. |
| A Sheaf-Theoretic Characterization of Tasks in Distributed Systems | Stephan Felber, Bernardo Hummes Flores, Hugo Rincon Galeana | 2025-03-04 | Download | We introduce a sheaf-theoretic characterization of task solvability in general distributed computing models, unifying distinct approaches to message-passing models. |
| SpecInF: Exploiting Idle GPU Resources in Distributed DL Training via Speculative Inference Filling | Cunchi Lv, Xiao Shi, Dong Liang, Wenting Tan, Xiaofang Zhao | 2025-03-04 | Download | Deep Learning (DL), especially with Large Language Models (LLMs), brings benefits to various areas. However, DL training systems usually yield prominent idling GPU resources due to many factors, such ... |
| Introducing Support for Move Operations in Melda CRDT | Amos Brocco | 2025-03-04 | Download | In this paper, we present an extension to Melda (a library which implements a general purpose delta state JSON CRDT) to support move operations. |
| Memory and Bandwidth are All You Need for Fully Sharded Data Parallel | Jiangtao Wang, Jan Ebert, Oleg Filatov, Stefan Kesselheim | 2025-03-04 | Download | Transformer models have revolutionized a wide spectrum of disciplines, especially in language processing. The recent success has proven that model size scalability is crucial for achieving superior pe... |
| 3-Majority and 2-Choices with Many Opinions | Nobutaka Shimizu, Takeharu Shiraga | 2025-03-04 | Download | We present the first nearly-optimal bounds on the consensus time for the well-known synchronous consensus dynamics, specifically 3-Majority and 2-Choices, for an arbitrary number of opinions. |
| Efficient Long Context Fine-tuning with Chunk Flow | Xiulong Yuan, Hongtao Xu, Wenting Shen, Ang Wang, Xiafei Qiu, Jie Zhang, Yuqiong Liu, Bowen Yu, Junyang Lin, Mingzhen Li, Weile Jia, Yong Li, Wei Lin | 2025-03-04 | Download | Long context fine-tuning of large language models (LLMs) involves training on datasets that are predominantly composed of short sequences and a small proportion of longer sequences. |
| CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory | Jiashun Suo, Xiaojian Liao, Limin Xiao, Li Ruan, Jinquan Wang, Xiao Su, Zhisheng Huo | 2025-03-04 | Download | Large language models like GPT-4 are resource-intensive, but recent advancements suggest that smaller, specialized experts can outperform the monolithic models on specific tasks. |
| PointSplit: Towards On-device 3D Object Detection with Heterogeneous Low-power Accelerators | Keondo Park, You Rim Choi, Inhoe Lee, Hyung-Sin Kim | 2025-03-04 | Download | Running deep learning models on resource-constrained edge devices has drawn significant attention due to its fast response, privacy preservation, and robust operation regardless of Internet connectivi... |
| VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference | Zihan Liu, Xinhao Luo, Junxian Guo, Wentao Ni, Yangjie Zhou, Yue Guan, Cong Guo, Weihao Cui, Yu Feng, Minyi Guo, Yuhao Zhu, Minjia Zhang, Jingwen Leng, Chen Jin | 2025-03-04 | Download | In this work, we design and implement VQ-LLM, an efficient fused Vector Quantization (VQ) kernel generation framework. We first introduce a software abstraction called codebook cache to optimize codeb... |
| A Distributed Partitioning Software and its Applications | Aparna Sasidharan | 2025-03-04 | Download | This article describes a geometric partitioning software that can be used for quick computation of data partitions on many-core HPC machines. It is most suited for dynamic applications with load distr... |
| Relaxation for Efficient Asynchronous Queues | Samuel Baldwin, Cole Hausman, Mohamed Bakr, Edward Talmage | 2025-03-04 | Download | We explore the problem of efficiently implementing shared data structures in an asynchronous computing environment. We start with a traditional FIFO queue, showing that full replication is possible wi... |
| AugFL: Augmenting Federated Learning with Pretrained Models | Sheng Yue, Zerui Qin, Yongheng Deng, Ju Ren, Yaoxue Zhang, Junshan Zhang | 2025-03-04 | Download | Federated Learning (FL) has garnered widespread interest in recent years. However, owing to strict privacy policies or limited storage capacities of training participants such as IoT devices, its effe... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Generative Active Adaptation for Drifting and Imbalanced Network Intrusion Detection | Ragini Gupta, Shinan Liu, Ruixiao Zhang, Xinyue Hu, Xiaoyang Wang, Hadjer Benkraouda, Pranav Kommaraju, Phuong Cao, Nick Feamster, Klara Nahrstedt | 2025-03-04 | Download | Machine learning has shown promise in network intrusion detection systems, yet its performance often degrades due to concept drift and imbalanced data. |
| Scaling IP Lookup to Large Databases using the CRAM Lens | Robert Chang, Pradeep Dogga, Andy Fingerhut, Victor Rios, George Varghese | 2025-03-04 | Download | Wide-area scaling trends require new approaches to Internet Protocol (IP) lookup, enabled by modern networking chips such as Intel Tofino, AMD Pensando, and Nvidia BlueField, which provide substantial... |
| Efficient and Optimal No-Regret Caching under Partial Observation | Younes Ben Mazziane, Francescomaria Faticanti, Sara Alouf, Giovanni Neglia | 2025-03-04 | Download | Online learning algorithms have been successfully used to design caching policies with sublinear regret in the total number of requests, with no statistical assumption about the request sequence. |
| Network Simulator-centric Compositional Testing | Tom Rousseaux, Christophe Crochet, John Aoga, Axel Legay | 2025-03-04 | Download | This article introduces a novel methodology, Network Simulator-centric Compositional Testing (NSCT), to enhance the verification of network protocols with a particular focus on time-varying network pr... |
| PANTHER: Pluginizable Testing Environment for Network Protocols | Christophe Crochet, John Aoga, Axel Legay | 2025-03-04 | Download | In this paper, we introduce PANTHER, a modular framework for testing network protocols and formally verifying their specification. The framework incorporates a plugin architecture to enhance flexibili... |
| Real-Time Burst-Mode Digital Signal Processing for Passive Optical Networks | Ji Zhou, Kainan Wu, Haide Wang, Jinyang Yang, Weiping Liu, Junwen Zhang, Changyuan Yu, Xiangjun Xin, Liangchuan Li | 2025-03-04 | Download | Driven by the ever-increasing capacity demands, the 50G passive optical network (PON) is maturing gradually. One of the main challenges for the 50G PON is implementing burst-mode digital signal proces... |
| MobRFFI: Non-cooperative Device Re-identification for Mobility Intelligence | Stepan Mazokha, Fanchen Bao, George Sklivanitis, Jason O. Hallstrom | 2025-03-04 | Download | WiFi-based mobility monitoring in urban environments can provide valuable insights into pedestrian and vehicle movements. However, MAC address randomization introduces a significant obstacle in accura... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| FlexInfer: Breaking Memory Constraint via Flexible and Efficient Offloading for On-Device LLM Inference | Hongchao Du, Shangyu Wu, Arina Kharlamova, Nan Guan, Chun Jason Xue | 2025-03-04 | Download | Large Language Models (LLMs) face challenges for on-device inference due to high memory demands. Traditional methods to reduce memory usage often compromise performance and lack adaptability. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Heavy-traffic Optimality of Skip-the-Longest-Queues in Heterogeneous Service Systems | Yishun Luo, Martin Zubeldia | 2025-03-04 | Download | We consider a discrete-time parallel service system consisting of heterogeneous single server queues with infinite capacity. Jobs arrive to the system as an i.i.d. |
| Energy efficiency of cache eviction algorithms for Zipf distributed objects | Emese Sziklay, Tamás Jursonovics | 2025-03-04 | Download | This paper presents a summary analysis of the Least Frequently Used (LFU) and Perfect Least Frequently Used (PLFU) cache eviction algorithms on real data, transferred on Content Delivery Networks (CD... |
| CoServe: Efficient Collaboration-of-Experts (CoE) Model Inference with Limited Memory | Jiashun Suo, Xiaojian Liao, Limin Xiao, Li Ruan, Jinquan Wang, Xiao Su, Zhisheng Huo | 2025-03-04 | Download | Large language models like GPT-4 are resource-intensive, but recent advancements suggest that smaller, specialized experts can outperform the monolithic models on specific tasks. |