2025-05-20
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Hybrid SLC-MLC RRAM Mixed-Signal Processing-in-Memory Architecture for Transformer Acceleration via Gradient Redistribution | Chang Eun Song, Priyansh Bhatnagar, Zihan Xia, Nam Sung Kim, Tajana Rosing, Mingu Kang | 2025-05-20 | Download | Transformers, while revolutionary, face challenges due to their demanding computational cost and large data movement. To address this, we propose HyFlexPIM, a novel mixed-signal processing-in-memory (... |
| SecCAN: An Extended CAN Controller with Embedded Intrusion Detection | Shashwat Khandelwal, Shreejith Shanker | 2025-05-20 | Download | Recent research has highlighted the vulnerability of in-vehicle network protocols such as controller area networks (CAN) and proposed machine learning-based intrusion detection systems (IDSs) as an ef... |
| RISC-Q: A Generator for Real-Time Quantum Control System-on-Chips Compatible with RISC-V | Junyi Liu, Yi Lee, Haowei Deng, Connor Clayton, Gengzhi Yang, Xiaodi Wu | 2025-05-20 | Download | Quantum computing imposes stringent requirements for the precise control of large-scale qubit systems, including, for example, microsecond-latency feedback and nanosecond-precision timing of gigahertz... |
| CRYPTONITE: Scalable Accelerator Design for Cryptographic Primitives and Algorithms | Karthikeya Sharma Maheswaran, Camille Bossut, Andy Wanna, Qirun Zhang, Cong Hao | 2025-05-20 | Download | Cryptographic primitives, consisting of repetitive operations with different inputs, are typically implemented using straight-line C code due to traditional execution on CPUs. |
| Distributed quantum computing with black-box subroutines | X. Xu, Y.-D. Liu, S. Shi, Y.-J. Wang, D.-S. Wang | 2025-05-20 | Download | In this work, we propose a general protocol for distributed quantum computing that accommodates arbitrary unknown subroutines. It can be applied to scale up quantum computing through multi-chip interc... |
| Low-Cost FlashAttention with Fused Exponential and Multiplication Hardware Operators | Kosmas Alexandridis, Vasileios Titopoulos, Giorgos Dimitrakopoulos | 2025-05-20 | Download | Attention mechanisms, particularly within Transformer architectures and large language models (LLMs), have revolutionized sequence modeling in machine learning and artificial intelligence applications... |
| FLASH-D: FlashAttention with Hidden Softmax Division | Kosmas Alexandridis, Vasileios Titopoulos, Giorgos Dimitrakopoulos | 2025-05-20 | Download | The transformer's attention mechanism has revolutionized AI and machine learning, with its efficient computation being crucial to its performance. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Machine Learning for Consistency Violation Faults Analysis | Kamal Giri, Amit Garu | 2025-05-20 | Download | Distributed systems frequently encounter consistency violation faults (cvfs), where nodes operate on outdated or inaccurate data, adversely affecting convergence and overall system performance. |
| EcoLoRA: Communication-Efficient Federated Fine-Tuning of Large Language Models | Han Liu, Ruoyao Wen, Srijith Nair, Jia Liu, Wenjing Lou, Chongjie Zhang, William Yeoh, Yevgeniy Vorobeychik, Ning Zhang | 2025-05-20 | Download | To address data locality and privacy restrictions, Federated Learning (FL) has recently been adopted to fine-tune large language models (LLMs), enabling improved performance on various downstream task... |
| Sei Giga | Benjamin Marsh, Steven Landers, Jayendra Jog | 2025-05-20 | Download | We introduce the Sei Giga, a multi-concurrent producer parallelized execution EVM layer one blockchain. In an internal testnet Giga has achieved >5 gigagas/sec throughput and sub 400ms finality. |
| Balanced and Elastic End-to-end Training of Dynamic LLMs | Mohamed Wahib, Muhammed Abdullah Soyturk, Didem Unat | 2025-05-20 | Download | To reduce the computational and memory overhead of Large Language Models, various approaches have been proposed. These include a) Mixture of Experts (MoEs), where token routing affects compute balance... |
| Extracting Practical, Actionable Energy Insights from Supercomputer Telemetry and Logs | Melanie Cornelius, Greg Cross, Shilpika Shilpika, Matthew T. Dearing, Zhiling Lan | 2025-05-20 | Download | As supercomputers grow in size and complexity, power efficiency has become a critical challenge, particularly in understanding GPU power consumption within modern HPC workloads. |
| PSMOA: Policy Support Multi-Objective Optimization Algorithm for Decentralized Data Replication | Xi Wang, Susmit Shannigrahi | 2025-05-20 | Download | Efficient data replication in decentralized storage systems must account for diverse policies, especially in multi-organizational, data-intensive environments. |
| Distributed quantum computing with black-box subroutines | X. Xu, Y.-D. Liu, S. Shi, Y.-J. Wang, D.-S. Wang | 2025-05-20 | Download | In this work, we propose a general protocol for distributed quantum computing that accommodates arbitrary unknown subroutines. It can be applied to scale up quantum computing through multi-chip interc... |
| Federated prediction for scalable and privacy-preserved knowledge-based planning in radiotherapy | Jingyun Chen, David Horowitz, Yading Yuan | 2025-05-20 | Download | Background: Deep learning has potential to improve the efficiency and consistency of radiation therapy planning, but clinical adoption is hindered by the limited model generalizability due to data sca... |
| ServerlessLoRA: Minimizing Latency and Cost in Serverless Inference for LoRA-Based LLMs | Yifan Sui, Hao Wang, Hanfei Yu, Yitao Hu, Jianxun Li, Hao Wang | 2025-05-20 | Download | Serverless computing has grown rapidly for serving Large Language Model (LLM) inference due to its pay-as-you-go pricing, fine-grained GPU usage, and rapid scaling. |
| Evaluating the Impact Of Spatial Features Of Mobility Data and Index Choice On Database Performance | Tim C. Rese, Alexandra Kapp, David Bermbach | 2025-05-20 | Download | The growing number of moving Internet-of-Things (IoT) devices has led to a surge in moving object data, powering applications such as traffic routing, hotspot detection, or weather forecasting. |
| SkyMemory: A LEO Edge Cache for Transformer Inference Optimization and Scale Out | Thomas Sandholm, Sayandev Mukherjee, Lin Cheng, Bernardo A. Huberman | 2025-05-20 | Download | We expand the scope of cache memory to include LEO constellations, which are highly distributed systems with thousands of satellites connected with free-space optics inter-satellite links (ISL) always... |
| Co-LoRA: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients | Minhyuk Seo, Taeheon Kim, Hankook Lee, Jonghyun Choi, Tinne Tuytelaars | 2025-05-20 | Download | As AI becomes more personal, e.g., Agentic AI, there is an increasing need for personalizing models for various use cases. Personalized federated learning (PFL) enables each client to collaboratively ... |
| Prime Collective Communications Library -- Technical Report | Michael Keiblinger, Mario Sieg, Jack Min Ong, Sami Jaghouar, Johannes Hagemann | 2025-05-20 | Download | This report presents the Prime Collective Communications Library (PCCL), a novel fault-tolerant collective communication library designed for distributed ML workloads over the public internet. |
| FedGraM: Defending Against Untargeted Attacks in Federated Learning via Embedding Gram Matrix | Di Wu, Qian Li, Heng Yang, Yong Han | 2025-05-20 | Download | Federated Learning (FL) enables geographically distributed clients to collaboratively train machine learning models by sharing only their local models, ensuring data privacy. |
| Paradigm Shift in Infrastructure Inspection Technology: Leveraging High-performance Imaging and Advanced AI Analytics to Inspect Road Infrastructure | Du Wu, Enzhi Zhang, Isaac Lyngaas, Xiao Wang, Amir Ziabari, Tao Luo, Peng Chen, Kento Sato, Fumiyoshi Shoji, Takaki Hatsui, Kentaro Uesugi, Akira Seo, Yasuhito Sakai, Toshio Endo, Tetsuya Ishikawa, Satoshi Matsuoka, Mohamed Wahib | 2025-05-20 | Download | Effective road infrastructure management is crucial for modern society. Traditional manual inspection techniques remain constrained by cost, efficiency, and scalability, while camera and laser imaging... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| TSA-WF: Exploring the Effectiveness of Time Series Analysis for Website Fingerprinting | Michael Wrana, Uzma Maroof, Diogo Barradas | 2025-05-20 | Download | Website fingerprinting (WF) is a technique that allows an eavesdropper to determine the website a target user is accessing by inspecting the metadata associated with the packets she exchanges via some... |
| PSMOA: Policy Support Multi-Objective Optimization Algorithm for Decentralized Data Replication | Xi Wang, Susmit Shannigrahi | 2025-05-20 | Download | Efficient data replication in decentralized storage systems must account for diverse policies, especially in multi-organizational, data-intensive environments. |
| Automated, Cross-Layer Root Cause Analysis of 5G Video-Conferencing Quality Degradation | Fan Yi, Haoran Wan, Kyle Jamieson, Oliver Michel | 2025-05-20 | Download | 5G wireless networks are complex, leveraging layers of scheduling, retransmission, and adaptation mechanisms to maximize their efficiency. But these mechanisms interact to produce significant fluctuat... |
| A5/1 is in the Air: Passive Detection of 2G (GSM) Ciphering Algorithms | Matthias Koch, Christian Nettersheim, Thorsten Horstmann, Michael Rademacher | 2025-05-20 | Download | This paper investigates the ongoing use of the A5/1 ciphering algorithm within 2G GSM networks. Despite its known vulnerabilities and the gradual phasing out of GSM technology by some operators, GSM s... |
| open5Gcube: A Modular and Usable Framework for Mobile Network Laboratories | Thorsten Horstmann, Dominik Brunke, Tobias Kremeyer, Matthias Wilmes, Gunnar Schneider, Julian Sturm, Hartmut König, Michael Rademacher | 2025-05-20 | Download | In mobile network research, the integration of real-world components such as User Equipment (UE) with open-source network infrastructure is essential yet challenging. |
| Interpretable Reinforcement Learning for Load Balancing using Kolmogorov-Arnold Networks | Kamal Singh, Sami Marouani, Ahmad Al Sheikh, Pham Tran Anh Quang, Amaury Habrard | 2025-05-20 | Download | Reinforcement learning (RL) has been increasingly applied to network control problems, such as load balancing. However, existing RL approaches often suffer from lack of interpretability and difficulty... |
| Measuring Round-Trip Response Latencies Under Asymmetric Routing | Bhavana Vannarth Shobhana, Yen-lin Chien, Jonathan Diamant, Badri Nath, Shir Landau Feibish, Srinivas Narayana | 2025-05-20 | Download | Latency is a key indicator of Internet service performance. Continuously tracking the latency of client requests enables service operators to quickly identify bottlenecks, perform adaptive resource al... |
| Sibling Prefixes: Identifying Similarities in IPv4 and IPv6 Prefixes | Fariba Osali, Khwaja Zubair Sediqi, Oliver Gasser | 2025-05-20 | Download | Since the standardization of IPv6 in 1998, both versions of the Internet Protocol have coexisted in the Internet. Clients usually run algorithms such as Happy Eyeballs, to decide whether to connect to... |
| Integration of TinyML and LargeML: A Survey of 6G and Beyond | Thai-Hoc Vu, Ngo Hoang Tu, Thien Huynh-The, Kyungchun Lee, Sunghwan Kim, Miroslav Voznak, Quoc-Viet Pham | 2025-05-20 | Download | The evolution from fifth-generation (5G) to sixth-generation (6G) networks is driving an unprecedented demand for advanced machine learning (ML) solutions. |
| VaN3Twin: the Multi-Technology V2X Digital Twin with Ray-Tracing in the Loop | Roberto Pegurri, Diego Gasco, Francesco Linsalata, Marco Rapelli, Eugenio Moro, Francesco Raviglione, Claudio Casetti | 2025-05-20 | Download | This paper presents VaN3Twin, the first open-source, full-stack Network Digital Twin (NDT) framework for simulating the coexistence of multiple Vehicle-to-Everything (V2X) communication technologies wi... |
| CE-LSLM: Efficient Large-Small Language Model Inference and Communication via Cloud-Edge Collaboration | Pengyan Zhu, Tingting Yang | 2025-05-20 | Download | Emerging intelligent service scenarios in 6G communication impose stringent requirements for low latency, high reliability, and privacy preservation. |
| 6G communications through sub-Terahertz CMOS power amplifiers: Design challenges and trends | Jun Yan Lee, Duo Wu, Xuanrui Guo, Jian Ding Tan, Teh Jia Yew, Zi Neng Ng, Mohammad Arif Sobhan Bhuiyan, Mahdi H. Miraz | 2025-05-20 | Download | The fifth-generation (5G) network faces limitations in supporting emerging applications, such as artificial intelligence (AI), virtual reality (VR) and digital twins. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Extracting Practical, Actionable Energy Insights from Supercomputer Telemetry and Logs | Melanie Cornelius, Greg Cross, Shilpika Shilpika, Matthew T. Dearing, Zhiling Lan | 2025-05-20 | Download | As supercomputers grow in size and complexity, power efficiency has become a critical challenge, particularly in understanding GPU power consumption within modern HPC workloads. |
| Task-parallelism in SWIFT for heterogeneous compute architectures | Abouzied M. A. Nasar, Benedict D. Rogers, Georgios Fourtakas, Mladen Ivkovic, Tobias Weinzierl, Scott T. Kay, Matthieu Schaller | 2025-05-20 | Download | This paper highlights first steps towards enabling graphics processing unit (GPU) acceleration of the task-parallel smoothed particle hydrodynamics (SPH) solver SWIFT. |
| Heterogeneous Memory Pool Tuning | Filip Vaverka, Ondrej Vysocky, Lubomir Riha | 2025-05-20 | Download | We present a lightweight tool for the analysis and tuning of application data placement in systems with heterogeneous memory pools. The tool allows non-intrusively identifying, analyzing, and controll... |
| Towards Efficient Multi-Scale Deformable Attention on NPU | Chenghuan Huang, Zhigeng Xu, Chong Sun, Chen Li, Ziyang Ma | 2025-05-20 | Download | Multi-scale deformable attention (MSDA) is a flexible and powerful feature extraction mechanism for visual tasks, but its random-access grid sampling strategy poses significant optimization challenges... |