2025-10-03
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Implémentation Efficiente de Fonctions de Convolution sur FPGA à l'Aide de Blocs Paramétrables et d'Approximations Polynomiales | Philippe Magalhães, Virginie Fresse, Benoît Suffran, Olivier Alata | 2025-10-03 | Download | Implementing convolutional neural networks (CNNs) on field-programmable gate arrays (FPGAs) has emerged as a promising alternative to GPUs, offering lower latency, greater power efficiency and greater... |
| A Resource-Driven Approach for Implementing CNNs on FPGAs Using Adaptive IPs | Philippe Magalhães, Virginie Fresse, Benoît Suffran, Olivier Alata | 2025-10-03 | Download | The increasing demand for real-time, low-latency artificial intelligence applications has propelled the use of Field-Programmable Gate Arrays (FPGAs) for Convolutional Neural Network (CNN) implementat... |
| A Hardware Accelerator for the Goemans-Williamson Algorithm | D. A. Herrera-Martí, E. Guthmuller, J. Fereyre | 2025-10-03 | Download | The combinatorial problem Max-Cut has become a benchmark in the evaluation of local search heuristics for both quantum and classical optimisers. |
| UPMEM Unleashed: Software Secrets for Speed | Krystian Chmielewski, Jarosław Ławnicki, Uladzislau Lukyanau, Tadeusz Kobus, Maciej Maciejewski | 2025-10-03 | Download | Developing kernels for Processing-In-Memory (PIM) platforms poses unique challenges in data management and parallel programming on limited processing units. |
| TeLLMe v2: An Efficient End-to-End Ternary LLM Prefill and Decode Accelerator with Table-Lookup Matmul on Edge FPGAs | Ye Qiao, Zhiheng Chen, Yifan Zhang, Yian Wang, Sitao Huang | 2025-10-03 | Download | With the emergence of wearable devices and other embedded systems, deploying large language models (LLMs) on edge platforms has become an urgent need. |
| HALO: Memory-Centric Heterogeneous Accelerator with 2.5D Integration for Low-Batch LLM Inference | Shubham Negi, Kaushik Roy | 2025-10-03 | Download | The rapid adoption of Large Language Models (LLMs) has driven a growing demand for efficient inference, particularly in latency-sensitive applications such as chatbots and personalized assistants. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Cosmological Hydrodynamics at Exascale: A Trillion-Particle Leap in Capability | Nicholas Frontiere, J. D. Emberson, Michael Buehlmann, Esteban M. Rangel, Salman Habib, Katrin Heitmann, Patricia Larsen, Vitali Morozov, Adrian Pope, Claude-André Faucher-Giguère, Antigoni Georgiadou, Damien Lebrun-Grandié, Andrey Prokopenko | 2025-10-03 | Download | Resolving the most fundamental questions in cosmology requires simulations that match the scale, fidelity, and physical complexity demanded by next-generation sky surveys. |
| A Lightweight Federated Learning Approach for Privacy-Preserving Botnet Detection in IoT | Taha M. Mahmoud, Naima Kaabouch | 2025-10-03 | Download | The rapid growth of the Internet of Things (IoT) has expanded opportunities for innovation but also increased exposure to botnet-driven cyberattacks. |
| Short-circuiting Rings for Low-Latency AllReduce | Sarah-Michelle Hammer, Stefan Schmid, Rachee Singh, Vamsi Addanki | 2025-10-03 | Download | Efficient collective communication is critical for many distributed ML and HPC applications. In this context, it is widely believed that the Ring algorithm for the AllReduce collective communication o... |
| Paris: A Decentralized Trained Open-Weight Diffusion Model | Zhiying Jiang, Raihan Seraj, Marcos Villagra, Bidhan Roy | 2025-10-03 | Download | We present Paris, the first publicly released diffusion model pre-trained entirely through decentralized computation. Paris demonstrates that high-quality text-to-image generation can be achieved with... |
| Sensors in viticulture: functions, benefits, and data-driven insights | Milan Milenkovic | 2025-10-03 | Download | Use of sensors and related analytical predictions can be a powerful tool in providing data-informed input to viticulturalists' decision process, complementing their vineyard observations and intuition... |
| iDDS: Intelligent Distributed Dispatch and Scheduling for Workflow Orchestration | Wen Guan, Tadashi Maeno, Aleksandr Alekseev, Fernando Harald Barreiro Megino, Kaushik De, Edward Karavakis, Alexei Klimentov, Tatiana Korchuganova, FaHui Lin, Paul Nilsson, Torre Wenaus, Zhaoyu Yang, Xin Zhao | 2025-10-03 | Download | The intelligent Distributed Dispatch and Scheduling (iDDS) service is a versatile workflow orchestration system designed for large-scale, distributed scientific computing. |
| PyRadiomics-cuda: 3D features extraction from medical images for HPC using GPU acceleration | Jakub Lisowski, Piotr Tyrakowski, Szymon Zyguła, Krzysztof Kaczmarski | 2025-10-03 | Download | PyRadiomics-cuda is a GPU-accelerated extension of the PyRadiomics library, designed to address the computational challenges of extracting three-dimensional shape features from medical images. |
| Energy Efficiency in Cloud-Based Big Data Processing for Earth Observation: Gap Analysis and Future Directions | Adhitya Bhawiyuga, Serkan Girgin, Rolf A. de By, Raul Zurita-Milla | 2025-10-03 | Download | Earth observation (EO) data volumes are rapidly increasing. While cloud computing are now used for processing large EO datasets, the energy efficiency aspects of such a processing have received much l... |
| On the energy efficiency of sparse matrix computations on multi-GPU clusters | Massimo Bernaschi, Alessandro Celestini, Pasqua D'Ambra, Giorgio Richelli | 2025-10-03 | Download | We investigate the energy efficiency of a library designed for parallel computations with sparse matrices. The library leverages high-performance, energy-efficient Graphics Processing Unit (GPU) accel... |
| Action Deviation-Aware Inference for Low-Latency Wireless Robots | Jeyoung Park, Yeonsub Lim, Seungeun Oh, Jihong Park, Jinho Choi, Seong-Lyun Kim | 2025-10-03 | Download | To support latency-sensitive AI applications ranging from autonomous driving to industrial robot manipulation, 6G envisions distributed ML with computational resources in mobile, edge, and cloud conne... |
| TridentServe: A Stage-level Serving System for Diffusion Pipelines | Yifei Xia, Fangcheng Fu, Hao Yuan, Hanke Zhang, Xupeng Miao, Yijun Liu, Suhan Ling, Jie Jiang, Bin Cui | 2025-10-03 | Download | Diffusion pipelines, renowned for their powerful visual generation capabilities, have seen widespread adoption in generative vision tasks (e.g., text-to-image/video). |
| Distributed Low-Communication Training with Decoupled Momentum Optimization | Sasho Nedelkoski, Alexander Acker, Odej Kao, Soeren Becker, Dominik Scheinert | 2025-10-03 | Download | The training of large models demands substantial computational resources, typically available only in data centers with high-bandwidth interconnects. |
| UPMEM Unleashed: Software Secrets for Speed | Krystian Chmielewski, Jarosław Ławnicki, Uladzislau Lukyanau, Tadeusz Kobus, Maciej Maciejewski | 2025-10-03 | Download | Developing kernels for Processing-In-Memory (PIM) platforms poses unique challenges in data management and parallel programming on limited processing units. |
| GRNND: A GPU-Parallel Relative NN-Descent Algorithm for Efficient Approximate Nearest Neighbor Graph Construction | Xiang Li, Qiong Chang, Yun Li, Jun Miyazaki | 2025-10-03 | Download | Relative Nearest Neighbor Descent (RNN-Descent) is a state-of-the-art algorithm for constructing sparse approximate nearest neighbor (ANN) graphs by combining the iterative refinement of NN-Descent wi... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| An efficient grey theory-driven path selection for energy efficiency control in the Internet of Things using fog and cloud computing | Mohammad Reza Akbari, Hamid Barati, Ali Barati | 2025-10-03 | Download | Due to the big data exchange on the Internet of Things, proper routing and selecting the best routes for fast data transmission improve network performance. |
| A distributed routing protocol for sending data from things to the cloud leveraging fog technology in the large-scale IoT ecosystem | Mohammad Reza Akbari, Hamid Barati, Ali Barati | 2025-10-03 | Download | Fog computing integrates cloud and edge resources. According to an intelligent and decentralized method, this technology processes data generated by IoT sensors to seamlessly integrate physical and cy... |
| Short-circuiting Rings for Low-Latency AllReduce | Sarah-Michelle Hammer, Stefan Schmid, Rachee Singh, Vamsi Addanki | 2025-10-03 | Download | Efficient collective communication is critical for many distributed ML and HPC applications. In this context, it is widely believed that the Ring algorithm for the AllReduce collective communication o... |
| Scalable Ground Station Selection for Large LEO Constellations | Grace Ra Kim, Duncan Eddy, Vedant Srinivas, Mykel J. Kochenderfer | 2025-10-03 | Download | Effective ground station selection is critical for low Earth orbiting (LEO) satellite constellations to minimize operational costs, maximize data downlink volume, and reduce communication gaps between... |
| Automatic Generation of Digital Twins for Network Testing | Shenjia Ding, David Flynn, Paul Harvey | 2025-10-03 | Download | The increased use of software in the operation and management of telecommunication networks has moved the industry one step closer to realizing autonomous network operation. |
| Corrosion Risk Estimation for Heritage Preservation: An Internet of Things and Machine Learning Approach Using Temperature and Humidity | Reginald Juan M. Mercado, Muhammad Kabeer, Haider Al-Obaidy, Rosdiadee Nordin | 2025-10-03 | Download | Proactive preservation of steel structures at culturally significant heritage sites like the San Sebastian Basilica in the Philippines requires accurate corrosion forecasting. |
| Sequence-Based Deep Learning for Handover Optimization in Dense Urban Cellular Network | Muhammad Kabeer, Rosdiadee Nordin, Mehran Behjati, Lau Sian Lun | 2025-10-03 | Download | Efficient handover management remains a critical challenge in dense urban cellular networks, where high cell density, user mobility, and diverse service demands increase the likelihood of unnecessary ... |
| SoK: Preconfirmations | Aikaterini-Panagiota Stouka, Conor McMenamin, Demetris Kyriacou, Lin Oshitani, Quentin Botha | 2025-10-03 | Download | In recent years, significant research efforts have focused on improving blockchain throughput and confirmation speeds without compromising security. |
| DH-EAC: Design of a Dynamic, Hierarchical Entanglement Access Control Protocol | Akihisa Takahashi, Yoshito Tobe | 2025-10-03 | Download | We propose Dynamic, Hierarchical Entanglement Access Control (DH-EAC), a pure-quantum protocol for fair and anonymous allocation of scarce entanglement across wide-area quantum networks composed of ma... |
| FSMA: Scalable and Reliable LoRa for Non-Terrestrial Networks with Mobile Gateways | Rohith Reddy Vennam, Maiyun Zhang, Raghav Subbaraman, Deepak Vashist, Dinesh Bharadia | 2025-10-03 | Download | The proliferation of Low Earth Orbit (LEO) satellites for universal IoT applications and the growing use of drones in emergency services, agriculture, and military operations highlight the transformat... |
| L4Span: Spanning Congestion Signaling over NextG Networks for Interactive Applications | Haoran Wan, Kyle Jamieson | 2025-10-03 | Download | Design for low latency networking is essential for tomorrow's interactive applications, but it is essential to deploy incrementally and universally at the network's last mile. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Cosmological Hydrodynamics at Exascale: A Trillion-Particle Leap in Capability | Nicholas Frontiere, J. D. Emberson, Michael Buehlmann, Esteban M. Rangel, Salman Habib, Katrin Heitmann, Patricia Larsen, Vitali Morozov, Adrian Pope, Claude-André Faucher-Giguère, Antigoni Georgiadou, Damien Lebrun-Grandié, Andrey Prokopenko | 2025-10-03 | Download | Resolving the most fundamental questions in cosmology requires simulations that match the scale, fidelity, and physical complexity demanded by next-generation sky surveys. |
| Formal Analysis of Metastable Failures in Software Systems | Peter Alvaro, Rebecca Isaacs, Rupak Majumdar, Kiran-Kumar Muniswamy-Reddy, Mahmoud Salamati, Sadegh Soudjani | 2025-10-03 | Download | Many large-scale software systems demonstrate metastable failures. In this class of failures, a stressor such as a temporary spike in workload causes the system performance to drop and, subsequently, ... |
| On the energy efficiency of sparse matrix computations on multi-GPU clusters | Massimo Bernaschi, Alessandro Celestini, Pasqua D'Ambra, Giorgio Richelli | 2025-10-03 | Download | We investigate the energy efficiency of a library designed for parallel computations with sparse matrices. The library leverages high-performance, energy-efficient Graphics Processing Unit (GPU) accel... |
| UPMEM Unleashed: Software Secrets for Speed | Krystian Chmielewski, Jarosław Ławnicki, Uladzislau Lukyanau, Tadeusz Kobus, Maciej Maciejewski | 2025-10-03 | Download | Developing kernels for Processing-In-Memory (PIM) platforms poses unique challenges in data management and parallel programming on limited processing units. |