
2025-03-21

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Improving Quantization with Post-Training Model Expansion | Giuseppe Franco, Pablo Monteagudo-Lago, Ian Colbert, Nicholas Fraser, Michaela Blott | 2025-03-21 | Download | The size of a model has been a strong predictor of its quality, as well as its cost. As such, the trade-off between model cost and quality has been well-studied. |
| Register Dispersion: Reducing the Footprint of the Vector Register File in Vector Engines of Low-Cost RISC-V CPUs | Vasileios Titopoulos, George Alexakis, Kosmas Alexandridis, Chrysostomos Nicopoulos, Giorgos Dimitrakopoulos | 2025-03-21 | Download | The deployment of Machine Learning (ML) applications at the edge on resource-constrained devices has accentuated the need for efficient ML processing on low-cost processors. |
| Achieving Dependability of AI Execution with Radiation Hardened Processors | Carlos Rafael Tordoya Taquichiri, Hans Dermot Doran, Pablo Ghiglino, Mandar Harshe | 2025-03-21 | Download | The reliance on radiation-hardened hardware, essential for domains requiring high-dependability such as space, nuclear energy and medical applications, severely restricts the choice of components avai... |
| Arm DynamIQ Shared Unit and Real-Time: An Empirical Evaluation | Ashutosh Pradhan, Daniele Ottaviano, Yi Jiang, Haozheng Huang, Alexander Zuepke, Andrea Bastoni, Marco Caccamo | 2025-03-21 | Download | The increasing complexity of embedded hardware platforms poses significant challenges for real-time workloads. Architectural features such as Intel RDT, Arm QoS, and Arm MPAM are either unavailable on... |
| Work-In-Progress: Accelerating Numpy With OpenBLAS For Open-Source RISC-V Chips | Cyril Koenig, Enrico Zelioli, Frank K. Gürkaynak, Luca Benini | 2025-03-21 | Download | RISC-V allows for building general-purpose computing platforms with programmable accelerators around a single open-source ISA. However, leveraging heterogeneous SoCs within high-level applications is ... |
| Fused-Tiled Layers: Minimizing Data Movement on RISC-V SoCs with Software-Managed Caches | Victor J. B. Jung, Alessio Burrello, Francesco Conti, Luca Benini | 2025-03-21 | Download | The success of DNNs and their high computational requirements pushed for large codesign efforts aiming at DNN acceleration. Since DNNs can be represented as static computational graphs, static memory ... |
| MemPool Flavors: Between Versatility and Specialization in a RISC-V Manycore Cluster | Sergio Mazzola, Yichao Zhang, Marco Bertuletti, Diyou Shen, Luca Benini | 2025-03-21 | Download | As computational paradigms evolve, applications such as attention-based models, wireless telecommunications, and computer vision impose increasingly challenging requirements on computer architectures:... |
| On-Sensor Convolutional Neural Networks with Early-Exits | Hazem Hesham Yousef Shalby, Arianna De Vecchi, Alice Scandelli, Pietro Bartoli, Diana Trojaniello, Manuel Roveri, Federica Villa | 2025-03-21 | Download | Tiny Machine Learning (TinyML) is a novel research field aiming at integrating Machine Learning (ML) within embedded devices with limited memory, computation, and energy. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Intelligent Resource Allocation Optimization for Cloud Computing via Machine Learning | Yuqing Wang, Xiao Yang | 2025-03-21 | Download | With the rapid expansion of cloud computing applications, optimizing resource allocation has become crucial for improving system performance and cost efficiency. |
| Serinv: A Scalable Library for the Selected Inversion of Block-Tridiagonal with Arrowhead Matrices | Vincent Maillou, Lisa Gaedke-Merzhaeuser, Alexandros Nikolaos Ziogas, Olaf Schenk, Mathieu Luisier | 2025-03-21 | Download | The inversion of structured sparse matrices is a key but computationally and memory-intensive operation in many scientific applications. There are cases, however, where only particular entries of the ... |
| Energy Efficiency trends in HPC: what high-energy and astrophysicists need to know | Estela Suarez, Jorge Amaya, Martin Frank, Oliver Freyermuth, Maria Girone, Bartosz Kostrzewa, Susanne Pfalzner | 2025-03-21 | Download | The growing energy demands of HPC systems have made energy efficiency a critical concern for system developers and operators. However, HPC users are generally less aware of how these energy concerns i... |
| Achieving Dependability of AI Execution with Radiation Hardened Processors | Carlos Rafael Tordoya Taquichiri, Hans Dermot Doran, Pablo Ghiglino, Mandar Harshe | 2025-03-21 | Download | The reliance on radiation-hardened hardware, essential for domains requiring high-dependability such as space, nuclear energy and medical applications, severely restricts the choice of components avai... |
| Analyzing Performance Bottlenecks in Zero-Knowledge Proof Based Rollups on Ethereum | Md. Ahsan Habib | 2025-03-21 | Download | Blockchain technology is rapidly evolving, with scalability remaining one of its most significant challenges. While various solutions have been proposed and continue to be developed, it is essential t... |
| LoGoFair: Post-Processing for Local and Global Fairness in Federated Learning | Li Zhang, Chaochao Chen, Zhongxuan Han, Qiyong Zhong, Xiaolin Zheng | 2025-03-21 | Download | Federated learning (FL) has garnered considerable interest for its capability to learn from decentralized data sources. Given the increasing application of FL in decision-making scenarios, addressing ... |
| Robustness of deep learning classification to adversarial input on GPUs: asynchronous parallel accumulation is a source of vulnerability | Sanjif Shanmugavelu, Mathieu Taillefumier, Christopher Culver, Vijay Ganesh, Oscar Hernandez, Ada Sedova | 2025-03-21 | Download | The ability of machine learning (ML) classification models to resist small, targeted input perturbations -- known as adversarial attacks -- is a key measure of their safety and reliability. |
| MemPool Flavors: Between Versatility and Specialization in a RISC-V Manycore Cluster | Sergio Mazzola, Yichao Zhang, Marco Bertuletti, Diyou Shen, Luca Benini | 2025-03-21 | Download | As computational paradigms evolve, applications such as attention-based models, wireless telecommunications, and computer vision impose increasingly challenging requirements on computer architectures:... |
| Improving the End-to-End Efficiency of Offline Inference for Multi-LLM Applications Based on Sampling and Simulation | Jingzhi Fang, Yanyan Shen, Yue Wang, Lei Chen | 2025-03-21 | Download | As large language models (LLMs) have shown great success in many tasks, they are used in various applications. While a lot of works have focused on the efficiency of single-LLM application (e.g. |
| Federated Cross-Domain Click-Through Rate Prediction With Large Language Model Augmentation | Jiangcheng Qin, Xueyuan Zhang, Baisong Liu, Jiangbo Qian, Yangyang Wang | 2025-03-21 | Download | Accurately predicting click-through rates (CTR) under stringent privacy constraints poses profound challenges, particularly when user-item interactions are sparse and fragmented across domains. |
| DeFT: Mitigating Data Dependencies for Flexible Communication Scheduling in Distributed Training | Lin Meng, Yuzhong Sun | 2025-03-21 | Download | Communication scheduling aims to reduce communication bottlenecks in data parallel training (DP) by maximizing the overlap between computation and communication. |
| Local Ratio based Real-time Job Offloading and Resource Allocation in Mobile Edge Computing | Chuanchao Gao, Arvind Easwaran | 2025-03-21 | Download | Mobile Edge Computing (MEC) has emerged as a promising paradigm enabling vehicles to handle computation-intensive and time-sensitive applications for intelligent transportation. |
| CoBRA: A Universal Strategyproof Confirmation Protocol for Quorum-based Proof-of-Stake Blockchains | Zeta Avarikioti, Eleftherios Kokoris Kogias, Ray Neiheiser, Christos Stefo | 2025-03-21 | Download | The security of many Proof-of-Stake (PoS) payment systems relies on quorum-based State Machine Replication (SMR) protocols. While classical analyses assume purely Byzantine faults, real-world systems ... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| P4sim: Programming Protocol-independent Packet Processors in ns-3 | Mingyu Ma, Giang T. Nguyen | 2025-03-21 | Download | Programmable data planes enable users to design data plane algorithms for network devices, providing extensive flexibility for network customization. |
| Commercial Dishes Can Be My Ladder: Sustainable and Collaborative Data Offloading in LEO Satellite Networks | Yi Ching Chou, Long Chen, Hengzhi Wang, Feng Wang, Hao Fang, Haoyuan Zhao, Miao Zhang, Xiaoyi Fan, Jiangchuan Liu | 2025-03-21 | Download | Low Earth Orbit (LEO) satellite networks, characterized by their high data throughput and low latency, have gained significant interest from both industry and academia. |
| Transfer Learning for EDFA Gain Modeling: A Semi-Supervised Approach Using Internal Amplifier Features | Agastya Raj, Dan Kilper, Marco Ruffini | 2025-03-21 | Download | The gain spectrum of an Erbium-Doped Fiber Amplifier (EDFA) has a complex dependence on channel loading, pump power, and operating mode, making accurate modeling difficult to achieve. |
| Interference Identification in Multi-User Optical Spectrum as a Service using Convolutional Neural Networks | Agastya Raj, Zehao Wang, Frank Slyne, Tingjun Chen, Dan Kilper, Marco Ruffini | 2025-03-21 | Download | We introduce a ML-based architecture for network operators to detect impairments from specific OSaaS users while blind to the users' internal spectrum details. |
| Multi-Span Optical Power Spectrum Evolution Modeling using ML-based Multi-Decoder Attention Framework | Agastya Raj, Zehao Wang, Frank Slyne, Tingjun Chen, Dan Kilper, Marco Ruffini | 2025-03-21 | Download | We implement a ML-based attention framework with component-specific decoders, improving optical power spectrum prediction in multi-span networks. |
| Demonstration of Cooperative Transport Interface over Open Source 7.2 split RAN and Virtualised Open PON Network | Merim Dzaferagic, Kevin O'Sullivan, Bruce Richardson, Brendan Ryan, Niall Power, Robin Giller, Marco Ruffini | 2025-03-21 | Download | We demonstrate end-to-end 5G Open RAN over PON using off-the-shelf open networking hardware and open source RAN software. The implementation of the Cooperative Transport Interface provides timely sync... |
| Governance of Ledger-Anchored Decentralized Identifiers | Sandro Rodriguez Garzon, Carlo Segat, Axel Küpper | 2025-03-21 | Download | A Decentralized Identifier (DID) empowers an entity to prove control over a unique and self-issued identifier without relying on any identity provider. |
| Joint Beamforming and Trajectory Optimization for Multi-UAV-Assisted Integrated Sensing and Communication Systems | Yan Kyaw Tun, Nway Nway Ei, Sheikh Salman Hassan, Cedomir Stefanovic, Nguyen Van Huynh, Madyan Alsenwi, Choong Seon Hong | 2025-03-21 | Download | In this paper, we investigate beamforming design and trajectory optimization for a multi-unmanned aerial vehicle (UAV)-assisted integrated sensing and communication (ISAC) system. |
| Indoor Localization Based on MSC Map | Łukasz Kułacz, Adrian Kliks, Julius Ruseckas, Gediminas Molis | 2025-03-21 | Download | In this short paper, we propose a technique for AI-based identification of modulation and coding schemes (MCS) in surrounding cellular signals. |
| Rotatable RIS-Assisted Edge Computing: Orientation, Task Offloading, and Resource Optimization | Bin Li, Dongdong Yang, Lei Liu | 2025-03-21 | Download | The rotatable reconfigurable intelligent surface (RIS) can enhance mobile edge computing (MEC) performance by optimizing its orientation to improve the gain of received and transmitted signals. |
| Betweenness Centrality Based Dynamic Source Routing for Flying Ad Hoc Networks in Marching Formation | Shaoshi Yang, Wei Zhao, Chu-Meng Wang, Wen-Yu Dong, Xiaojie Ju | 2025-03-21 | Download | Designing high-performance routing protocols for flying ad hoc networks (FANETs) is challenging due to the diversity of applications and the dynamics of network topology. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Serinv: A Scalable Library for the Selected Inversion of Block-Tridiagonal with Arrowhead Matrices | Vincent Maillou, Lisa Gaedke-Merzhaeuser, Alexandros Nikolaos Ziogas, Olaf Schenk, Mathieu Luisier | 2025-03-21 | Download | The inversion of structured sparse matrices is a key but computationally and memory-intensive operation in many scientific applications. There are cases, however, where only particular entries of the ... |
| Arm DynamIQ Shared Unit and Real-Time: An Empirical Evaluation | Ashutosh Pradhan, Daniele Ottaviano, Yi Jiang, Haozheng Huang, Alexander Zuepke, Andrea Bastoni, Marco Caccamo | 2025-03-21 | Download | The increasing complexity of embedded hardware platforms poses significant challenges for real-time workloads. Architectural features such as Intel RDT, Arm QoS, and Arm MPAM are either unavailable on... |
| V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms | Javier J. Poveda Rodrigo, Mohamed Amine Ahmdi, Alessio Burrello, Daniele Jahier Pagliari, Luca Benini | 2025-03-21 | Download | The recent exponential growth of Large Language Models (LLMs) has relied on GPU-based systems. However, CPUs are emerging as a flexible and lower-cost alternative, especially when targeting inference ... |
