2025-09-03

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| HEEPidermis: a versatile SoC for BioZ recording | Juan Sapriza, Beatrice Grassano, Alessio Naclerio, Filippo Quadri, Tommaso Terzano, David Mallasén, Davide Schiavone, Robin Leplae, Jérémie Moullet, Alexandre Levisse, Christoph Müller, Mariagrazia Graziano, Matías Miguez, David Atienza | 2025-09-03 | Download | Biological impedance (BioZ) is an information-packed modality that allows for non-invasive monitoring of health and emotional state. Currently, most research involving tissue impedance is based on bul... |
| TRACE: Unlocking Effective CXL Bandwidth via Lossless Compression and Precision Scaling | Rui Xie, Asad Ul Haq, Yunhua Fang, Linsen Ma, Zirak Burzin Engineer, Liu Liu, Tong Zhang | 2025-09-03 | Download | LLM inference is increasingly limited by memory bandwidth, and the bottleneck worsens at long context as the KV cache grows. CXL memory adds capacity to offload weights and KV, but its link and device... |
| CapsBeam: Accelerating Capsule Network based Beamformer for Ultrasound Non-Steered Plane Wave Imaging on Field Programmable Gate Array | Abdul Rahoof, Vivek Chaturvedi, Mahesh Raveendranatha Panicker, Muhammad Shafique | 2025-09-03 | Download | In recent years, there has been a growing trend in accelerating computationally complex non-real-time beamforming algorithms in ultrasound imaging using deep learning models. |
| FastCaps: A Design Methodology for Accelerating Capsule Network on Field Programmable Gate Arrays | Abdul Rahoof, Vivek Chaturvedi, Muhammad Shafique | 2025-09-03 | Download | Capsule Network (CapsNet) has shown significant improvement in understanding the variation in images along with better generalization ability compared to traditional Convolutional Neural Network (CNN)... |
| Generalized Methodology for Determining Numerical Features of Hardware Floating-Point Matrix Multipliers: Part I | Faizan A Khattak, Mantas Mikaitis | 2025-09-03 | Download | Numerical features of matrix multiplier hardware units in NVIDIA and AMD data centre GPUs have recently been studied. Features such as rounding, normalisation, and internal precision of the accumulato... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Distributed Download from an External Data Source in Asynchronous Faulty Settings | John Augustine, Soumyottam Chatterjee, Valerie King, Manish Kumar, Shachar Meir, David Peleg | 2025-09-03 | Download | The distributed Data Retrieval (DR) model consists of $k$ peers connected by a complete peer-to-peer communication network, and a trusted external data source that stores an array $\textbf{X}$ of $n$ b... |
| Semi-decentralized Federated Time Series Prediction with Client Availability Budgets | Yunkai Bao, Reza Safarzadeh, Xin Wang, Steve Drew | 2025-09-03 | Download | Federated learning (FL) effectively promotes collaborative training among distributed clients with privacy considerations in the Internet of Things (IoT) scenarios. |
| Combining Performance and Productivity: Accelerating the Network Sensing Graph Challenge with GPUs and Commodity Data Science Software | Siddharth Samsi, Dan Campbell, Emanuel Scoullos, Oded Green | 2025-09-03 | Download | The HPEC Graph Challenge is a collection of benchmarks representing complex workloads that test the hardware and software components of HPC systems, which traditional benchmarks, such as LINPACK, do n... |
| DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling | Yubo Gao, Renbo Tu, Gennady Pekhimenko, Nandita Vijaykumar | 2025-09-03 | Download | Differentially-Private SGD (DP-SGD) is a powerful technique to protect user privacy when using sensitive data to train neural networks. During training, converting model weights and activations into l... |
| CloudFormer: An Attention-based Performance Prediction for Public Clouds with Unknown Workload | Amirhossein Shahbazinia, Darong Huang, Luis Costero, David Atienza | 2025-09-03 | Download | Cloud platforms are increasingly relied upon to host diverse, resource-intensive workloads due to their scalability, flexibility, and cost-efficiency. |
| Efficient and Secure Sleepy Model for BFT Consensus | Pengkun Ren, Hai Dong, Zahir Tari, Pengcheng Zhang | 2025-09-03 | Download | Byzantine Fault Tolerant (BFT) consensus protocols for dynamically available systems face a critical challenge: balancing latency and security in fluctuating node participation. |
| The High Cost of Keeping Warm: Characterizing Overhead in Serverless Autoscaling Policies | Leonid Kondrashov, Boxi Zhou, Hancheng Wang, Dmitrii Ustiugov | 2025-09-03 | Download | Serverless computing is transforming cloud application development, but the performance-cost trade-offs of control plane designs remain poorly understood due to a lack of open, cross-platform benchmar... |
| A description of the radio astronomy data processing tool DDF Pipeline | Mathis Certenais, François Bodin, Laurent Morin | 2025-09-03 | Download | This paper presents the DDF Pipeline, a radio astronomy data processing tool initially designed for the LOw-Frequency ARray (LOFAR) radio-telescope and a candidate for processing data from the Squar... |
| FlashRecovery: Fast and Low-Cost Recovery from Failures for Large-Scale Training of LLMs | Haijun Zhang, Jinxiang Wang, Zhenhua Yu, Yanyong Zhang, Xuejie Ji, Kaining Mao, Jun Zhang, Yaqing Zhang, Ting Wu, Fei Jie, Xiemin Huang, Zhifang Cai, Junhua Cheng, Shuwei Wang, Wei Li, Xiaoming Bao, Hua Xu, Shixiong Zhao, Jun Li, Hongwei Sun, Ziyang Zhang, Yi Xiong, Chunsheng Li | 2025-09-03 | Download | Large language models (LLMs) have made a profound impact across various fields due to their advanced capabilities. However, training these models at unprecedented scales requires extensive AI accelera... |
| Mycroft: Tracing Dependencies in Collective Communication Towards Reliable LLM Training | Yangtao Deng, Lei Zhang, Qinlong Wang, Xiaoyun Zhi, Xinlei Zhang, Zhuo Jiang, Haohan Xu, Lei Wang, Zuquan Song, Gaohong Liu, Yang Bai, Shuguang Wang, Wencong Xiao, Jianxi Ye, Minlan Yu, Hong Xu | 2025-09-03 | Download | Reliability is essential for ensuring efficiency in LLM training. However, many real-world reliability issues remain difficult to resolve, resulting in wasted resources and degraded model performance. |
| Treasure Hunt in Anonymous Graphs with Quantum Pebbles by Oblivious Agents | Gaurav Gaur, Barun Gorain, Rishi Ranjan Singh, Daya Gaur | 2025-09-03 | Download | We investigate the problem of finding a static treasure in anonymous graphs using oblivious agents and introduce a novel approach that leverages quantum information. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Drift Plus Optimistic Penalty: A Learning Framework for Stochastic Network Optimization with Improved Regret Bounds | Sathwik Chadaga, Eytan Modiano | 2025-09-03 | Download | We consider the problem of joint routing and scheduling in queueing networks, where the edge transmission costs are unknown. At each time-slot, the network controller receives noisy observations of tr... |
| Entanglement Purification With Finite Latency Classical Communication in Quantum Networks | Vivek Vasan, Alexander Nico-Katz, Boulat A. Bash, Daniel C. Kilper, Marco Ruffini | 2025-09-03 | Download | Quantum networks rely on high fidelity entangled pairs distributed to nodes, but maintaining their fidelity is challenged by environmental decoherence during storage. |
| Hierarchical Low-Altitude Wireless Network Empowered Air Traffic Management | Ziye Jia, Jia He, Yuanhao Cui, Qiuming Zhu, Ligang Yuan, Fuhui Zhou, Qihui Wu, Dusit Niyato, Zhu Han | 2025-09-03 | Download | As the increasing development of low-altitude aircrafts, the rational design of low-altitude networks directly impacts the aerial safety and resource utilization. |
| Dependency Chain Analysis of ROS 2 DDS QoS Policies: From Lifecycle Tutorial to Static Verification | Sanghoon Lee, Junha Kang, Kyung-Joon Park | 2025-09-03 | Download | Robot Operating System 2 (ROS 2) relies on the Data Distribution Service (DDS), which offers more than 20 Quality of Service (QoS) policies governing availability, reliability, and resource usage. |
| Machine Learning-Driven Anomaly Detection for 5G O-RAN Performance Metrics | Babak Azkaei, Kishor Chandra Joshi, George Exarchakos | 2025-09-03 | Download | The ever-increasing reliance of critical services on network infrastructure coupled with the increased operational complexity of beyond-5G/6G networks necessitate the need for proactive and automated ... |
| Multi-layer Digital Twin System for Future Mobile Metaverse | Gaosheng Zhao, Dong In Kim | 2025-09-03 | Download | In the upcoming 6G era, the communication networks are expected to face unprecedented challenges in terms of complexity and dynamics. Digital Twin (DT) technology, with its various digital capabilitie... |
| Closing the Visibility Gap: A Monitoring Framework for Verifiable Open RAN Operations | Hexuan Yu, Md Mohaimin Al Barat, Yang Xiao, Y. Thomas Hou, Wenjing Lou | 2025-09-03 | Download | Open Radio Access Network (Open RAN) is reshaping mobile network architecture by promoting openness, disaggregation, and cross-vendor interoperability. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CloudFormer: An Attention-based Performance Prediction for Public Clouds with Unknown Workload | Amirhossein Shahbazinia, Darong Huang, Luis Costero, David Atienza | 2025-09-03 | Download | Cloud platforms are increasingly relied upon to host diverse, resource-intensive workloads due to their scalability, flexibility, and cost-efficiency. |
| Estudio de la eficiencia en la escalabilidad de GPUs para el entrenamiento de Inteligencia Artificial | David Cortes, Carlos Juiz, Belen Bermejo | 2025-09-03 | Download | Training large-scale deep learning models has become a key challenge for the scientific community and industry. While the massive use of GPUs can significantly speed up training times, this approach h... |