2025-09-03
cs.AR - Hardware Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| HEEPidermis: a versatile SoC for BioZ recording | Juan Sapriza, Beatrice Grassano, Alessio Naclerio, Filippo Quadri, Tommaso Terzano, David Mallasén, Davide Schiavone, Robin Leplae, Jérémie Moullet, Alexandre Levisse, Christoph Müller, Mariagrazia Graziano, Matías Miguez, David Atienza | 2025-09-03 | Download | Biological impedance (BioZ) is an information-packed modality that allows for non-invasive monitoring of health and emotional state. Currently, most research involving tissue impedance is based on bul... |
| TRACE: Unlocking Effective CXL Bandwidth via Lossless Compression and Precision Scaling | Rui Xie, Asad Ul Haq, Yunhua Fang, Linsen Ma, Zirak Burzin Engineer, Liu Liu, Tong Zhang | 2025-09-03 | Download | LLM inference is increasingly limited by memory bandwidth, and the bottleneck worsens at long context as the KV cache grows. CXL memory adds capacity to offload weights and KV, but its link and device... |
| CapsBeam: Accelerating Capsule Network based Beamformer for Ultrasound Non-Steered Plane Wave Imaging on Field Programmable Gate Array | Abdul Rahoof, Vivek Chaturvedi, Mahesh Raveendranatha Panicker, Muhammad Shafique | 2025-09-03 | Download | In recent years, there has been a growing trend in accelerating computationally complex non-real-time beamforming algorithms in ultrasound imaging using deep learning models. |
| FastCaps: A Design Methodology for Accelerating Capsule Network on Field Programmable Gate Arrays | Abdul Rahoof, Vivek Chaturvedi, Muhammad Shafique | 2025-09-03 | Download | Capsule Network (CapsNet) has shown significant improvement in understanding the variation in images along with better generalization ability compared to traditional Convolutional Neural Network (CNN)... |
| Generalized Methodology for Determining Numerical Features of Hardware Floating-Point Matrix Multipliers: Part I | Faizan A Khattak, Mantas Mikaitis | 2025-09-03 | Download | Numerical features of matrix multiplier hardware units in NVIDIA and AMD data centre GPUs have recently been studied. Features such as rounding, normalisation, and internal precision of the accumulato... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Distributed Download from an External Data Source in Asynchronous Faulty Settings | John Augustine, Soumyottam Chatterjee, Valerie King, Manish Kumar, Shachar Meir, David Peleg | 2025-09-03 | Download | The distributed Data Retrieval (DR) model consists of peers connected by a complete peer-to-peer communication network, and a trusted external data source that stores an array **X** of b... |
| Semi-decentralized Federated Time Series Prediction with Client Availability Budgets | Yunkai Bao, Reza Safarzadeh, Xin Wang, Steve Drew | 2025-09-03 | Download | Federated learning (FL) effectively promotes collaborative training among distributed clients with privacy considerations in the Internet of Things (IoT) scenarios. |
| Combining Performance and Productivity: Accelerating the Network Sensing Graph Challenge with GPUs and Commodity Data Science Software | Siddharth Samsi, Dan Campbell, Emanuel Scoullos, Oded Green | 2025-09-03 | Download | The HPEC Graph Challenge is a collection of benchmarks representing complex workloads that test the hardware and software components of HPC systems, which traditional benchmarks, such as LINPACK, do n... |
| DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling | Yubo Gao, Renbo Tu, Gennady Pekhimenko, Nandita Vijaykumar | 2025-09-03 | Download | Differentially-Private SGD (DP-SGD) is a powerful technique to protect user privacy when using sensitive data to train neural networks. During training, converting model weights and activations into l... |
| CloudFormer: An Attention-based Performance Prediction for Public Clouds with Unknown Workload | Amirhossein Shahbazinia, Darong Huang, Luis Costero, David Atienza | 2025-09-03 | Download | Cloud platforms are increasingly relied upon to host diverse, resource-intensive workloads due to their scalability, flexibility, and cost-efficiency. |
| Efficient and Secure Sleepy Model for BFT Consensus | Pengkun Ren, Hai Dong, Zahir Tari, Pengcheng Zhang | 2025-09-03 | Download | Byzantine Fault Tolerant (BFT) consensus protocols for dynamically available systems face a critical challenge: balancing latency and security in fluctuating node participation. |
| The High Cost of Keeping Warm: Characterizing Overhead in Serverless Autoscaling Policies | Leonid Kondrashov, Boxi Zhou, Hancheng Wang, Dmitrii Ustiugov | 2025-09-03 | Download | Serverless computing is transforming cloud application development, but the performance-cost trade-offs of control plane designs remain poorly understood due to a lack of open, cross-platform benchmar... |
| A description of the radio astronomy data processing tool DDF Pipeline | Mathis Certenais, François Bodin, Laurent Morin | 2025-09-03 | Download | This paper presents the DDF Pipeline, a radio astronomy data processing tool initially designed for the LOw-Frequency ARray (LOFAR) radio-telescope and a candidate for processing data from the Squar... |
| FlashRecovery: Fast and Low-Cost Recovery from Failures for Large-Scale Training of LLMs | Haijun Zhang, Jinxiang Wang, Zhenhua Yu, Yanyong Zhang, Xuejie Ji, Kaining Mao, Jun Zhang, Yaqing Zhang, Ting Wu, Fei Jie, Xiemin Huang, Zhifang Cai, Junhua Cheng, Shuwei Wang, Wei Li, Xiaoming Bao, Hua Xu, Shixiong Zhao, Jun Li, Hongwei Sun, Ziyang Zhang, Yi Xiong, Chunsheng Li | 2025-09-03 | Download | Large language models (LLMs) have made a profound impact across various fields due to their advanced capabilities. However, training these models at unprecedented scales requires extensive AI accelera... |
| Mycroft: Tracing Dependencies in Collective Communication Towards Reliable LLM Training | Yangtao Deng, Lei Zhang, Qinlong Wang, Xiaoyun Zhi, Xinlei Zhang, Zhuo Jiang, Haohan Xu, Lei Wang, Zuquan Song, Gaohong Liu, Yang Bai, Shuguang Wang, Wencong Xiao, Jianxi Ye, Minlan Yu, Hong Xu | 2025-09-03 | Download | Reliability is essential for ensuring efficiency in LLM training. However, many real-world reliability issues remain difficult to resolve, resulting in wasted resources and degraded model performance. |
| Treasure Hunt in Anonymous Graphs with Quantum Pebbles by Oblivious Agents | Gaurav Gaur, Barun Gorain, Rishi Ranjan Singh, Daya Gaur | 2025-09-03 | Download | We investigate the problem of finding a static treasure in anonymous graphs using oblivious agents and introduce a novel approach that leverages quantum information. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Drift Plus Optimistic Penalty: A Learning Framework for Stochastic Network Optimization with Improved Regret Bounds | Sathwik Chadaga, Eytan Modiano | 2025-09-03 | Download | We consider the problem of joint routing and scheduling in queueing networks, where the edge transmission costs are unknown. At each time-slot, the network controller receives noisy observations of tr... |
| Entanglement Purification With Finite Latency Classical Communication in Quantum Networks | Vivek Vasan, Alexander Nico-Katz, Boulat A. Bash, Daniel C. Kilper, Marco Ruffini | 2025-09-03 | Download | Quantum networks rely on high fidelity entangled pairs distributed to nodes, but maintaining their fidelity is challenged by environmental decoherence during storage. |
| Hierarchical Low-Altitude Wireless Network Empowered Air Traffic Management | Ziye Jia, Jia He, Yuanhao Cui, Qiuming Zhu, Ligang Yuan, Fuhui Zhou, Qihui Wu, Dusit Niyato, Zhu Han | 2025-09-03 | Download | As the increasing development of low-altitude aircrafts, the rational design of low-altitude networks directly impacts the aerial safety and resource utilization. |
| Dependency Chain Analysis of ROS 2 DDS QoS Policies: From Lifecycle Tutorial to Static Verification | Sanghoon Lee, Junha Kang, Kyung-Joon Park | 2025-09-03 | Download | Robot Operating System 2 (ROS 2) relies on the Data Distribution Service (DDS), which offers more than 20 Quality of Service (QoS) policies governing availability, reliability, and resource usage. |
| Machine Learning-Driven Anomaly Detection for 5G O-RAN Performance Metrics | Babak Azkaei, Kishor Chandra Joshi, George Exarchakos | 2025-09-03 | Download | The ever-increasing reliance of critical services on network infrastructure coupled with the increased operational complexity of beyond-5G/6G networks necessitate the need for proactive and automated ... |
| Multi-layer Digital Twin System for Future Mobile Metaverse | Gaosheng Zhao, Dong In Kim | 2025-09-03 | Download | In the upcoming 6G era, the communication networks are expected to face unprecedented challenges in terms of complexity and dynamics. Digital Twin (DT) technology, with its various digital capabilitie... |
| Closing the Visibility Gap: A Monitoring Framework for Verifiable Open RAN Operations | Hexuan Yu, Md Mohaimin Al Barat, Yang Xiao, Y. Thomas Hou, Wenjing Lou | 2025-09-03 | Download | Open Radio Access Network (Open RAN) is reshaping mobile network architecture by promoting openness, disaggregation, and cross-vendor interoperability. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| CloudFormer: An Attention-based Performance Prediction for Public Clouds with Unknown Workload | Amirhossein Shahbazinia, Darong Huang, Luis Costero, David Atienza | 2025-09-03 | Download | Cloud platforms are increasingly relied upon to host diverse, resource-intensive workloads due to their scalability, flexibility, and cost-efficiency. |
| Estudio de la eficiencia en la escalabilidad de GPUs para el entrenamiento de Inteligencia Artificial | David Cortes, Carlos Juiz, Belen Bermejo | 2025-09-03 | Download | Training large-scale deep learning models has become a key challenge for the scientific community and industry. While the massive use of GPUs can significantly speed up training times, this approach h... |