2025-12-05
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| SparsePixels: Efficient Convolution for Sparse Data on FPGAs | Ho Fung Tsoi, Dylan Rankin, Vladimir Loncar, Philip Harris | 2025-12-05 | Download | Inference of standard convolutional neural networks (CNNs) on FPGAs often incurs high latency and a long initiation interval due to the deep nested loops required to densely convolve every input pixel... |
| From PyTorch to Calyx: An Open-Source Compiler Toolchain for ML Accelerators | Jiahan Xie, Evan Williams, Adrian Sampson | 2025-12-05 | Download | We present an end-to-end open-source compiler toolchain that targets synthesizable SystemVerilog from ML models written in PyTorch. Our toolchain leverages the accelerator design language Allo, the ha... |
| Hardware Software Optimizations for Fast Model Recovery on Reconfigurable Architectures | Bin Xu, Ayan Banerjee, Sandeep Gupta | 2025-12-05 | Download | Model Recovery (MR) is a core primitive for physical AI and real-time digital twins, but GPUs often execute MR inefficiently due to iterative dependencies, kernel-launch overheads, underutilized memor... |
| Mapping Space Exploration for Multi-Chiplet Accelerators Targeting LLM Inference Serving Workloads | Boyu Li, Zongwei Zhu, Yi Xiong, Qianyue Cao, Jiawei Geng, Xiaonan Zhang, Xi Li | 2025-12-05 | Download | Large Language Models (LLMs) impose massive computational demands, driving the need for scalable multi-chiplet accelerators. However, existing mapping space exploration efforts for such accelerators p... |
| ChipMind: Retrieval-Augmented Reasoning for Long-Context Circuit Design Specifications | Changwen Xing, SamZaak Wong, Xinlai Wan, Yanfeng Lu, Mengli Zhang, Zebin Ma, Lei Qi, Zhengxiong Li, Nan Guan, Zhe Jiang, Xi Wang, Jun Yang | 2025-12-05 | Download | While Large Language Models (LLMs) demonstrate immense potential for automating integrated circuit (IC) development, their practical deployment is fundamentally limited by restricted context windows. |
| First Demonstration of Second-order Training of Deep Neural Networks with In-memory Analog Matrix Computing | Saitao Zhang, Yubiao Luo, Shiqing Wang, Pushen Zuo, Yongxiang Li, Lunshuai Pan, Zheng Miao, Zhong Sun | 2025-12-05 | Download | Second-order optimization methods, which leverage curvature information, offer faster and more stable convergence than first-order methods such as stochastic gradient descent (SGD) and Adam. |
| When Forgetting Builds Reliability: LLM Unlearning for Reliable Hardware Code Generation | Yiwen Liang, Qiufeng Li, Shikai Wang, Weidong Cao | 2025-12-05 | Download | Large Language Models (LLMs) have shown strong potential in accelerating digital hardware design through automated code generation. Yet, ensuring their reliability remains a critical challenge, as exi... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Evaluation Framework for Centralized and Decentralized Aggregation Algorithm in Federated Systems | Sumit Chongder | 2025-12-05 | Download | In recent years, the landscape of federated learning has witnessed significant advancements, particularly in decentralized methodologies. This research paper presents a comprehensive comparison of Cen... |
| Metronome: Differentiated Delay Scheduling for Serverless Functions | Zhuangbin Chen, Juzheng Zheng, Zibin Zheng | 2025-12-05 | Download | Function-as-a-Service (FaaS) computing is an emerging cloud computing paradigm for its ease-of-management and elasticity. However, optimizing scheduling for serverless functions remains challenging du... |
| Are Bus-Mounted Edge Servers Feasible? | Xuezhi Li, Jiancong He, Ming Xie, Xuyang Chen, Le Chang, Li Jiang, Gui Gui | 2025-12-05 | Download | Placement of edge servers is the prerequisite of provisioning edge computing services for Internet of Vehicles (IoV). Fixed-site edge servers at Road Side Units (RSUs) or base stations are able to off... |
| Compiler-supported reduced precision and AoS-SoA transformations for heterogeneous hardware | Pawel K. Radtke, Tobias Weinzierl | 2025-12-05 | Download | This study evaluates AoS-to-SoA transformations over reduced-precision data layouts for a particle simulation code on several GPU platforms: We hypothesize that SoA fits particularly well to SIMT, whi... |
| Model Gateway: Model Management Platform for Model-Driven Drug Discovery | Yan-Shiun Wu, Nathan A. Morin | 2025-12-05 | Download | This paper presents the Model Gateway, a management platform for managing machine learning (ML) and scientific computational models in the drug discovery pipeline. |
| FedGMR: Federated Learning with Gradual Model Restoration under Asynchrony and Model Heterogeneity | Chengjie Ma, Seungeun Oh, Jihong Park, Seong-Lyun Kim | 2025-12-05 | Download | Federated learning (FL) holds strong potential for distributed machine learning, but in heterogeneous environments, Bandwidth-Constrained Clients (BCCs) often struggle to participate effectively due t... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| AIMNET: An IoT-Empowered Digital Twin for Continuous Gas Emission Monitoring and Early Hazard Detection | Zifan Zhou, Xuan Wang, Yang Yan, Lkhanaajav Mijiddorj, Yu Ding, Tyler Beringer, Parisa Masnadi Khiabani, Wolfgang G. Jentner, Xiao-Ming Hu, Chenghao Wang, Bryan M. Carroll, Ming Xue, David Ebert, Bin Li, Binbin Weng | 2025-12-05 | Download | A Digital Twin (DT) framework to enhance carbon-based gas plume monitoring is critical for supporting timely and effective mitigation responses to environmental hazards such as industrial gas leaks, o... |
| Cross-Domain Elephant Flow Detection: A Unified Machine Learning Approach with Application-Aware and Security Features | Tabidah Usmani, Sara Zahid, Amna Javaid | 2025-12-05 | Download | Network traffic classification, particularly elephant flow detection, faces significant challenges when deployed across heterogeneous network environments. |
| AIORA: An AI-Native Multi-Stakeholder Orchestration Architecture for 6G Continuum | Nuria Molner, Luis Rosa, Fulvio Risso, Konstantinos Samdanis, David Artuñedo, Rob Smets, Tarik Taleb, David Gomez-Barquero | 2025-12-05 | Download | This paper elaborates on a novel AI-native architecture for emerging 6G systems harnessing open APIs, along with supporting mechanisms to empower intelligent and coordinated orchestration of edge-clou... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Compiling Away the Overhead of Race Detection | Alexey Paznikov, Andrey Kogutenko, Yaroslav Osipov, Michael Schwarz, Umang Mathur | 2025-12-05 | Download | Dynamic data race detectors are indispensable for flagging concurrency errors in software, but their high runtime overhead limits their adoption. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Dissecting Embedding Bag Performance in DLRM Inference | Chandrish Ambati, Jing Ding, Trung Diep | 2025-12-05 | Download | As the size of DLRMs gets larger, the models must be partitioned across multiple GPUs or nodes of GPUs due to the size limitation of total HBM memory that can be packaged in a GPU. |