2025-04-04

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A Reconfigurable Time-Domain In-Memory Computing Macro using FeFET-Based CAM with Multilevel Delay Calibration in 28 nm CMOS | Jeries Mattar, Mor M. Dahan, Stefan Dunkel, Halid Mulaosmanovic, Gunda Beernink, Sven Beyer, Eilam Yalon, Nicolás Wainstein | 2025-04-04 | Download | Time-domain nonvolatile in-memory computing (TD-nvIMC) offers a promising pathway to reduce data movement and improve energy efficiency by encoding computation in delay rather than voltage or current. |
| RealProbe: An Automated and Lightweight Performance Profiler for In-FPGA Execution of High-Level Synthesis Designs | Jiho Kim, Cong Hao | 2025-04-04 | Download | High-level synthesis (HLS) accelerates FPGA design by rapidly generating diverse implementations using optimization directives. However, even with cycle-accurate C/RTL co-simulation, the reported cloc... |
| Performance Analysis of HPC applications on the Aurora Supercomputer: Exploring the Impact of HBM-Enabled Intel Xeon Max CPUs | Huda Ibeid, Vikram Narayana, Jeongnim Kim, Anthony Nguyen, Vitali Morozov, Ye Luo | 2025-04-04 | Download | The Aurora supercomputer is an exascale-class system designed to tackle some of the most demanding computational workloads. Equipped with both High Bandwidth Memory (HBM) and DDR memory, it provides u... |
| PHOENIX: Pauli-Based High-Level Optimization Engine for Instruction Execution on NISQ Devices | Zhaohui Yang, Dawei Ding, Chenghong Zhu, Jianxin Chen, Yuan Xie | 2025-04-04 | Download | Variational quantum algorithms (VQA) based on Hamiltonian simulation represent a specialized class of quantum programs well-suited for near-term quantum computing applications due to its modest resour... |
| NDFT: Accelerating Density Functional Theory Calculations via Hardware/Software Co-Design on Near-Data Computing System | Qingcai Jiang, Buxin Tu, Xiaoyu Hao, Junshi Chen, Hong An | 2025-04-04 | Download | Linear-response time-dependent Density Functional Theory (LR-TDDFT) is a widely used method for accurately predicting the excited-state properties of physical systems. |
| Linear Decomposition of the Majority Boolean Function using the Ones on Smaller Variables | Anupam Chattopadhyay, Debjyoti Bhattacharjee, Subhamoy Maitra | 2025-04-04 | Download | A long-investigated problem in circuit complexity theory is to decompose an $n$-input or $n$-variable Majority Boolean function (call it $M_n$) using $k$-input ones ($M_k$), $k < n$, where the objecti... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Secure Federated XGBoost with CUDA-accelerated Homomorphic Encryption via NVIDIA FLARE | Ziyue Xu, Yuan-Ting Hsieh, Zhihong Zhang, Holger R. Roth, Chester Chen, Yan Cheng, Andrew Feng | 2025-04-04 | Download | Federated learning (FL) enables collaborative model training across decentralized datasets. NVIDIA FLARE's Federated XGBoost extends the popular XGBoost algorithm to both vertical and horizontal feder... |
| Accurate GPU Memory Prediction for Deep Learning Jobs through Dynamic Analysis | Jiabo Shi, Yehia Elkhatib | 2025-04-04 | Download | The benefits of Deep Learning (DL) impose significant pressure on GPU resources, particularly within GPU cluster, where Out-Of-Memory (OOM) errors present a primary impediment to model training and ef... |
| HeterMoE: Efficient Training of Mixture-of-Experts Models on Heterogeneous GPUs | Yongji Wu, Xueshen Liu, Shuowei Jin, Ceyu Xu, Feng Qian, Z. Morley Mao, Matthew Lentz, Danyang Zhuo, Ion Stoica | 2025-04-04 | Download | The Mixture-of-Experts (MoE) architecture has become increasingly popular as a method to scale up large language models (LLMs). To save costs, heterogeneity-aware training solutions have been proposed... |
| Performance Analysis of HPC applications on the Aurora Supercomputer: Exploring the Impact of HBM-Enabled Intel Xeon Max CPUs | Huda Ibeid, Vikram Narayana, Jeongnim Kim, Anthony Nguyen, Vitali Morozov, Ye Luo | 2025-04-04 | Download | The Aurora supercomputer is an exascale-class system designed to tackle some of the most demanding computational workloads. Equipped with both High Bandwidth Memory (HBM) and DDR memory, it provides u... |
| An overview of the efficiency and censorship-resistance guarantees of widely-used consensus protocols | Orestis Alpos, Bernardo David, Nikolas Kamarinakis, Dionysis Zindros | 2025-04-04 | Download | Censorship resistance with short-term inclusion guarantees is an important feature of decentralized systems, missing from many state-of-the-art and even deployed consensus protocols. |
| LLMSched: Uncertainty-Aware Workload Scheduling for Compound LLM Applications | Botao Zhu, Chen Chen, Xiaoyi Fan, Yifei Zhu | 2025-04-04 | Download | Developing compound Large Language Model (LLM) applications is becoming an increasingly prevalent approach to solving real-world problems. In these applications, an LLM collaborates with various exter... |
| AeroDaaS: Towards an Application Programming Framework for Drones-as-a-Service | Suman Raj, Rajdeep Singh, Kautuk Astu, Yogesh Simmhan | 2025-04-04 | Download | The increasing adoption of UAVs with advanced sensors and GPU-accelerated edge computing has enabled real-time AI-driven applications in fields such as precision agriculture, wildfire monitoring, and ... |
| PPFPL: Cross-silo Privacy-preserving Federated Prototype Learning Against Data Poisoning Attacks | Hongliang Zhang, Jiguo Yu, Fenghua Xu, Chunqiang Hu, Yongzhao Zhang, Xiaofen Wang, Zhongyuan Yu, Xiaosong Zhang | 2025-04-04 | Download | Privacy-Preserving Federated Learning (PPFL) enables multiple clients to collaboratively train models by submitting secreted model updates. Nonetheless, PPFL is vulnerable to data poisoning attacks ... |
| Extending Data Spatial Semantics for Scale Agnostic Programming | Jason Mars | 2025-04-04 | Download | We introduce extensions to Data Spatial Programming (DSP) that enable scale-agnostic programming for application development. Building on DSP's paradigm shift from data-to-compute to compute-to-data, ... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CAMINO: Cloud-native Autonomous Management and Intent-based Orchestrator | Konstantinos Antonakoglou, Ioannis Mavromatis, Saptarshi Ghosh, Mark Rouse, Konstantinos Katsaros | 2025-04-04 | Download | This paper introduces CAMINO, a Cloud-native Autonomous Management and Intent-based Orchestrator designed to address the challenges of scalable, declarative, and cloud-native service management and or... |
| Optimistic Learning for Communication Networks | George Iosifidis, Naram Mhaisen, Douglas J. Leith | 2025-04-04 | Download | AI/ML-based tools are at the forefront of resource management solutions for communication networks. Deep learning, in particular, is highly effective in facilitating fast and high-performing decision-... |
| QuinID: Enabling FDMA-Based Fully Parallel RFID with Frequency-Selective Antenna | Xin Na, Jia Zhang, Jiacheng Zhang, Xiuzhen Guo, Yang Zou, Meng Jin, Yimiao Sun, Yunhao Liu, Yuan He | 2025-04-04 | Download | Parallelizing passive Radio Frequency Identification (RFID) reading is an arguably crucial, yet unsolved challenge in modern IoT applications. |
| Leggiero: Analog WiFi Backscatter with Payload Transparency | Xin Na, Xiuzhen Guo, Zihao Yu, Jia Zhang, Yuan He, Yunhao Liu | 2025-04-04 | Download | Backscatter is an enabling technology for battery-free sensing in today's Artificial Intelligence of Things (AIOT). Building a backscatter-based sensing system, however, is a daunting task, due to two... |
| Offline and Distributional Reinforcement Learning for Wireless Communications | Eslam Eldeeb, Hirley Alves | 2025-04-04 | Download | The rapid growth of heterogeneous and massive wireless connectivity in 6G networks demands intelligent solutions to ensure scalability, reliability, privacy, ultra-low latency, and effective control. |
| Adaptive Movement Sampling Physics-Informed Residual Network (AM-PIRN) for Solving Nonlinear Option Pricing models | Qinjiao Gao, Zuowei Wang, Ran Zhang, Dongjiang Wang | 2025-04-04 | Download | In this paper, we propose the Adaptive Movement Sampling Physics-Informed Residual Network (AM-PIRN) to address challenges in solving nonlinear option pricing PDE models, where solutions often exhibit... |
| Throughput-Optimal Random Access: A Queueing-Theoretical Analysis for Learning-Based Access Design | Xinran Zhao, Lin Dai | 2025-04-04 | Download | Random access networks have long been observed to suffer from low throughput if nodes' access strategy is not properly designed. To improve the throughput performance, learning-based approaches, with ... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Extending Data Spatial Semantics for Scale Agnostic Programming | Jason Mars | 2025-04-04 | Download | We introduce extensions to Data Spatial Programming (DSP) that enable scale-agnostic programming for application development. Building on DSP's paradigm shift from data-to-compute to compute-to-data, ... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Performance Analysis of HPC applications on the Aurora Supercomputer: Exploring the Impact of HBM-Enabled Intel Xeon Max CPUs | Huda Ibeid, Vikram Narayana, Jeongnim Kim, Anthony Nguyen, Vitali Morozov, Ye Luo | 2025-04-04 | Download | The Aurora supercomputer is an exascale-class system designed to tackle some of the most demanding computational workloads. Equipped with both High Bandwidth Memory (HBM) and DDR memory, it provides u... |
| NeRFlex: Resource-aware Real-time High-quality Rendering of Complex Scenes on Mobile Devices | Zhe Wang, Yifei Zhu | 2025-04-04 | Download | Neural Radiance Fields (NeRF) is a cutting-edge neural network-based technique for novel view synthesis in 3D reconstruction. However, its significant computational demands pose challenges for deploym... |