
2025-04-12

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| NetTAG: A Multimodal RTL-and-Layout-Aligned Netlist Foundation Model via Text-Attributed Graph | Wenji Fang, Wenkai Li, Shang Liu, Yao Lu, Hongce Zhang, Zhiyao Xie | 2025-04-12 | Download | Circuit representation learning has shown promise in advancing Electronic Design Automation (EDA) by capturing structural and functional circuit properties for various tasks. |
| Revisiting Computational Storage for Data Integrity and Security | Chao Shi, Anthony Manschula, Tabassum Mahmud, Zeren Yang, Mai Zheng, Yong Chen, Jim Wayda, Matthew Wolf, Byungwoo Bang | 2025-04-12 | Download | The idea of computational storage device (CSD) has come a long way since at least 1990s [1], [2]. By embedding computing resources within storage devices, CSDs could potentially offload computational ... |
| Scaling up Reversible Logic with HKI Superconducting Inductors | Erik P. DeBenedictis | 2025-04-12 | Download | Researchers developed about a dozen semiconductor reversible (or adiabatic) logic chips since the early 1990s, validating circuit designs and proving the concept--but scale up required a further advan... |
| Leveraging Application-Specific Knowledge for Energy-Efficient Deep Learning Accelerators on Resource-Constrained FPGAs | Chao Qian | 2025-04-12 | Download | The growing adoption of Deep Learning (DL) applications in the Internet of Things has increased the demand for energy-efficient accelerators. Field Programmable Gate Arrays (FPGAs) offer a promising p... |
| A Case for Kolmogorov-Arnold Networks in Prefetching: Towards Low-Latency, Generalizable ML-Based Prefetchers | Dhruv Kulkarni, Bharat Bhammar, Henil Thaker, Pranav Dhobi, R. P. Gohil, Sai Manoj Pudukotai Dinkarrao | 2025-04-12 | Download | The memory wall problem arises due to the disparity between fast processors and slower memory, causing significant delays in data access, even more so on edge devices. |
| MGS: Markov Greedy Sums for Accurate Low-Bitwidth Floating-Point Accumulation | Vikas Natesh, H. T. Kung, David Kong | 2025-04-12 | Download | We offer a novel approach, MGS (Markov Greedy Sums), to improve the accuracy of low-bitwidth floating-point dot products in neural network computations. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints | Yichao Yuan, Lin Ma, Nishil Talati | 2025-04-12 | Download | Mixture of Experts (MoE) LLMs, characterized by their sparse activation patterns, offer a promising approach to scaling language models while avoiding proportionally increasing the inference cost. |
| Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training | Mingyu Liang, Hiwot Tadese Kassa, Wenyin Fu, Brian Coutinho, Louis Feng, Christina Delimitrou | 2025-04-12 | Download | Training LLMs in distributed environments presents significant challenges due to the complexity of model execution, deployment systems, and the vast space of configurable strategies. |
| DynaServe: Unified and Elastic Execution for Dynamic Disaggregated LLM Serving | Chaoyi Ruan, Yinhe Chen, Dongqi Tian, Yandong Shi, Yongji Wu, Jialin Li, Cheng Li | 2025-04-12 | Download | LLM inference must meet strict latency SLOs (e.g., 100 ms P99 time-between-tokens) while maximizing goodput. Yet, real-world variability in prompt and response lengths skews compute-intensive prefill ... |
| Revisiting Computational Storage for Data Integrity and Security | Chao Shi, Anthony Manschula, Tabassum Mahmud, Zeren Yang, Mai Zheng, Yong Chen, Jim Wayda, Matthew Wolf, Byungwoo Bang | 2025-04-12 | Download | The idea of computational storage device (CSD) has come a long way since at least 1990s [1], [2]. By embedding computing resources within storage devices, CSDs could potentially offload computational ... |
| SW-TNC : Reaching the Most Complex Random Quantum Circuit via Tensor Network Contraction | Yaojian Chen, Zhaoqi Sun, Chengyu Qiu, Zegang Li, Yanfei Liu, Lin Gan, Xiaohui Duan, Guangwen Yang | 2025-04-12 | Download | Classical simulation is essential in quantum algorithm development and quantum device verification. With the increasing complexity and diversity of quantum circuit structures, existing classical simul... |
| Parallel Seismic Data Processing Performance with Cloud-based Storage | Sasmita Mohapatra, Weiming Yang, Zhengtang Yang, Chenxiao Wang, Jinxin Ma, Gary L. Pavlis, Yinzhi Wang | 2025-04-12 | Download | This article introduces a general processing framework to effectively utilize waveform data stored on modern cloud platforms. The focus is hybrid processing schemes where a local system drives process... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A Datagram Extension to DNS over QUIC: Proven Resource Conservation in the Internet of Things | Darius Saif, Ashraf Matrawy | 2025-04-12 | Download | In this paper, we investigate the Domain Name System (DNS) over QUIC (DoQ) and propose a non-disruptive extension, which can greatly reduce DoQ's resource consumption. |
| RSLAQ -- A Robust SLA-driven 6G O-RAN QoS xApp using deep reinforcement learning | Noe M. Yungaicela-Naula, Vishal Sharma, Sandra Scott-Hayward | 2025-04-12 | Download | The evolution of 6G envisions a wide range of applications and services characterized by highly differentiated and stringent Quality of Service (QoS) requirements. |
| Empirically Measuring Data Localization in the EU | Alexander Gamero-Garrido, Kicho Yu, Sumukh Vasisht Shankar, Sachin Kumar Singh, Sindhya Balasubramanian, Alexander Wilcox, David Choffnes | 2025-04-12 | Download | EU data localization regulations limit data transfers to non-EU countries with the GDPR. However, BGP, DNS and other Internet protocols were not designed to enforce jurisdictional constraints, so impl... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Identifying Protein Co-regulatory Network Logic by Solving B-SAT Problems through Gate-based Quantum Computing | Aspen Erlandsson Brisebois, Jason Broderick, Zahed Khatooni, Heather L. Wilson, Steven Rayan, Gordon Broderick | 2025-04-12 | Download | There is growing awareness that the success of pharmacologic interventions on living organisms is significantly impacted by context and timing of exposure. |
