
2025-11-25

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Tekum: Balanced Ternary Tapered Precision Real Arithmetic | Laslo Hunhold | 2025-11-25 | Download | In light of recent hardware advances, it is striking that real arithmetic in balanced ternary logic has received almost no attention in the literature. |
| Accelerating Sparse Convolutions in Voxel-Based Point Cloud Networks | Dionysios Adamopoulos, Anastasia Poulopoulou, Georgios Goumas, Christina Giannoula | 2025-11-25 | Download | Sparse Convolution (SpC) powers 3D point cloud networks widely used in autonomous driving and AR/VR. SpC builds a kernel map that stores mappings between input voxel coordinates, output coordinates, a... |
| QiMeng-CRUX: Narrowing the Gap Between Natural Language and Verilog via Core Refined Understanding eXpression for Circuit Design | Lei Huang, Rui Zhang, Jiaming Guo, Yang Zhang, Di Huang, Shuyao Cheng, Pengwei Jin, Chongxiao Li, Zidong Du, Xing Hu, Yunji Chen, Qi Guo | 2025-11-25 | Download | Large language models (LLMs) have shown promising capabilities in hardware description language (HDL) generation. However, existing approaches often rely on free-form natural language descriptions tha... |
| R3A: Reliable RTL Repair Framework with Multi-Agent Fault Localization and Stochastic Tree-of-Thoughts Patch Generation | Zizhang Luo, Fan Cui, Kexing Zhou, Runlin Guo, Mile Xia, Hongyuan Hou, Yun Liang | 2025-11-25 | Download | Repairing RTL bugs is crucial for hardware design and verification. Traditional automatic program repair (APR) methods define dedicated search spaces to locate and fix bugs with program synthesis. |
| InF-ATPG: Intelligent FFR-Driven ATPG with Advanced Circuit Representation Guided Reinforcement Learning | Bin Sun, Rengang Zhang, Zhiteng Chao, Zizhen Liu, Jianan Mu, Jing Ye, Huawei Li | 2025-11-25 | Download | Automatic test pattern generation (ATPG) is a crucial process in integrated circuit (IC) design and testing, responsible for efficiently generating test patterns. |
| Pickle Prefetcher: Programmable and Scalable Last-Level Cache Prefetcher | Hoa Nguyen, Pongstorn Maidee, Jason Lowe-Power, Alireza Kaviani | 2025-11-25 | Download | Modern high-performance architectures employ large last-level caches (LLCs). While large LLCs can reduce average memory access latency for workloads with a high degree of locality, they can also incre... |
| Mitigating hallucinations and omissions in LLMs for invertible problems: An application to hardware logic design automation | Andrew S. Cassidy, Guillaume Garreau, Jay Sivagnaname, Mike Grassi, Bernard Brezzo, John V. Arthur, Dharmendra S. Modha | 2025-11-25 | Download | We show for invertible problems that transform data from a source domain (for example, Logic Condition Tables (LCTs)) to a destination domain (for example, Hardware Description Language (HDL) code), a... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Readout-Side Bypass for Residual Hybrid Quantum-Classical Models | Guilin Zhang, Wulan Guo, Ziqi Tan, Hongyang He, Qiang Guan, Hailong Jiang | 2025-11-25 | Download | Quantum machine learning (QML) promises compact and expressive representations, but suffers from the measurement bottleneck - a narrow quantum-to-classical readout that limits performance and amplifie... |
| Accelerating Sparse Convolutions in Voxel-Based Point Cloud Networks | Dionysios Adamopoulos, Anastasia Poulopoulou, Georgios Goumas, Christina Giannoula | 2025-11-25 | Download | Sparse Convolution (SpC) powers 3D point cloud networks widely used in autonomous driving and AR/VR. SpC builds a kernel map that stores mappings between input voxel coordinates, output coordinates, a... |
| Assessing Redundancy Strategies to Improve Availability in Virtualized System Architectures | Alison Silva, Gustavo Callou | 2025-11-25 | Download | Cloud-based storage platforms are becoming more common in both academic and business settings due to their flexible access to data and support for collaborative functionalities. |
| Efficient Parallel Implementation of the Pilot Assignment Problem in Massive MIMO Systems | Eman Alqudah, Ashfaq Khokhar | 2025-11-25 | Download | The assignment of the pilot sequence is a critical challenge in massive MIMO systems, as sharing the same pilot sequence among multiple users causes interference, which degrades the accuracy of the ch... |
| Interactive Visualization of Proof-of-Work Consensus Protocol on Raspberry Pi | Anton Ivashkevich, Matija Piškorec, Claudio J. Tessone | 2025-11-25 | Download | We describe a prototype of a fully capable Ethereum Proof-of-Work (PoW) blockchain network running on multiple Raspberry Pi (RPi) computers. The prototype is easy to set up and is intended to function... |
| Energy-Efficient Federated Learning via Adaptive Encoder Freezing for MRI-to-CT Conversion: A Green AI-Guided Research | Ciro Benito Raggio, Lucia Migliorelli, Nils Skupien, Mathias Krohmer Zabaleta, Oliver Blanck, Francesco Cicone, Giuseppe Lucio Cascini, Paolo Zaffino, Maria Francesca Spadea | 2025-11-25 | Download | Federated Learning (FL) holds the potential to advance equality in health by enabling diverse institutions to collaboratively train deep learning (DL) models, even with limited data. |
| Beluga: A CXL-Based Memory Architecture for Scalable and Efficient LLM KVCache Management | Xinjun Yang, Qingda Hu, Junru Li, Feifei Li, Yicong Zhu, Yuqi Zhou, Qiuru Lin, Jian Dai, Yang Kong, Jiayu Zhang, Guoqiang Xu, Qiang Liu | 2025-11-25 | Download | The rapid increase in LLM model sizes and the growing demand for long-context inference have made memory a critical bottleneck in GPU-accelerated serving systems. |
| Parallel simulation and adaptive mesh refinement for 3D elastostatic contact mechanics problems between deformable bodies | Alexandre Epalle, Isabelle Ramière, Guillaume Latu, Frédéric Lebon | 2025-11-25 | Download | Parallel implementation of numerical adaptive mesh refinement (AMR) strategies for solving 3D elastostatic contact mechanics problems is an essential step toward complex simulations that exceed current... |
| QiMeng-Kernel: Macro-Thinking Micro-Coding Paradigm for LLM-Based High-Performance GPU Kernel Generation | Xinguo Zhu, Shaohui Peng, Jiaming Guo, Yunji Chen, Qi Guo, Yuanbo Wen, Hang Qin, Ruizhi Chen, Qirui Zhou, Ke Gao, Yanjun Wu, Chen Zhao, Ling Li | 2025-11-25 | Download | Developing high-performance GPU kernels is critical for AI and scientific computing, but remains challenging due to its reliance on expert crafting and poor portability. |
| SwitchDelta: Asynchronous Metadata Updating for Distributed Storage with In-Network Data Visibility | Junru Li, Qing Wang, Zhe Yang, Shuo Liu, Jiwu Shu, Youyou Lu | 2025-11-25 | Download | Distributed storage systems typically maintain strong consistency between data nodes and metadata nodes by adopting ordered writes: 1) first installing data; 2) then updating metadata to make data vis... |
| Stragglers Can Contribute More: Uncertainty-Aware Distillation for Asynchronous Federated Learning | Yujia Wang, Fenglong Ma, Jinghui Chen | 2025-11-25 | Download | Asynchronous federated learning (FL) has recently gained attention for its enhanced efficiency and scalability, enabling local clients to send model updates to the server at their own pace without wai... |
| ParaBlock: Communication-Computation Parallel Block Coordinate Federated Learning for Large Language Models | Yujia Wang, Yuanpu Cao, Jinghui Chen | 2025-11-25 | Download | Federated learning (FL) has been extensively studied as a privacy-preserving training paradigm. Recently, federated block coordinate descent scheme has become a popular option in training large-scale ... |
| PolarStore: High-Performance Data Compression for Large-Scale Cloud-Native Databases | Qingda Hu, Xinjun Yang, Feifei Li, Junru Li, Ya Lin, Yuqi Zhou, Yicong Zhu, Junwei Zhang, Rongbiao Xie, Ling Zhou, Bin Wu, Wenchao Zhou | 2025-11-25 | Download | In recent years, resource elasticity and cost optimization have become essential for RDBMSs. While cloud-native RDBMSs provide elastic computing resources via disaggregated computing and storage, stor... |
| Improved Linear-Time Construction of Minimal Dominating Set via Mobile Agents | Prabhat Kumar Chand, Anisur Rahaman Molla | 2025-11-25 | Download | Mobile agents have emerged as a powerful framework for solving fundamental graph problems in distributed settings in recent times. These agents, modelled as autonomous physical or software entities, p... |
| Accelerating Wireless Distributed Learning via Hybrid Split and Federated Learning Optimization | Kun Guo, Xuefei Li, Xijun Wang, Howard H. Yang, Wei Feng, Tony Q. S. Quek | 2025-11-25 | Download | Federated learning (FL) and split learning (SL) are two effective distributed learning paradigms in wireless networks, enabling collaborative model training across mobile devices without sharing raw d... |
| Batch Denoising for AIGC Service Provisioning in Wireless Edge Networks | Jinghang Xu, Kun Guo, Wei Teng, Chenxi Liu, Wei Feng | 2025-11-25 | Download | Artificial intelligence-generated content (AIGC) service provisioning in wireless edge networks involves two phases: content generation on edge servers and content transmission to mobile devices. |
| Enabling Scientific Workflow Scheduling Research in Non-Uniform Memory Access Architectures | Aurelio Vivas, Harold Castro | 2025-11-25 | Download | Data-intensive scientific workflows increasingly rely on high-performance computing (HPC) systems, complementing traditional Grid and Cloud platforms. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| RIS-Assisted Downlink Pinching-Antenna Systems: GNN-Enabled Optimization Approaches | Changpeng He, Yang Lu, Yanqing Xu, Chong-Yung Chi, Bo Ai, Arumugam Nallanathan | 2025-11-25 | Download | This paper investigates a reconfigurable intelligent surface (RIS)-assisted multi-waveguide pinching-antenna (PA) system (PASS) for multi-user downlink information transmission, motivated by the unkno... |
| POMDP-Based Routing for DTNs with Partial Knowledge and Dependent Failures | Gregory F. Stock, Alexander Haberl, Juan A. Fraire, Holger Hermanns | 2025-11-25 | Download | Routing in Delay-Tolerant Networks (DTNs) is inherently challenging due to sparse connectivity, long delays, and frequent disruptions. While Markov Decision Processes (MDPs) have been used to model un... |
| Improving the Identification of Real-world Malware's DNS Covert Channels Using Locality Sensitive Hashing | Pascal Ruffing, Denis Petrov, Sebastian Zillien, Steffen Wendzel | 2025-11-25 | Download | Nowadays, malware increasingly uses DNS-based covert channels in order to evade detection and maintain stealthy communication with its command-and-control servers. |
| Evaluations of High Power User Equipment (HPUE) in Urban Environment | Kasidis Arunruangsirilert, Pasapong Wongprasert, Jiro Katto | 2025-11-25 | Download | While Time Division Duplexing (TDD) 5G New Radio (NR) networks offers higher downlink throughput due to the utilization of the middle frequency band, the uplink performance is negatively impacted due ... |
| Performance Evaluation of Uplink 256QAM on Commercial 5G New Radio (NR) Networks | Kasidis Arunruangsirilert, Pasapong Wongprasert, Jiro Katto | 2025-11-25 | Download | While Uplink 256QAM (UL-256QAM) has been introduced since 2016 as a part of 3GPP Release 14, the adoption was quite poor as many Radio Access Network (RAN) and User Equipment (UE) vendors didn't suppo... |
| Field Test of 5G New Radio (NR) UL-MIMO and UL-256QAM for HD Live-Streaming | Kasidis Arunruangsirilert | 2025-11-25 | Download | The exponential growth of User-Generated Content (UGC), especially High-Definition (HD) live video streaming, places a significant demand on the uplink capabilities of mobile networks. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| SARA: A Stall-Aware Memory Allocation Strategy for Mixed-Criticality Systems | Meng-Chia Lee, Wen Sheng Lim, Yuan-Hao Chang, Tei-Wei Kuo | 2025-11-25 | Download | The memory capacity in edge devices is often limited due to constraints on cost, size, and power. Consequently, memory competition leads to inevitable page swapping in memory-constrained mixed-critica... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Accelerating Sparse Convolutions in Voxel-Based Point Cloud Networks | Dionysios Adamopoulos, Anastasia Poulopoulou, Georgios Goumas, Christina Giannoula | 2025-11-25 | Download | Sparse Convolution (SpC) powers 3D point cloud networks widely used in autonomous driving and AR/VR. SpC builds a kernel map that stores mappings between input voxel coordinates, output coordinates, a... |
| Reducing Latency of LLM Search Agent via Speculation-based Algorithm-System Co-Design | Zixiao Huang, Wen Zeng, Tianyu Fu, Tengxuan Liu, Yizhou Sun, Ke Hong, Xinhao Yang, Chengchun Liu, Yan Li, Quanlu Zhang, Guohao Dai, Zhenhua Zhu, Yu Wang | 2025-11-25 | Download | LLM-based search agents achieve strong performance but suffer from severe latency, as each step requires serialized LLM reasoning followed by action of tool execution. |
