2025-12-17

cs.AR - Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| AIE4ML: An End-to-End Framework for Compiling Neural Networks for the Next Generation of AMD AI Engines | Dimitrios Danopoulos, Enrico Lupi, Chang Sun, Sebastian Dittmeier, Michael Kagan, Vladimir Loncar, Maurizio Pierini | 2025-12-17 | Efficient AI inference on AMD's Versal AI Engine (AIE) is challenging due to tightly coupled VLIW execution, explicit datapaths, and local memory management. |
| Low-Latency FPGA Control System for Real-Time Neural Network Processing in CCD-Based Trapped-Ion Qubit Measurement | Binglei Lou, Gautham Duddi Krishnaswaroop, Filip Wojcicki, Ruilin Wu, Richard Rademacher, Zhiqiang Que, Wayne Luk, Philip H. W. Leong | 2025-12-17 | Accurate and low-latency qubit state measurement is critical for trapped-ion quantum computing. While deep neural networks (DNNs) have been integrated to enhance detection fidelity, their latency perf... |
| A High-level Synthesis Toolchain for the Julia Language | Benedict Short, Ian McInerney, John Wickerson | 2025-12-17 | With the push towards Exascale computing and data-driven methods, problem sizes have increased dramatically, increasing the computational requirements of the underlying algorithms. |
| Workload Characterization for Branch Predictability | FNU Vikas, Paul Gratz, Daniel Jiménez | 2025-12-17 | Conditional branch prediction predicts the likely direction of a conditional branch instruction to support ILP extraction. Branch prediction is a pattern recognition problem that learns mappings betwe... |
| FAME: FPGA Acceleration of Secure Matrix Multiplication with Homomorphic Encryption | Zhihan Xu, Rajgopal Kannan, Viktor K. Prasanna | 2025-12-17 | Homomorphic Encryption (HE) enables secure computation on encrypted data, addressing privacy concerns in cloud computing. However, the high computational cost of HE operations, particularly matrix mul... |
| Implementation and Analysis of Thermometer Encoding in DWN FPGA Accelerators | Michael Mecik, Martin Kumm | 2025-12-17 | Fully parallel neural network accelerators on field-programmable gate arrays (FPGAs) offer high throughput for latency-critical applications but face hardware resource constraints. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| LOG.io: Unified Rollback Recovery and Data Lineage Capture for Distributed Data Pipelines | Eric Simon, Renato B. Hoffmann, Lucas Alf, Dalvan Griebler | 2025-12-17 | This paper introduces LOG.io, a comprehensive solution designed for correct rollback recovery and fine-grain data lineage capture in distributed data pipelines. |
| Private Virtual Tree Networks for Secure Multi-Tenant Environments Based on the VIRGO Overlay Network | Lican Huang | 2025-12-17 | Hierarchical organization is a fundamental structure in real-world society, where authority and responsibility are delegated from managers to subordinates. |
| Dynamic Rebatching for Efficient Early-Exit Inference with DREX | Xuting Liu, Daniel Alexander, Siva Kesava Reddy Kakarla, Behnaz Arzani, Vincent Liu | 2025-12-17 | Early-Exit (EE) is a Large Language Model (LLM) architecture that accelerates inference by allowing easier tokens to be generated using only a subset of the model's layers. |
| Optimizing Agentic Language Model Inference via Speculative Tool Calls | Daniel Nichols, Prajwal Singhania, Charles Jekel, Abhinav Bhatele, Harshitha Menon | 2025-12-17 | Language models (LMs) are becoming increasingly dependent on external tools. LM-based agentic frameworks frequently interact with their environment via such tools to search files, run code, call APIs,... |
| LeaseGuard: Raft Leases Done Right | A. Jesse Jiryu Davis, Murat Demirbas, Lingzhi Deng | 2025-12-17 | Raft is a leading consensus algorithm for replicating writes in distributed databases. However, distributed databases also require consistent reads. |
| Optimizing Bloom Filters for Modern GPU Architectures | Daniel Jünger, Kevin Kristensen, Yunsong Wang, Xiangyao Yu, Bertil Schmidt | 2025-12-17 | Bloom filters are a fundamental data structure for approximate membership queries, with applications ranging from data analytics to databases and genomics. |
| TL: Automatic End-to-End Compiler of Tile-Based Languages for Spatial Dataflow Architectures | Wei Li, Zhenyu Bai, Heru Wang, Pranav Dangi, Zhiqiang Zhang, Cheng Tan, Huiying Lan, Weng-Fai Wong, Tulika Mitra | 2025-12-17 | Spatial dataflow accelerators are a promising direction for next-generation computer systems because they can reduce the memory bottlenecks of traditional von Neumann machines such as CPUs and GPUs. |
| LLMQ: Efficient Lower-Precision Pretraining for Consumer GPUs | Erik Schultheis, Dan Alistarh | 2025-12-17 | We present LLMQ, an end-to-end CUDA/C++ implementation for medium-sized language-model training, e.g. 3B to 32B parameters, on affordable, commodity GPUs. |
| Reexamining Paradigms of End-to-End Data Movement | Chin Fang, Timothy Stitt, Michael J. McManus, Toshio Moriya | 2025-12-17 | The pursuit of high-performance data transfer often focuses on raw network bandwidth, where international links of 100 Gbps or higher are frequently considered the primary enabler. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Deep Reinforcement Learning for EH-Enabled Cognitive-IoT Under Jamming Attacks | Nadia Abdolkhani, Nada Abdel Khalek, Walaa Hamouda | 2025-12-17 | In the evolving landscape of the Internet of Things (IoT), integrating cognitive radio (CR) has become a practical solution to address the challenge of spectrum scarcity, leading to the development of... |
| Attention in Motion: Secure Platooning via Transformer-based Misbehavior Detection | Konstantinos Kalogiannis, Ahmed Mohamed Hussain, Hexu Li, Panos Papadimitratos | 2025-12-17 | Vehicular platooning promises transformative improvements in transportation efficiency and safety through the coordination of multi-vehicle formations enabled by Vehicle-to-Everything (V2X) communicat... |
| GenAI-enabled Residual Motion Estimation for Energy-Efficient Semantic Video Communication | Shavbo Salehi, Pedro Enrique Iturria-Rivera, Medhat Elsayed, Majid Bavand, Yigit Ozcan, Melike Erol-Kantarci | 2025-12-17 | Semantic communication addresses the limitations of the Shannon paradigm by focusing on transmitting meaning rather than exact representations, thereby reducing unnecessary resource consumption. |
| Packet-Level Traffic Modeling with Heavy-Tailed Payload and Inter-Arrival Distributions for Digital Twins | Enes Koktas, Peter Rost | 2025-12-17 | Digital twins of radio access networks require packet-level traffic generators that reproduce the size and timing of packets while remaining compact and easy to recalibrate as traffic changes. |
| DNS-based dynamic context resolution for SCHC | Antoine Bernard, Sandoche Balakrichenan, Michel Marot, Benoit Ampeau | 2025-12-17 | LPWANs are networks characterised by the scarcity of their radio resources and their limited payload size. LoRaWAN offers an open, easy-to-deploy and efficient solution to operate a long-range network... |
| More Capacity from Less Spectrum: Tapping into Optical-layer Intelligence in Optical Computing-Communication Integrated Network | Dao Thanh Hai, Shuo Li, Isaac Woungang | 2025-12-17 | Driven by massive investments and consequently significant progresses in optical computing and all-optical signal processing technologies lately, this paper presents a new architectural paradigm for n... |
| UAV-enabled Computing Power Networks: Task Completion Probability Analysis | Yiqin Deng, Zhengru Fang, Senkang Hu, Yanan Ma, Haixia Zhang, Yuguang Fang | 2025-12-17 | This paper presents an innovative framework that synergistically enhances computing performance through ubiquitous computing power distribution and dynamic computing node accessibility control via ada... |
| Deep Reinforcement Learning for Joint Time and Power Management in SWIPT-EH CIoT | Nadia Abdolkhani, Nada Abdel Khalek, Walaa Hamouda, Iyad Dayoub | 2025-12-17 | This letter presents a novel deep reinforcement learning (DRL) approach for joint time allocation and power control in a cognitive Internet of Things (CIoT) system with simultaneous wireless informati... |
| Agentic AI for Integrated Sensing and Communication: Analysis, Framework, and Case Study | Wenwen Xie, Geng Sun, Ruichen Zhang, Xuejie Liu, Yinqiu Liu, Jiacheng Wang, Dusit Niyato, Ping Zhang | 2025-12-17 | Integrated sensing and communication (ISAC) has emerged as a key development direction in the sixth-generation (6G) era, which provides essential support for the collaborative sensing and communicatio... |
| Reexamining Paradigms of End-to-End Data Movement | Chin Fang, Timothy Stitt, Michael J. McManus, Toshio Moriya | 2025-12-17 | The pursuit of high-performance data transfer often focuses on raw network bandwidth, where international links of 100 Gbps or higher are frequently considered the primary enabler. |

cs.OS - Operating Systems

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Reexamining Paradigms of End-to-End Data Movement | Chin Fang, Timothy Stitt, Michael J. McManus, Toshio Moriya | 2025-12-17 | The pursuit of high-performance data transfer often focuses on raw network bandwidth, where international links of 100 Gbps or higher are frequently considered the primary enabler. |

cs.PF - Performance

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Optimizing Agentic Language Model Inference via Speculative Tool Calls | Daniel Nichols, Prajwal Singhania, Charles Jekel, Abhinav Bhatele, Harshitha Menon | 2025-12-17 | Language models (LMs) are becoming increasingly dependent on external tools. LM-based agentic frameworks frequently interact with their environment via such tools to search files, run code, call APIs,... |
| Reexamining Paradigms of End-to-End Data Movement | Chin Fang, Timothy Stitt, Michael J. McManus, Toshio Moriya | 2025-12-17 | The pursuit of high-performance data transfer often focuses on raw network bandwidth, where international links of 100 Gbps or higher are frequently considered the primary enabler. |
