
2026-03-11

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Synthesis-in-the-Loop Evaluation of LLMs for RTL Generation: Quality, Reliability, and Failure Modes | Weimin Fu, Zeng Wang, Minghao Shao, Ramesh Karri, Muhammad Shafique, Johann Knechtel, Ozgur Sinanoglu, Xiaolong Guo | 2026-03-11 | Download | RTL generation demands more than software code synthesis: designs must be syntactically valid, synthesizable, functionally correct, and hardware-efficient. |
| Reference Architecture of a Quantum-Centric Supercomputer | Seetharami Seelam, Jerry M. Chow, Antonio Córcoles, Sarah Sheldon, Tushar Mittal, Abhinav Kandala, Sean Dague, Ian Hincks, Hiroshi Horii, Blake Johnson, Michael Le, Hani Jamjoom, Jay M. Gambetta | 2026-03-11 | Download | Quantum computers have demonstrated utility in simulating quantum systems beyond brute-force classical approaches. As the community builds on these demonstrations to explore using quantum computing fo... |
| An FPGA Implementation of Displacement Vector Search for Intra Pattern Copy in JPEG XS | Qiyue Chen, Yao Li, Jie Tao, Song Chen, Li Li, Dong Liu | 2026-03-11 | Download | Recently, progress has been made on the Intra Pattern Copy (IPC) tool for JPEG XS, an image compression standard designed for low-latency and low-complexity coding. |
| In-Memory ADC-Based Nonlinear Activation Quantization for Efficient In-Memory Computing | Shuai Dong, Junyi Yang, Biyan Zhou, Hongyang Shang, Gourav Datta, Arindam Basu | 2026-03-11 | Download | In deep networks, operations such as ReLU and hardware-driven clamping often cause activations to accumulate near the edges of the distribution, leading to biased clustering and suboptimal quantizatio... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Reference Architecture of a Quantum-Centric Supercomputer | Seetharami Seelam, Jerry M. Chow, Antonio Córcoles, Sarah Sheldon, Tushar Mittal, Abhinav Kandala, Sean Dague, Ian Hincks, Hiroshi Horii, Blake Johnson, Michael Le, Hani Jamjoom, Jay M. Gambetta | 2026-03-11 | Download | Quantum computers have demonstrated utility in simulating quantum systems beyond brute-force classical approaches. As the community builds on these demonstrations to explore using quantum computing fo... |
| Data Augmentation and Convolutional Network Architecture Influence on Distributed Learning | Victor Forattini Jansen, Emanuel Teixeira Martins, Yasmin Souza Lima, Flavio de Oliveira Silva, Rodrigo Moreira, Larissa Ferreira Rodrigues Moreira | 2026-03-11 | Download | Convolutional Neural Networks (CNNs) have proven to be highly effective in solving a broad spectrum of computer vision tasks, such as classification, identification, and segmentation. |
| Topological Analysis for Identifying Anomalies in Serverless Platforms | Gianluca Reali, Mauro Femminella | 2026-03-11 | Download | The information flows in serverless platforms are complex and non-conservative. This is a direct result of how independently deployed functions interact under the platform coarse-grained control mecha... |
| Aceso: Carbon-Aware and Cost-Effective Microservice Placement for Small and Medium-sized Enterprises | Georgia Christofidi, Francisco Álvarez-Terribas, Ioannis Roumpos, Nicolas Kourtellis, Jesus Omaña Iglesias, Thaleia Dimitra Doudali | 2026-03-11 | Download | Microservices are a dominant architecture in cloud computing, offering scalability and modularity, but also posing complex deployment challenges. |
| CacheSolidarity: Preventing Prefix Caching Side Channels in Multi-tenant LLM Serving Systems | Panagiotis Georgios Pennas, Konstantinos Papaioannou, Marco Guarnieri, Thaleia Dimitra Doudali | 2026-03-11 | Download | Large Language Models (LLMs) rely on optimizations like Automatic Prefix Caching (APC) to accelerate inference. APC works by reusing previously computed states for the beginning part of a request (pre... |
| Double-Precision Matrix Multiplication Emulation via Ozaki-II Scheme with FP8 Quantization | Yuki Uchino, Katsuhisa Ozaki, Toshiyuki Imamura | 2026-03-11 | Download | In this paper, we propose a method for emulating double-precision general matrix-matrix multiplication (DGEMM), a fundamental and performance-critical kernel in many high-performance computing applic... |
| Thousand-GPU Large-Scale Training and Optimization Recipe for AI-Native Cloud Embodied Intelligence Infrastructure | Yongjian Guo, Yunxuan Ma, Haoran Sun, Zhong Guan, Shuai Di, Jing Long, Wanting Xu, Xiaodong Bai, Wen Huang, Yucheng Guo, Chen Zhou, Qiming Yang, Mingxi Luo, Tianyun Zhao, Hedan Yang, Song Wang, Xiaomeng Tian, Xiaolong Xiang, Zhen Sun, Yu Wei, Luqiao Wang, Yuzhen Li, Chenfeng Gu, Junwu Xiong, Yicheng Gong | 2026-03-11 | Download | Embodied intelligence is a key step towards Artificial General Intelligence (AGI), yet its development faces multiple challenges including data, frameworks, infrastructure, and evaluation systems. |
| CD-Raft: Reducing the Latency of Distributed Consensus in Cross-Domain Sites | Yangyang Wang, Ziqian Cheng, Yucong Dong, Zichen Xu | 2026-03-11 | Download | Today's massive AI computation loads push heavy data synchronization across sites, i.e., nodes in data centers. Any reduction in such consensus latency can significantly improve the overall performanc... |
| Estimating the condition number of Chebyshev filtered vectors with application to the ChASE library | Edoardo Di Napoli, Xinzhe Wu | 2026-03-11 | Download | Chebyshev filtered subspace iteration is a well-known algorithm for the solution of (symmetric/Hermitian) algebraic eigenproblems which has been implemented in several application codes~\cite{Kronik:2... |
| COHORT: Hybrid RL for Collaborative Large DNN Inference on Multi-Robot Systems Under Real-Time Constraints | Mohammad Saeid Anwar, Anuradha Ravi, Indrajeet Ghosh, Gaurav Shinde, Carl Busart, Nirmalya Roy | 2026-03-11 | Download | Large deep neural networks (DNNs), especially transformer-based and multimodal architectures, are computationally demanding and challenging to deploy on resource-constrained edge platforms like field ... |
| SimulCost: A Cost-Aware Benchmark and Toolkit for Automating Physics Simulations with LLMs | Yadi Cao, Sicheng Lai, Jiahe Huang, Yang Zhang, Zach Lawrence, Rohan Bhakta, Izzy F. Thomas, Mingyun Cao, Chung-Hao Tsai, Zihao Zhou, Yidong Zhao, Hao Liu, Alessandro Marinoni, Alexey Arefiev, Rose Yu | 2026-03-11 | Download | Evaluating LLM agents for scientific tasks has focused on token costs while ignoring tool-use costs like simulation time and experimental resources. |
| S-HPLB: Efficient LLM Attention Serving via Sparsity-Aware Head Parallelism Load Balance | Di Liu, Yifei Liu, Chen Chen, Zhibin Yu, Xiaoyi Fan, Quan Chen, Minyi Guo | 2026-03-11 | Download | With the increasing volumes of Large Language Models (LLMs) and the expanding context lengths, attention computation has become a key performance bottleneck in LLM serving. |
| AgentServe: Algorithm-System Co-Design for Efficient Agentic AI Serving on a Consumer-Grade GPU | Yuning Zhang, Yan Yan, Nan Yang, Dong Yuan | 2026-03-11 | Download | Large language models (LLMs) are increasingly deployed as AI agents that operate in short reasoning-action loops, interleaving model computation with external calls. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Measurement-Driven O-RAN Diagnostics with Tail Latency and Scheduler Indicators | Theofanis P. Raptis, Weronika Maria Bachan, Roberto Verdone | 2026-03-11 | Download | We investigate cross-layer performance diagnostics for an O-RAN instance by jointly analyzing application-level latency and radio-layer behavior from a real measurement campaign. |
| Expressive Boundedness of Authoritative DNS Response Selection | Chris Bertinato | 2026-03-11 | Download | Authoritative Domain Name System (DNS) response selection defines query-time response selection based on resolver-visible context and per-answer metadata, yielding different observable outcomes for th... |
| Topological Analysis for Identifying Anomalies in Serverless Platforms | Gianluca Reali, Mauro Femminella | 2026-03-11 | Download | The information flows in serverless platforms are complex and non-conservative. This is a direct result of how independently deployed functions interact under the platform coarse-grained control mecha... |
| Feasibility of satellite-augmented global quantum repeater networks | Manik Dawar, Clement Paillet, Nilesh Vyas, Andrew Thain, Rodrigo Henriques Guilherme, Ralf Riedinger | 2026-03-11 | Download | A large scale quantum network requires the distribution of high-fidelity end-to-end entanglement. To overcome the range limitations inherent to terrestrial fiber, a leading architecture has emerged: s... |
| Initialization and Rate-Quality Functions for Generative Network Layer Protocols | Mathias Thorsager, Israel Leyva-Mayorga, Petar Popovski | 2026-03-11 | Download | Generative AI (GenAI) creates full content based on compact prompts. While GenAI has been used for applications where the generated content is returned to the prompt sender, it can play a vital role i... |
| Towards Intelligent Spectrum Management: Spectrum Demand Estimation Using Graph Neural Networks | Mohamad Alkadamani, Amir Ghasemi, Halim Yanikomeroglu | 2026-03-11 | Download | The growing demand for wireless connectivity, combined with limited spectrum resources, calls for more efficient spectrum management. Spectrum sharing is a promising approach; however, regulators need... |
| Q-StaR: A Quasi-Static Routing Scheme for NoCs | Yang Zhang, Yiren Zhao, Xu Wang, Fengyuan Ren | 2026-03-11 | Download | In networks-on-chip, static routing schemes are favored for their simplicity and predictability, but they cannot effectively balance network load due to the unawareness of runtime load distribution. |
| Adaptive RAN Slicing Control via Reward-Free Self-Finetuning Agents | Yuanhao Li, Haozhe Wang, Geyong Min, Nektarios Georgalas, Wang Miao | 2026-03-11 | Download | The integration of Generative AI models into AI-native network systems offers a transformative path toward achieving autonomous and adaptive control. |
| A Secure Splitting and Acceleration Strategy for TCP/QUIC in Interplanetary Networks | Jianhao Yu, Ye Li, Qingfang Jiang, Shuai Liu, Wenfeng Li, Kanglian Zhao | 2026-03-11 | Download | Interplanetary networks (IPNs) present unique challenges such as extreme delay, high loss, and frequent disruptions that severely degrade the performance of conventional transport protocols like Trans... |
| Spyglass: Directional Spectrum Sensing with Single-shot AoA Estimation and Virtual Arrays | Raghav Subbaraman, Akshit Agarwal, Wenhao Chen, Dinesh Bharadia | 2026-03-11 | Download | In this paper, we introduce Spyglass, a spectrum sensor designed to address the challenges of effective spectrum usage in dense wireless environments. |
| Taming Vision Priors for Data Efficient mmWave Channel Modeling | Zhenlin An, Longfei Shangguan, John Kaewell, Philip Pietraski, Jelena Senic, Camillo Gentile, Nada Golmie, Kyle Jamieson | 2026-03-11 | Download | Accurately modeling millimeter-wave (mmWave) propagation is essential for real-time AR and autonomous systems. Differentiable ray tracing offers a physics-grounded solution but still facing deployment... |
| Utility Function is All You Need: LLM-based Congestion Control | Neta Rozen-Schiff, Liron Schiff, Stefan Schmid | 2026-03-11 | Download | Congestion is a critical and challenging problem in communication networks. Congestion control protocols allow network applications to tune their sending rate in a way that optimizes their performance... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Improving LLM Performance Through Black-Box Online Tuning: A Case for Adding System Specs to Factsheets for Trusted AI | Yonas Atinafu, Henry Lin, Robin Cohen | 2026-03-11 | Download | In this paper, we present a novel black-box online controller that uses only end-to-end measurements over short segments, without internal instrumentation, and hill climbing to maximize goodput, defin... |
| RAGPerf: An End-to-End Benchmarking Framework for Retrieval-Augmented Generation Systems | Shaobo Li, Yirui Zhou, Yuan Xu, Kevin Chen, Daniel Waddington, Swaminathan Sundararaman, Hubertus Franke, Jian Huang | 2026-03-11 | Download | We present the design and implementation of a RAG-based AI system benchmarking (RAGPerf) framework for characterizing the system behaviors of RAG pipelines. |
