2025-08-26

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| GENIE-ASI: Generative Instruction and Executable Code for Analog Subcircuit Identification | Phuoc Pham, Arun Venkitaraman, Chia-Yu Hsieh, Andrea Bonetti, Stefan Uhlich, Markus Leibl, Simon Hofmann, Eisaku Ohbuchi, Lorenzo Servadei, Ulf Schlichtmann, Robert Wille | 2025-08-26 | Download | Analog subcircuit identification is a core task in analog design, essential for simulation, sizing, and layout. Traditional methods often require extensive human expertise, rule-based encoding, or lar... |
| Architecting Distributed Quantum Computers: Design Insights from Resource Estimation | Dmitry Filippov, Peter Yang, Prakash Murali | 2025-08-26 | Download | To enable practically useful quantum computing, we require hundreds to thousands of logical qubits (collections of physical qubits with error correction). |
| Building an Open CGRA Ecosystem for Agile Innovation | Rohan Juneja, Pranav Dangi, Thilini Kaushalya Bandara, Zhaoying Li, Dhananjaya Wijerathne, Li-Shiuan Peh, Tulika Mitra | 2025-08-26 | Download | Modern computing workloads, particularly in AI and edge applications, demand hardware-software co-design to meet aggressive performance and energy targets. |
| APT-LLM: Exploiting Arbitrary-Precision Tensor Core Computing for LLM Acceleration | Shaobo Ma, Chao Fang, Haikuo Shao, Zhongfeng Wang | 2025-08-26 | Download | Large language models (LLMs) have revolutionized AI applications, yet their enormous computational demands severely limit deployment and real-time performance. |
| TaiBai: A fully programmable brain-inspired processor with topology-aware efficiency | Qianpeng Li, Yu Song, Xin Liu, Wenna Song, Boshi Zhao, Zhichao Wang, Aoxin Chen, Tielin Zhang, Liang Chen | 2025-08-26 | Download | Brain-inspired computing has emerged as a promising paradigm to overcome the energy-efficiency limitations of conventional intelligent systems by emulating the brain's partitioned architecture and eve... |
| SeDA: Secure and Efficient DNN Accelerators with Hardware/Software Synergy | Wei Xuan, Zhongrui Wang, Lang Feng, Ning Lin, Zihao Xuan, Rongliang Fu, Tsung-Yi Ho, Yuzhong Jiao, Luhong Liang | 2025-08-26 | Download | Ensuring the confidentiality and integrity of DNN accelerators is paramount across various scenarios spanning autonomous driving, healthcare, and finance. |
| Beyond Tokens: Enhancing RTL Quality Estimation via Structural Graph Learning | Yi Liu, Hongji Zhang, Yiwen Wang, Dimitris Tsaras, Lei Chen, Mingxuan Yuan, Qiang Xu | 2025-08-26 | Download | Estimating the quality of register transfer level (RTL) designs is crucial in the electronic design automation (EDA) workflow, as it enables instant feedback on key metrics like area and delay without... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Formal Modeling and Verification of the Algorand Consensus Protocol in CADP | Andrea Esposito, Francesco P. Rossi, Marco Bernardo, Francesco Fabris, Hubert Garavel | 2025-08-26 | Download | Algorand is a scalable and secure permissionless blockchain that achieves proof-of-stake consensus via cryptographic self-sortition and binary Byzantine agreement. |
| HAP: Hybrid Adaptive Parallelism for Efficient Mixture-of-Experts Inference | Haoran Lin, Xianzhi Yu, Kang Zhao, Han Bao, Zongyuan Zhan, Ting Hu, Wulong Liu, Zekun Yin, Xin Li, Weiguo Liu | 2025-08-26 | Download | Current inference systems for Mixture-of-Experts (MoE) models primarily employ static parallelization strategies. However, these static approaches cannot consistently achieve optimal performance acros... |
| Architecting Distributed Quantum Computers: Design Insights from Resource Estimation | Dmitry Filippov, Peter Yang, Prakash Murali | 2025-08-26 | Download | To enable practically useful quantum computing, we require hundreds to thousands of logical qubits (collections of physical qubits with error correction). |
| Ab-initio Quantum Transport with the GW Approximation, 42,240 Atoms, and Sustained Exascale Performance | Nicolas Vetsch, Alexander Maeder, Vincent Maillou, Anders Winka, Jiang Cao, Grzegorz Kwasniewski, Leonard Deuschle, Torsten Hoefler, Alexandros Nikolaos Ziogas, Mathieu Luisier | 2025-08-26 | Download | Designing nanoscale electronic devices such as the currently manufactured nanoribbon field-effect transistors (NRFETs) requires advanced modeling tools capturing all relevant quantum mechanical effect... |
| Federated Fine-Tuning of Sparsely-Activated Large Language Models on Resource-Constrained Devices | Fahao Chen, Jie Wan, Peng Li, Zhou Su, Dongxiao Yu | 2025-08-26 | Download | Federated fine-tuning of Mixture-of-Experts (MoE)-based large language models (LLMs) is challenging due to their massive computational requirements and the resource constraints of participants. |
| CARMA: Collocation-Aware Resource Manager | Ehsan Yousefzadeh-Asl-Miandoab, Florina M. Ciorba, Pınar Tözün | 2025-08-26 | Download | GPUs running deep learning (DL) workloads are frequently underutilized. Collocating multiple DL training tasks on the same GPU can improve utilization but introduces two key risks: (1) out-of-memory (... |
| Dual-Distilled Heterogeneous Federated Learning with Adaptive Margins for Trainable Global Prototypes | Fatema Siddika, Md Anwar Hossen, Wensheng Zhang, Anuj Sharma, Juan Pablo Muñoz, Ali Jannesari | 2025-08-26 | Download | Heterogeneous Federated Learning (HFL) has gained significant attention for its capacity to handle both model and data heterogeneity across clients. |
| Deep Learning-Enabled Supercritical Flame Simulation at Detailed Chemistry and Real-Fluid Accuracy Towards Trillion-Cell Scale | Zhuoqiang Guo, Runze Mao, Lijun Liu, Guangming Tan, Weile Jia, Zhi X. Chen | 2025-08-26 | Download | For decades, supercritical flame simulations incorporating detailed chemistry and real-fluid transport have been limited to millions of cells, constraining the resolved spatial and temporal scales of ... |
| SIREN: Software Identification and Recognition in HPC Systems | Thomas Jakobsche, Fredrik Robertsén, Jessica R. Jones, Utz-Uwe Haus, Florina M. Ciorba | 2025-08-26 | Download | HPC systems use monitoring and operational data analytics to ensure efficiency, performance, and orderly operations. Application-specific insights are crucial for analyzing the increasing complexity a... |
| ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive | Xinhao Luo, Zihan Liu, Yangjie Zhou, Shihan Fang, Ziyu Huang, Yu Feng, Chen Zhang, Shixuan Sun, Zhenzhe Zheng, Jingwen Leng, Minyi Guo | 2025-08-26 | Download | Large language model (LLM) decoding suffers from high latency due to fragmented execution across operators and heavy reliance on off-chip memory for data exchange and reduction. |
| Examining MPI and its Extensions for Asynchronous Multithreaded Communication | Jiakun Yan, Marc Snir, Yanfei Guo | 2025-08-26 | Download | The increasing complexity of HPC architectures and the growing adoption of irregular scientific algorithms demand efficient support for asynchronous, multithreaded communication. |
| History Rhymes: Accelerating LLM Reinforcement Learning with RhymeRL | Jingkai He, Tianjian Li, Erhu Feng, Dong Du, Qian Liu, Tao Liu, Yubin Xia, Haibo Chen | 2025-08-26 | Download | With the rapid advancement of large language models (LLMs), reinforcement learning (RL) has emerged as a pivotal methodology for enhancing the reasoning capabilities of LLMs. |
| Strata: Hierarchical Context Caching for Long Context Language Model Serving | Zhiqiang Xie, Ziyi Xu, Mark Zhao, Yuwei An, Vikram Sharma Mailthody, Scott Mahlke, Michael Garland, Christos Kozyrakis | 2025-08-26 | Download | Large Language Models (LLMs) with expanding context windows face significant performance hurdles. While caching key-value (KV) states is critical for avoiding redundant computation, the storage footpr... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Connectivity Analysis of LoRaWAN-Based Non-Terrestrial Networks for Subterranean mMTC | Kaiqiang Lin, Mohamed-Slim Alouini | 2025-08-26 | Download | Wireless underground sensor networks (WUSNs) offer significant social and economic benefits by enabling the monitoring of subterranean entities. |
| A Theory of Goal-Oriented Medium Access: Protocol Design and Distributed Bandit Learning | Federico Chiariotti, Andrea Zanella | 2025-08-26 | Download | The Goal-oriented Communication (GoC) paradigm breaks the separation between communication and the content of the data, tailoring communication decisions to the specific needs of the receiver and targ... |
| Sharing is Caring: Analysis of Hybrid Network Sharing Strategies for Energy Efficient Multi-Operator Cellular Systems | Laura Finarelli, Maoquan Ni, Michela Meo, Falko Dressler, Gianluca Rizzo | 2025-08-26 | Download | This paper introduces a novel analytical framework for evaluating energy-efficient, QoS-aware network-sharing strategies in cellular networks. |
| OrbCC: High-Throughput and Low-Latency Data Transport for LEO Satellite Networks | Aiden Valentine, Ian Wakeman, George Parisis | 2025-08-26 | Download | The highly dynamic nature of Low-Earth Orbit (LEO) satellite networks introduces challenges that existing transport protocols fail to address, including non-congestive latency variation and loss, tran... |
| Adaptive 6G Networks-in-Network Management for Industrial Applications | Daniel Lindenschmitt, Paul Seehofer, Marius Schmitz, Jan Mertes, Roland Bless, Martina Zitterbart, Jan C. Aurich, Hans D. Schotten | 2025-08-26 | Download | This paper presents the application of Dynamic Spectrum Management (DSM) for future 6G industrial networks, establishing an efficient controller for the Networks-in-Network (NiN) concept. |
| Combining Static and Dynamic Traffic with Delay Guarantees in Time-Sensitive Networking | Lisa Maile, Kai-Steffen Hielscher, Reinhard German | 2025-08-26 | Download | To support reliable and low-latency communication, Time-Sensitive Networking introduced protocols and interfaces for resource allocation in Ethernet. |
| Saving Energy with Relaxed Latency Constraints: A Study on Data Compression and Communication | Pietro Talli, Anup Mishra, Federico Chiariotti, Israel Leyva-Mayorga, Andrea Zanella, Petar Popovski | 2025-08-26 | Download | With the advent of edge computing, data generated by end devices can be pre-processed before transmission, possibly saving transmission time and energy. |
| Network Calculus Results for TSN: An Introduction | Lisa Maile, Kai-Steffen Hielscher, Reinhard German | 2025-08-26 | Download | Time-Sensitive Networking (TSN) is a set of standards that enables the industry to provide real-time guarantees for time-critical communications with Ethernet hardware. |
| A Survey on Cloud-Edge-Terminal Collaborative Intelligence in AIoT Networks | Jiaqi Wu, Jing Liu, Yang Liu, Lixu Wang, Zehua Wang, Wei Chen, Zijian Tian, Richard Yu, Victor C. M. Leung | 2025-08-26 | Download | The proliferation of Internet of things (IoT) devices in smart cities, transportation, healthcare, and industrial applications, coupled with the explosive growth of AI-driven services, has increased d... |
| Toward Edge General Intelligence with Agentic AI and Agentification: Concepts, Technologies, and Future Directions | Ruichen Zhang, Guangyuan Liu, Yinqiu Liu, Changyuan Zhao, Jiacheng Wang, Yunting Xu, Dusit Niyato, Jiawen Kang, Yonghui Li, Shiwen Mao, Sumei Sun, Xuemin Shen, Dong In Kim | 2025-08-26 | Download | The rapid expansion of sixth-generation (6G) wireless networks and the Internet of Things (IoT) has catalyzed the evolution from centralized cloud intelligence towards decentralized edge general intel... |
| Dynamic Trajectory Optimization and Power Control for Hierarchical UAV Swarms in 6G Aerial Access Network | Ziye Jia, Jia He, Lijun He, Min Sheng, Junyu Liu, Qihui Wu, Zhu Han | 2025-08-26 | Download | Unmanned aerial vehicles (UAVs) can serve as aerial base stations (BSs) to extend the ubiquitous connectivity for ground users (GUs) in the sixth-generation (6G) era. |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| UrgenGo: Urgency-Aware Transparent GPU Kernel Launching for Autonomous Driving | Hanqi Zhu, Wuyang Zhang, Xinran Zhang, Ziyang Tao, Xinrui Lin, Yu Zhang, Jianmin Ji, Yanyong Zhang | 2025-08-26 | Download | The rapid advancements in autonomous driving have introduced increasingly complex, real-time GPU-bound tasks critical for reliable vehicle operation. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Robust Recursive Query Parallelism in Graph Database Management Systems | Anurag Chakraborty, Semih Salihoğlu | 2025-08-26 | Download | Efficient multi-core parallel processing of recursive join queries is critical for achieving good performance in graph database management systems (GDBMSs). Prior work adopts two broad approaches. |
| Exact Persistent Stochastic Non-Interference | Carla Piazza, Riccardo Romanello, Sabina Rossi | 2025-08-26 | Download | Persistent Stochastic Non-Interference (PSNI) was introduced to capture a quantitative security property in stochastic process algebras, ensuring that a high-level process does not influence the obser... |
| CARMA: Collocation-Aware Resource Manager | Ehsan Yousefzadeh-Asl-Miandoab, Florina M. Ciorba, Pınar Tözün | 2025-08-26 | Download | GPUs running deep learning (DL) workloads are frequently underutilized. Collocating multiple DL training tasks on the same GPU can improve utilization but introduces two key risks: (1) out-of-memory (... |