2025-09-30

cs.AR - Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| A Compact, Low Power Transprecision ALU for Smart Edge Devices | Ayushi Dube, Gian Singh, Sarma Vrudhula | 2025-09-30 | Transprecision computing (TC) is a promising approach for energy-efficient machine learning (ML) computation on resource-constrained platforms. |
| Rearchitecting Datacenter Lifecycle for AI: A TCO-Driven Framework | Jovan Stojkovic, Chaojie Zhang, Íñigo Goiri, Ricardo Bianchini | 2025-09-30 | The rapid rise of large language models (LLMs) has been driving an enormous demand for AI inference infrastructure, mainly powered by high-end GPUs. |
| Stab-QRAM: An All-Clifford Quantum Random Access Memory for Special Data | Guangyi Li, Yu Gan, Zeguan Wu, Xueyue Zhang, Zheshen Zhang, Junyu Liu | 2025-09-30 | Quantum random access memories (QRAMs) are pivotal for data-intensive quantum algorithms, but existing general-purpose and domain-specific architectures are hampered by a critical bottleneck: a heavy ... |
| Robust NbN on Si-SiGe hybrid superconducting-semiconducting microwave quantum circuit | Paniz Foshat, Samane Kalhor, Shima Poorgholam-khanjari, Douglas Paul, Martin Weides, Kaveh Delfanazari | 2025-09-30 | Advancing large-scale quantum computing requires superconducting circuits that combine long coherence times with compatibility with semiconductor technology. |
| TrackCore-F: Deploying Transformer-Based Subatomic Particle Tracking on FPGAs | Arjan Blankestijn, Uraz Odyurt, Amirreza Yousefzadeh | 2025-09-30 | The Transformer Machine Learning (ML) architecture has been gaining considerable momentum in recent years. In particular, computational High-Energy Physics tasks such as jet tagging and particle track... |
| Benchmarking Deep Learning Convolutions on Energy-constrained CPUs | Enrique Galvez, Adrien Cassagne, Alix Munier, Manuel Bouyer | 2025-09-30 | This work evaluates State-of-the-Art convolution algorithms for CPU-based CNN inference. Although most prior studies focus on GPUs or NPUs, CPU implementations remain comparatively under-optimized. |
| Runtime Energy Monitoring for RISC-V Soft-Cores | Alberto Scionti, Paolo Savio, Francesco Lubrano, Olivier Terzo, Marco Ferretti, Florin Apopei, Juri Bellucci, Ennio Spano, Luca Carriere | 2025-09-30 | Energy efficiency is one of the major concerns in designing advanced computing infrastructures. From single nodes to large-scale systems (data centers), monitoring the energy consumption of the computi... |
| Enabling Time-Aware Priority Traffic Management over Distributed FPGA Nodes | Alberto Scionti, Paolo Savio, Francesco Lubrano, Federico Stirano, Antonino Nespola, Olivier Terzo, Corrado De Sio, Luca Sterpone | 2025-09-30 | Network Interface Cards (NICs) greatly evolved from simple basic devices moving traffic in and out of the network to complex heterogeneous systems offloading host CPUs from performing complex tasks on... |
| MUSS-TI: Multi-level Shuttle Scheduling for Large-Scale Entanglement Module Linked Trapped-Ion | Xian Wu, Chenghong Zhu, Jingbo Wang, Xin Wang | 2025-09-30 | Trapped-ion computing is a leading architecture in the pursuit of scalable and high fidelity quantum systems. Modular quantum architectures based on photonic interconnects offer a promising path for s... |
| CIMNAS: A Joint Framework for Compute-In-Memory-Aware Neural Architecture Search | Olga Krestinskaya, Mohammed E. Fouda, Ahmed Eltawil, Khaled N. Salama | 2025-09-30 | To maximize hardware efficiency and performance accuracy in Compute-In-Memory (CIM)-based neural network accelerators for Artificial Intelligence (AI) applications, co-optimizing both software and har... |
| SAIL: SRAM-Accelerated LLM Inference System with Lookup-Table-based GEMV | Jingyao Zhang, Jaewoo Park, Jongeun Lee, Elaheh Sadredini | 2025-09-30 | Large Language Model (LLM) inference requires substantial computational resources, yet CPU-based inference remains essential for democratizing AI due to the widespread availability of CPUs compared to... |
| LLM-Powered Code Analysis and Optimization for Gaussian Splatting Kernels | Yi Hu, Huiyang Zhou | 2025-09-30 | 3D Gaussian splatting (3DGS) is a transformative technique with profound implications on novel view synthesis and real-time rendering. Given its importance, there have been many attempts to improve it... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| BlockSDN-VC: A SDN-Based Virtual Coordinate-Enhanced Transaction Broadcast Framework for High-Performance Blockchains | Wenyang Jia, Jingjing Wang, Kai Lei | 2025-09-30 | Modern blockchains need fast, reliable propagation to balance security and throughput. Virtual-coordinate methods speed dissemination but rely on slow iterative updates, leaving nodes out of sync. |
| Artificial Intelligence for Cost-Aware Resource Prediction in Big Data Pipelines | Harshit Goyal | 2025-09-30 | Efficient resource allocation is a key challenge in modern cloud computing. Over-provisioning leads to unnecessary costs, while under-provisioning risks performance degradation and SLA violations. |
| FlowMoE: A Scalable Pipeline Scheduling Framework for Distributed Mixture-of-Experts Training | Yunqi Gao, Bing Hu, Mahdi Boloursaz Mashhadi, A-Long Jin, Yanfeng Zhang, Pei Xiao, Rahim Tafazolli, Merouane Debbah | 2025-09-30 | The parameter size of modern large language models (LLMs) can be scaled up via the sparsely-activated Mixture-of-Experts (MoE) technique to avoid excessive increase of the computational costs. |
| LoRAFusion: Efficient LoRA Fine-Tuning for LLMs | Zhanda Zhu, Qidong Su, Yaoyao Ding, Kevin Song, Shang Wang, Gennady Pekhimenko | 2025-09-30 | Low-Rank Adaptation (LoRA) has become the leading Parameter-Efficient Fine-Tuning (PEFT) method for Large Language Models (LLMs), as it significantly reduces GPU memory usage while maintaining competi... |
| Lattica: A Decentralized Cross-NAT Communication Framework for Scalable AI Inference and Training | Ween Yang, Jason Liu, Suli Wang, Xinyuan Song, Lynn Ai, Eric Yang, Bill Shi | 2025-09-30 | The rapid expansion of distributed Artificial Intelligence (AI) workloads beyond centralized data centers creates a demand for new communication substrates. |
| TASP: Topology-aware Sequence Parallelism | Yida Wang, Ke Hong, Xiuhong Li, Yuanchao Xu, Wenxun Wang, Guohao Dai, Yu Wang | 2025-09-30 | Long-context large language models (LLMs) face constraints due to the quadratic complexity of the self-attention mechanism. The mainstream sequence parallelism (SP) method, Ring Attention, attempts to... |
| Rearchitecting Datacenter Lifecycle for AI: A TCO-Driven Framework | Jovan Stojkovic, Chaojie Zhang, Íñigo Goiri, Ricardo Bianchini | 2025-09-30 | The rapid rise of large language models (LLMs) has been driving an enormous demand for AI inference infrastructure, mainly powered by high-end GPUs. |
| CSnake: Detecting Self-Sustaining Cascading Failure via Causal Stitching of Fault Propagations | Shangshu Qian, Lin Tan, Yongle Zhang | 2025-09-30 | Recent studies have revealed that self-sustaining cascading failures in distributed systems frequently lead to widespread outages, which are challenging to contain and recover from. |
| Tuning the Tuner: Introducing Hyperparameter Optimization for Auto-Tuning | Floris-Jan Willemsen, Rob V. van Nieuwpoort, Ben van Werkhoven | 2025-09-30 | Automatic performance tuning (auto-tuning) is widely used to optimize performance-critical applications across many scientific domains by finding the best program variant among many choices. |
| Efficient Construction of Large Search Spaces for Auto-Tuning | Floris-Jan Willemsen, Rob V. van Nieuwpoort, Ben van Werkhoven | 2025-09-30 | Automatic performance tuning, or auto-tuning, accelerates high-performance codes by exploring vast spaces of code variants. However, due to the large number of possible combinations and complex constr... |
| I Like To Move It -- Computation Instead of Data in the Brain | Fabian Czappa, Marvin Kaster, Felix Wolf | 2025-09-30 | The detailed functioning of the human brain is still poorly understood. Brain simulations are a well-established way to complement experimental research, but must contend with the computational demand... |
| Parallax: Efficient LLM Inference Service over Decentralized Environment | Chris Tong, Youhe Jiang, Gufeng Chen, Tianyi Zhao, Sibian Lu, Wenjie Qu, Eric Yang, Lynn Ai, Binhang Yuan | 2025-09-30 | Deploying a large language model (LLM) inference service remains costly because centralized serving depends on specialized GPU clusters and high-bandwidth interconnects in datacenters. |
| AGOCS -- Accurate Google Cloud Simulator Framework | Leszek Sliwko, Vladimir Getov | 2025-09-30 | This paper presents the Accurate Google Cloud Simulator (AGOCS) - a novel high-fidelity Cloud workload simulator based on parsing real workload traces, which can be conveniently used on a desktop mach... |
| Hybrid Dual-Batch and Cyclic Progressive Learning for Efficient Distributed Training | Kuan-Wei Lu, Ding-Yong Hong, Pangfeng Liu, Jan-Jan Wu | 2025-09-30 | Distributed machine learning is critical for training deep learning models on large datasets with numerous parameters. Current research primarily focuses on leveraging additional hardware resources an... |
| Enabling Time-Aware Priority Traffic Management over Distributed FPGA Nodes | Alberto Scionti, Paolo Savio, Francesco Lubrano, Federico Stirano, Antonino Nespola, Olivier Terzo, Corrado De Sio, Luca Sterpone | 2025-09-30 | Network Interface Cards (NICs) greatly evolved from simple basic devices moving traffic in and out of the network to complex heterogeneous systems offloading host CPUs from performing complex tasks on... |
| Accelerating LLM Inference with Precomputed Query Storage | Jay H. Park, Youngju Cho, Choungsol Lee, Moonwook Oh, Euiseong Seo | 2025-09-30 | Large language model (LLM) inference often suffers from high latency, particularly in resource-constrained environments such as on-device or edge deployments. |
| PAST: Pilot and Adaptive Orchestration for Timely and Resilient Service Delivery in Edge-Assisted UAV Networks under Spatio-Temporal Dynamics | Houyi Qi, Minghui Liwang, Liqun Fu, Sai Zou, Xinlei Yi, Wei Ni, Huaiyu Dai | 2025-09-30 | Incentive-driven resource trading is essential for UAV applications with intensive, time-sensitive computing demands. Traditional spot trading suffers from negotiation delays and high energy costs, wh... |
| Layerwise Federated Learning for Heterogeneous Quantum Clients using Quorus | Jason Han, Nicholas S. DiBrita, Daniel Leeds, Jianqiang Li, Jason Ludmir, Tirthak Patel | 2025-09-30 | Quantum machine learning (QML) holds the promise to solve classically intractable problems, but, as critical data can be fragmented across private clients, there is a need for distributed QML in a qua... |
| Adaptive and Resource-efficient Agentic AI Systems for Mobile and Embedded Devices: A Survey | Sicong Liu, Weiye Wu, Xiangrui Xu, Teng Li, Bowen Pang, Bin Guo, Zhiwen Yu | 2025-09-30 | Foundation models have reshaped AI by unifying fragmented architectures into scalable backbones with multimodal reasoning and contextual adaptation. |
| LAPIS: A Performance Portable, High Productivity Compiler Framework | Brian Kelley, Sivasankaran Rajamanickam | 2025-09-30 | Portability, performance, and productivity are three critical dimensions for evaluating a programming model or compiler infrastructure. Several modern programming models for computational science focu... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| A Review of Software for Designing and Operating Quantum Networks | Robert J. Hayek, Joaquin Chung, Rajkumar Kettimuthu | 2025-09-30 | Quantum network protocol development is crucial to realizing a production-grade network that can support distributed sensing, secure communication, and utility-scale quantum computation. |
| Indoor/Outdoor Spectrum Sharing Enabled by GNSS-based Classifiers | Hossein Nasiri, Muhammad Iqbal Rochman, Monisha Ghosh | 2025-09-30 | The desirability of the mid-band frequency range (1 - 10 GHz) for federal and commercial applications, combined with the growing applications for commercial indoor use-cases, such as factory automatio... |
| Introducing Large Language Models into the Design Flow of Time Sensitive Networking | Rubi Debnath, Luxi Zhao, Mohammadreza Barzegaran, Sebastian Steinhorst | 2025-09-30 | The growing demand for real-time, safety-critical systems has significantly increased both the adoption and complexity of Time Sensitive Networking (TSN). |
| Target Wake Time Scheduling for Time-sensitive and Energy-efficient Wi-Fi Networks | Fabio Busacca, Corrado Puligheddu, Francesco Raviglione, Riccardo Rusca, Claudio Casetti, Carla Fabiana Chiasserini, Sergio Palazzo | 2025-09-30 | Time Sensitive Networking (TSN) is fundamental for the reliable, low-latency networks that will enable the Industrial Internet of Things (IIoT). |
| Toward an Unbiased Collective Memory for Efficient LLM-Based Agentic 6G Cross-Domain Management | Hatim Chergui, Miguel Catalan Cid, Pouria Sayyad Khodashenas, Daniel Camps Mur, Christos Verikoukis | 2025-09-30 | This paper introduces a novel framework for proactive cross-domain resource orchestration in 6G RAN-Edge networks, featuring large language model (LLM)-augmented agents. |
| On Computing Top-k Simple Shortest Paths from a Single Source | Mattia D'Emidio, Gabriele Di Stefano | 2025-09-30 | We investigate the problem of computing the top-k simple shortest paths in weighted digraphs. While the single-pair variant -- finding the top-k simple shortest paths between two specified vertice... |
| Flexible-Sector 6DMA Base Station: Modeling and Design | Yunli Li, Xiaoming Shi, Xiaodan Shao, Jie Xu, Rui Zhang | 2025-09-30 | Six-dimensional movable antenna (6DMA) has emerged as a promising new technology for future wireless networks, which can adaptively adjust the three-dimensional (3D) positions and 3D rotations of ante... |
| Knowledge Defined Networking for 6G: A Reinforcement Learning Example for Resource Management | Erol Koçoğlu, Mehmet Ozdem, Tuğçe Bilen | 2025-09-30 | 6G networks are expected to revolutionize connectivity, offering significant improvements in speed, capacity, and smart automation. However, existing network designs will struggle to handle the demand... |
| User-Centric Comparison of 5G NTN and DVB-S2/RCS2 Using OpenAirInterface and OpenSAND | Sumit Kumar, Juan Carlos Estrada-Jimenez, Ion Turcanu | 2025-09-30 | The integration of satellite networks into next-generation mobile communication systems has gained considerable momentum with the advent of 5G Non-Terrestrial Networks (5G-NTN). |
| OpenID Connect for Agents (OIDC-A) 1.0: A Standard Extension for LLM-Based Agent Identity and Authorization | Subramanya Nagabhushanaradhya | 2025-09-30 | OpenID Connect for Agents (OIDC-A) 1.0 is an extension to OpenID Connect Core 1.0 that provides a comprehensive framework for representing, authenticating, and authorizing LLM-based agents within the ... |
| User-Centric Communication Service Provision for Edge-Assisted Mobile Augmented Reality | Conghao Zhou, Jie Gao, Shisheng Hu, Nan Cheng, Weihua Zhuang, Xuemin Shen | 2025-09-30 | Future 6G networks are envisioned to facilitate edge-assisted mobile augmented reality (MAR) via strengthening the collaboration between MAR devices and edge servers. |
| Intelligent Multi-link EDCA Optimization for Delay-Bounded QoS in Wi-Fi 7 | Peini Yi, Wenchi Cheng, Jingqing Wang, Jinzhe Pan, Yuehui Ouyang, Wei Zhang | 2025-09-30 | IEEE 802.11be (Wi-Fi 7) introduces Multi-Link Operation (MLO). While MLO offers significant parallelism and capacity, realizing its full potential in guaranteeing strict delay bounds and optimizin... |
| From Literature to Insights: Methodological Guidelines for Survey Writing in Communications Research | Dusit Niyato, Octavia A. Dobre, Trung Q. Duong, George K. Karagiannidis, Robert Schober | 2025-09-30 | The rapid growth of communications and networking research has created an unprecedented demand for high-quality survey and tutorial papers that can synthesize vast bodies of literature into coherent u... |
| CardioForest: An Explainable Ensemble Learning Model for Automatic Wide QRS Complex Tachycardia Diagnosis from ECG | Vaskar Chakma, Ju Xiaolin, Heling Cao, Xue Feng, Ji Xiaodong, Pan Haiyan, Gao Zhan | 2025-09-30 | This study aims to develop and evaluate an ensemble machine learning-based framework for the automatic detection of Wide QRS Complex Tachycardia (WCT) from ECG signals, emphasizing diagnostic accuracy... |
| Think Less, Label Better: Multi-Stage Domain-Grounded Synthetic Data Generation for Fine-Tuning Large Language Models in Telecommunications | Chenhua Shi, Gregor Macdonald, Bhavika Jalli, Wanlu Lei, John Zou, Mridul Jain, Joji Philip | 2025-09-30 | The success of large language models (LLMs) depends heavily on large-scale, high-quality instruction-following and reinforcement datasets. However, generating such data through human annotation is pro... |
| PAST: Pilot and Adaptive Orchestration for Timely and Resilient Service Delivery in Edge-Assisted UAV Networks under Spatio-Temporal Dynamics | Houyi Qi, Minghui Liwang, Liqun Fu, Sai Zou, Xinlei Yi, Wei Ni, Huaiyu Dai | 2025-09-30 | Incentive-driven resource trading is essential for UAV applications with intensive, time-sensitive computing demands. Traditional spot trading suffers from negotiation delays and high energy costs, wh... |
| Oh-Trust: Overbooking and Hybrid Trading-empowered Resource Scheduling with Smart Reputation Update over Dynamic Edge Networks | Houyi Qi, Minghui Liwang, Liqun Fu, Xianbin Wang, Huaiyu Dai, Xiaoyu Xia | 2025-09-30 | Incentive-driven computing resource sharing is crucial for meeting the ever-growing demands of emerging mobile applications. Although conventional spot trading offers a solution, it frequently leads t... |
| Deep Reinforcement Learning-Based Precoding for Multi-RIS-Aided Multiuser Downlink Systems with Practical Phase Shift | Po-Heng Chou, Bo-Ren Zheng, Wan-Jen Huang, Walid Saad, Yu Tsao, Ronald Y. Chang | 2025-09-30 | This study considers multiple reconfigurable intelligent surfaces (RISs)-aided multiuser downlink systems with the goal of jointly optimizing the transmitter precoding and RIS phase shift matrix to ma... |
| Capacity-Net-Based RIS Precoding Design without Channel Estimation for mmWave MIMO System | Chun-Yuan Huang, Po-Heng Chou, Wan-Jen Huang, Ying-Ren Chien, Yu Tsao | 2025-09-30 | In this paper, we propose Capacity-Net, a novel unsupervised learning approach aimed at maximizing the achievable rate in reflecting intelligent surface (RIS)-aided millimeter-wave (mmWave) multiple i... |

cs.PF - Performance

| Title | Authors | Date | Abstract |
| --- | --- | --- | --- |
| Regression Language Models for Code | Yash Akhauri, Xingyou Song, Arissa Wongpanich, Bryan Lewandowski, Mohamed S. Abdelfattah | 2025-09-30 | We study code-to-metric regression: predicting numeric outcomes of code executions, a challenging task due to the open-ended nature of programming languages. |
| Tuning the Tuner: Introducing Hyperparameter Optimization for Auto-Tuning | Floris-Jan Willemsen, Rob V. van Nieuwpoort, Ben van Werkhoven | 2025-09-30 | Automatic performance tuning (auto-tuning) is widely used to optimize performance-critical applications across many scientific domains by finding the best program variant among many choices. |
| Efficient Construction of Large Search Spaces for Auto-Tuning | Floris-Jan Willemsen, Rob V. van Nieuwpoort, Ben van Werkhoven | 2025-09-30 | Automatic performance tuning, or auto-tuning, accelerates high-performance codes by exploring vast spaces of code variants. However, due to the large number of possible combinations and complex constr... |
| AGOCS -- Accurate Google Cloud Simulator Framework | Leszek Sliwko, Vladimir Getov | 2025-09-30 | This paper presents the Accurate Google Cloud Simulator (AGOCS) - a novel high-fidelity Cloud workload simulator based on parsing real workload traces, which can be conveniently used on a desktop mach... |
| Hybrid Schemes of NIST Post-Quantum Cryptography Standard Algorithms and Quantum Key Distribution for Key Exchange and Digital Signature | Abel C. H. Chen | 2025-09-30 | Since the security of post-quantum cryptography (PQC) algorithms is based on the hardness of mathematical problems, while the security of quantum key distribution (QKD) relies on the fundamental princ... |