
2025-11-20

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Vorion: A RISC-V GPU with Hardware-Accelerated 3D Gaussian Rendering and Training | Yipeng Wang, Mengtian Yang, Chieh-pu Lo, Jaydeep P. Kulkarni | 2025-11-20 | Download | 3D Gaussian Splatting (3DGS) has recently emerged as a foundational technique for real-time neural rendering, 3D scene generation, volumetric video (4D) capture. |
| Unsupervised Graph Neural Network Framework for Balanced Multipatterning in Advanced Electronic Design Automation Layouts | Abdelrahman Helaly, Nourhan Sakr, Kareem Madkour, Ilhami Torunoglu | 2025-11-20 | Download | Multipatterning is an essential decomposition strategy in electronic design automation (EDA) that overcomes lithographic limitations when printing dense circuit layouts. |
| CIMinus: Empowering Sparse DNN Workloads Modeling and Exploration on SRAM-based CIM Architectures | Yingjie Qi, Jianlei Yang, Rubing Yang, Cenlin Duan, Xiaolin He, Ziyan He, Weitao Pan, Weisheng Zhao | 2025-11-20 | Download | Compute-in-memory (CIM) has emerged as a pivotal direction for accelerating workloads in the field of machine learning, such as Deep Neural Networks (DNNs). |
| Mitigating Shared Storage Congestion Using Control Theory | Thomas Collignon, Kouds Halitim, Raphaël Bleuse, Sophie Cerf, Bogdan Robu, Éric Rutten, Lionel Seinturier, Alexandre van Kempen | 2025-11-20 | Download | Efficient data access in High-Performance Computing (HPC) systems is essential to the performance of intensive computing tasks. Traditional optimizations of the I/O stack aim to improve peak performan... |
| KAN-SAs: Efficient Acceleration of Kolmogorov-Arnold Networks on Systolic Arrays | Sohaib Errabii, Olivier Sentieys, Marcello Traiola | 2025-11-20 | Download | Kolmogorov-Arnold Networks (KANs) have garnered significant attention for their promise of improved parameter efficiency and explainability compared to traditional Deep Neural Networks (DNNs). |
| Can Asymmetric Tile Buffering Be Beneficial? | Chengyue Wang, Wesley Pang, Xinrui Wu, Gregory Jun, Luis Romero, Endri Taka, Diana Marculescu, Tony Nowatzki, Pranathi Vasireddy, Joseph Melber, Deming Chen, Jason Cong | 2025-11-20 | Download | General matrix multiplication (GEMM) is the computational backbone of modern AI workloads, and its efficiency is critically dependent on effective tiling strategies. |
| A Scalable NorthPole System with End-to-End Vertical Integration for Low-Latency and Energy-Efficient LLM Inference | Michael V. DeBole, Rathinakumar Appuswamy, Neil McGlohon, Brian Taba, Steven K. Esser, Filipp Akopyan, John V. Arthur, Arnon Amir, Alexander Andreopoulos, Peter J. Carlson, Andrew S. Cassidy, Pallab Datta, Myron D. Flickner, Rajamohan Gandhasri, Guillaume J. Garreau, Megumi Ito, Jennifer L. Klamo, Jeffrey A. Kusnitz, Nathaniel J. McClatchey, Jeffrey L. McKinstry, Tapan K. Nayak, Carlos Ortega Otero, Hartmut Penner, William P. Risk, Jun Sawada, Jay Sivagnaname, Daniel F. Smith, Rafael Sousa, Ignacio Terrizzano, Takanori Ueda, Trent Gray-Donald, David Cox, Dharmendra S. Modha | 2025-11-20 | Download | A vertically integrated, end-to-end, research prototype system combines 288 NorthPole neural inference accelerator cards, offline training algorithms, a high-performance runtime stack, and a container... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter | Qinghao Hu, Shang Yang, Junxian Guo, Xiaozhe Yao, Yujun Lin, Yuxian Gu, Han Cai, Chuang Gan, Ana Klimovic, Song Han | 2025-11-20 | Download | The emergence of Large Language Models (LLMs) with strong reasoning capabilities marks a significant milestone, unlocking new frontiers in complex problem-solving. |
| Distributed MIS Algorithms for Rational Agents using Games | Nithin Salevemula, Shreyas Pai | 2025-11-20 | Download | We study the problem of computing a Maximal Independent Set (MIS) in distributed networks where each node is a rational agent whose payoff depends on whether it joins the MIS. |
| Optimizing Federated Learning in the Era of LLMs: Message Quantization and Streaming | Ziyue Xu, Zhihong Zhang, Holger R. Roth, Chester Chen, Yan Cheng, Andrew Feng | 2025-11-20 | Download | Federated Learning (FL) offers a promising solution for training machine learning models across distributed data sources while preserving data privacy. |
| A Fast Relax-and-Round Approach to Unit Commitment for Data Center Own Generation | Shaked Regev, Eve Tsybina, Slaven Peles | 2025-11-20 | Download | The rapid growth of data centers increasingly requires data center operators to "bring own generation" to complement the available utility power plants to supply all or part of data center load. |
| Optimizations on Graph-Level for Domain Specific Computations in Julia and Application to QED | Anton Reinhard, Simeon Ehrig, René Widera, Michael Bussmann, Uwe Hernandez Acosta | 2025-11-20 | Download | Complex computational problems in science often consist of smaller parts that can have largely distinct compute requirements from one another. |
| Fast LLM Post-training via Decoupled and Fastest-of-N Speculation | Rongxin Cheng, Kai Zhou, Xingda Wei, Siyuan Liu, Mingcong Han, Mingjing Ai, Yeju Zhou, Baoquan Zhong, Wencong Xiao, Rong Chen, Haibo Chen | 2025-11-20 | Download | Rollout dominates the training time in large language model (LLM) post-training, where the trained model is used to generate tokens given a batch of prompts. |
| Mitigating Shared Storage Congestion Using Control Theory | Thomas Collignon, Kouds Halitim, Raphaël Bleuse, Sophie Cerf, Bogdan Robu, Éric Rutten, Lionel Seinturier, Alexandre van Kempen | 2025-11-20 | Download | Efficient data access in High-Performance Computing (HPC) systems is essential to the performance of intensive computing tasks. Traditional optimizations of the I/O stack aim to improve peak performan... |
| Pipelined Dense Symmetric Eigenvalue Decomposition on Multi-GPU Architectures | Hansheng Wang, Ruiyi Zhan, Dajun Huang, Xingchen Liu, Qiao Li, Hancong Duan, Dingwen Tao, Guangming Tan, Shaoshuai Zhang | 2025-11-20 | Download | Large symmetric eigenvalue problems are commonly observed in many disciplines such as Chemistry and Physics, and several libraries including cuSOLVERMp, MAGMA and ELPA support computing large eigenval... |
| Can Asymmetric Tile Buffering Be Beneficial? | Chengyue Wang, Wesley Pang, Xinrui Wu, Gregory Jun, Luis Romero, Endri Taka, Diana Marculescu, Tony Nowatzki, Pranathi Vasireddy, Joseph Melber, Deming Chen, Jason Cong | 2025-11-20 | Download | General matrix multiplication (GEMM) is the computational backbone of modern AI workloads, and its efficiency is critically dependent on effective tiling strategies. |
| Digital Agriculture Sandbox for Collaborative Research | Osama Zafar, Rosemarie Santa González, Alfonso Morales, Erman Ayday | 2025-11-20 | Download | Digital agriculture is transforming the way we grow food by utilizing technology to make farming more efficient, sustainable, and productive. This modern approach to agriculture generates a wealth of ... |
| Efficient Chromosome Parallelization for Precision Medicine Genomic Workflows | Daniel Mas Montserrat, Ray Verma, Míriam Barrabés, Francisco M. de la Vega, Carlos D. Bustamante, Alexander G. Ioannidis | 2025-11-20 | Download | Large-scale genomic workflows used in precision medicine can process datasets spanning tens to hundreds of gigabytes per sample, leading to high memory spikes, intensive disk I/O, and task failures du... |
| Optimizing Communication in Byzantine Agreement Protocols with Slim-HBBFT | Nasit S Sony, Xianzhong Ding | 2025-11-20 | Download | Byzantine agreement protocols in asynchronous networks have received renewed interest because they do not rely on network behavior to achieve termination. |
| A Scalable NorthPole System with End-to-End Vertical Integration for Low-Latency and Energy-Efficient LLM Inference | Michael V. DeBole, Rathinakumar Appuswamy, Neil McGlohon, Brian Taba, Steven K. Esser, Filipp Akopyan, John V. Arthur, Arnon Amir, Alexander Andreopoulos, Peter J. Carlson, Andrew S. Cassidy, Pallab Datta, Myron D. Flickner, Rajamohan Gandhasri, Guillaume J. Garreau, Megumi Ito, Jennifer L. Klamo, Jeffrey A. Kusnitz, Nathaniel J. McClatchey, Jeffrey L. McKinstry, Tapan K. Nayak, Carlos Ortega Otero, Hartmut Penner, William P. Risk, Jun Sawada, Jay Sivagnaname, Daniel F. Smith, Rafael Sousa, Ignacio Terrizzano, Takanori Ueda, Trent Gray-Donald, David Cox, Dharmendra S. Modha | 2025-11-20 | Download | A vertically integrated, end-to-end, research prototype system combines 288 NorthPole neural inference accelerator cards, offline training algorithms, a high-performance runtime stack, and a container... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A streaming algorithm and hardware accelerator for top-K flow detection in network traffic | Carolina Gallardo-Pavesi, Yaime Fernández, Javier E. Soto, Cecilia Hernández, Miguel Figueroa | 2025-11-20 | Download | Identifying the largest K flows in network traffic is an important task for applications such as flow scheduling and anomaly detection, which aim to improve network efficiency and security. |
| Performance Comparison of 5G NR Uplink MIMO and Uplink Carrier Aggregations on Commercial Network | Henry Shao, Kasidis Arunruangsirilert | 2025-11-20 | Download | Demands for uplink on mobile networks are increasing with the rapid development of social media platforms, 4K/8K content creation, IoT applications, and Fixed Wireless Access (FWA) broadband. |
| Optimizing Quantum Key Distribution Network Performance using Graph Neural Networks | Akshit Pramod Anchan, Ameiy Acharya, Leki Chom Thungon | 2025-11-20 | Download | This paper proposes an optimization of Quantum Key Distribution (QKD) Networks using Graph Neural Networks (GNN) framework. Today, the development of quantum computers threatens the security systems o... |
| Payment-failure times for random Lightning paths | Taki E. M. Abedesselam, Fabio Giacomelli, Francesco Pasquale, Michele Salvi | 2025-11-20 | Download | We study a random process over graphs inspired by the way payments are executed in the Lightning Network, the main layer-two solution on top of Bitcoin. |
| Reasoning Meets Representation: Envisioning Neuro-Symbolic Wireless Foundation Models | Jaron Fontaine, Mohammad Cheraghinia, John Strassner, Adnan Shahid, Eli De Poorter | 2025-11-20 | Download | Recent advances in Wireless Physical Layer Foundation Models (WPFMs) promise a new paradigm of universal Radio Frequency (RF) representations. |
| Toward hyper-adaptive AI-enabled 6G networks for energy efficiency: techniques, classifications and tradeoffs | Mariem Zayene, Oussama Habachi, Gerard Chalhoub | 2025-11-20 | Download | Energy efficiency is shaping up to be one of the most challenging issues for 6G networks. The reason is fairly straightforward: Networks will need to meet extreme service demands while remaining susta... |
| Multi-Band Wireless Access-and-Backhaul (WAB) for 5G: Implementation and Experiments | Chiara Rubaltelli, Marcello Morini, Eugenio Moro, Ilario Filippini | 2025-11-20 | Download | Highly dynamic and mobile applications, such as vehicular networks, require stable connectivity, which is often challenging to achieve. Network densification is a key approach to address this issue an... |
| Green Distributed AI Training: Orchestrating Compute Across Renewable-Powered Micro Datacenters | Giuseppe Tomei, Andrea Mayer, Giuseppe Alcini, Stefano Salsano | 2025-11-20 | Download | The accelerating expansion of AI workloads is colliding with an energy landscape increasingly dominated by intermittent renewable generation. While vast quantities of zero-carbon energy are routinely ... |
| Bio-inspired Integrated Networking and Control for Large-Scale Swarm: A Hierarchical Co-design | Huan Lin, Dakai Liu, Lianghui Ding, Lin Wang, Feng Yang | 2025-11-20 | Download | Unmanned aerial vehicle (UAV) swarms encounter the challenge of high overhead due to both network management and formation control requirements. |
| Modeling Pointing, Acquisition, and Tracking Delays in Free-Space Optical Satellite Networks | Jason Gerard, Juan A. Fraire, Sandra Céspedes | 2025-11-20 | Download | Free-space optical inter-satellite links (OISLs) enable high-capacity space communications but require precise Pointing, Acquisition, and Tracking (PAT) between links. |
| Graph-Aware Temporal Encoder Based Service Migration and Resource Allocation in Satellite Networks | Haotong Wang, Jun Du, Chunxiao Jiang, Jintao Wang, Mérouane Debbah, Zhu Han | 2025-11-20 | Download | The rapid expansion of latency-sensitive applications has sparked renewed interest in deploying edge computing capabilities aboard satellite constellations, aiming to achieve truly global and seamless... |
| Machine Learning Epidemic Predictions Using Agent-based Wireless Sensor Network Models | Chukwunonso Henry Nwokoye, Blessing Oluchi, Sharna Waldron, Peace Ezzeh | 2025-11-20 | Download | The lack of epidemiological data in wireless sensor networks (WSNs) is a fundamental difficulty in constructing robust models to forecast and mitigate threats such as viruses and worms. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Algorithms and optimizations for global non-linear hybrid fluid-kinetic finite element stellarator simulations | Luca Venerando Greco | 2025-11-20 | Download | Predictive modeling of stellarator plasmas is crucial for advancing nuclear fusion energy, yet it faces unique computational difficulties. One of the main challenges is accurately simulating the dynam... |
| Optimizations on Graph-Level for Domain Specific Computations in Julia and Application to QED | Anton Reinhard, Simeon Ehrig, René Widera, Michael Bussmann, Uwe Hernandez Acosta | 2025-11-20 | Download | Complex computational problems in science often consist of smaller parts that can have largely distinct compute requirements from one another. |
| Can Asymmetric Tile Buffering Be Beneficial? | Chengyue Wang, Wesley Pang, Xinrui Wu, Gregory Jun, Luis Romero, Endri Taka, Diana Marculescu, Tony Nowatzki, Pranathi Vasireddy, Joseph Melber, Deming Chen, Jason Cong | 2025-11-20 | Download | General matrix multiplication (GEMM) is the computational backbone of modern AI workloads, and its efficiency is critically dependent on effective tiling strategies. |
| Efficient Chromosome Parallelization for Precision Medicine Genomic Workflows | Daniel Mas Montserrat, Ray Verma, Míriam Barrabés, Francisco M. de la Vega, Carlos D. Bustamante, Alexander G. Ioannidis | 2025-11-20 | Download | Large-scale genomic workflows used in precision medicine can process datasets spanning tens to hundreds of gigabytes per sample, leading to high memory spikes, intensive disk I/O, and task failures du... |
