2025-10-12

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| HYPERDOA: Robust and Efficient DoA Estimation using Hyperdimensional Computing | Rajat Bhattacharjya, Woohyeok Park, Arnab Sarkar, Hyunwoo Oh, Mohsen Imani, Nikil Dutt | 2025-10-12 | Download | Direction of Arrival (DoA) estimation techniques face a critical trade-off, as classical methods often lack accuracy in challenging, low signal-to-noise ratio (SNR) conditions, while modern deep learn... |
| Bhasha-Rupantarika: Algorithm-Hardware Co-design approach for Multilingual Neural Machine Translation | Mukul Lokhande, Tanushree Dewangan, Mohd Sharik Mansoori, Tejas Chaudhari, Akarsh J., Damayanti Lokhande, Adam Teman, Santosh Kumar Vishvakarma | 2025-10-12 | Download | This paper introduces Bhasha-Rupantarika, a light and efficient multilingual translation system tailored through algorithm-hardware codesign for resource-limited settings. |
| ADiP: Adaptive-Precision Systolic Array for Matrix Multiplication Acceleration | Ahmed J. Abdelmaksoud, Cristian Sestito, Shiwei Wang, Themis Prodromakis | 2025-10-12 | Download | Transformers are at the core of modern AI nowadays. They rely heavily on matrix multiplication and require efficient acceleration due to their substantial memory and computational requirements. |
| Self-Attention to Operator Learning-based 3D-IC Thermal Simulation | Zhen Huang, Hong Wang, Wenkai Yang, Muxi Tang, Depeng Xie, Ting-Jung Lin, Yu Zhang, Wei W. Xing, Lei He | 2025-10-12 | Download | Thermal management in 3D ICs is increasingly challenging due to higher power densities. Traditional PDE-solving-based methods, while accurate, are too slow for iterative design. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| FIDRS: A Novel Framework for Integrated Distributed Reliable Systems | Mehdi Zekriyapanah Gashti | 2025-10-12 | Download | In this paper we represent a new framework for integrated distributed and reliable systems. In the proposed framework we have used three parts to increase Satisfaction and Performance of this framewor... |
| Fair Kernel-Lock-Free Claim/Release Protocol for Shared Object Access in Cooperatively Scheduled Runtimes | Kevin Chalmers, Jan Bækgaard Pedersen | 2025-10-12 | Download | We present the first spin-free, kernel-lock-free mutex that cooperates with user-mode schedulers and is formally proven FIFO-fair and linearizable using CSP/FDR. |
| SPHERE: Spherical partitioning for large-scale routing optimization | Robert Fabian Lindermann, Paul-Niklas Ken Kandora, Simon Caspar Zeller, Adrian Asmund Fessler, Steffen Rebennack | 2025-10-12 | Download | We study shortest-path routing in large weighted, undirected graphs, where expanding search frontiers raise time and memory costs for exact solvers. |
| CPU-Limits kill Performance: Time to rethink Resource Control | Chirag Shetty, Sarthak Chakraborty, Hubertus Franke, Larisa Shwartz, Chandra Narayanaswami, Indranil Gupta, Saurabh Jha | 2025-10-12 | Download | Research in compute resource management for cloud-native applications is dominated by the problem of setting optimal CPU limits -- a fundamental OS mechanism that strictly restricts a container's CPU ... |
| DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism | Chenyu Jiang, Zhenkun Cai, Ye Tian, Zhen Jia, Yida Wang, Chuan Wu | 2025-10-12 | Download | Context parallelism has emerged as a key technique to support long-context training, a growing trend in generative AI for modern large models. |
| Multitask Learning with Learned Task Relationships | Zirui Wan, Stefan Vlaski | 2025-10-12 | Download | Classical consensus-based strategies for federated and decentralized learning are statistically suboptimal in the presence of heterogeneous local data or task distributions. |
| A Verified High-Performance Composable Object Library for Remote Direct Memory Access (Extended Version) | Guillaume Ambal, George Hodgkins, Mark Madler, Gregory Chockler, Brijesh Dongol, Joseph Izraelevitz, Azalea Raad, Viktor Vafeiadis | 2025-10-12 | Download | Remote Direct Memory Access (RDMA) is a memory technology that allows remote devices to directly write to and read from each other's memory, bypassing components such as the CPU and operating system. |
| FLAMMABLE: A Multi-Model Federated Learning Framework with Multi-Model Engagement and Adaptive Batch Sizes | Shouxu Lin, Zimeng Pan, Yuhang Yao, Haeyoung Noh, Pei Zhang, Carlee Joe-Wong | 2025-10-12 | Download | Multi-Model Federated Learning (MMFL) is an emerging direction in Federated Learning (FL) where multiple models are trained in parallel, generally on various datasets. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A Framework for AI-Native Semantic-Based Dynamic Slicing for 6G Networks | Mayukh Roy Chowdhury, Eman Hammad, Lauri Loven, Susanna Pirttikangas, Aloizio P da Silva, Walid Saad | 2025-10-12 | Download | In the ensuing ultra-dense and diverse environment in future 6G communication networks, it will be critical to optimize network resources via mechanisms that recognize and cater to the diversity,... |

cs.OS - Operating Systems

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CPU-Limits kill Performance: Time to rethink Resource Control | Chirag Shetty, Sarthak Chakraborty, Hubertus Franke, Larisa Shwartz, Chandra Narayanaswami, Indranil Gupta, Saurabh Jha | 2025-10-12 | Download | Research in compute resource management for cloud-native applications is dominated by the problem of setting optimal CPU limits -- a fundamental OS mechanism that strictly restricts a container's CPU ... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| CPU-Limits kill Performance: Time to rethink Resource Control | Chirag Shetty, Sarthak Chakraborty, Hubertus Franke, Larisa Shwartz, Chandra Narayanaswami, Indranil Gupta, Saurabh Jha | 2025-10-12 | Download | Research in compute resource management for cloud-native applications is dominated by the problem of setting optimal CPU limits -- a fundamental OS mechanism that strictly restricts a container's CPU ... |
| CAPSim: A Fast CPU Performance Simulator Using Attention-based Predictor | Buqing Xu, Jianfeng Zhu, Yichi Zhang, Qinyi Cai, Guanhua Li, Shaojun Wei, Leibo Liu | 2025-10-12 | Download | CPU simulators are vital for computer architecture research, primarily for estimating performance under different programs. This poses challenges for fast and accurate simulation of modern CPUs, espec... |
