2025-12-07
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Accurate Models of NVIDIA Tensor Cores | Faizan A. Khattak, Mantas Mikaitis | 2025-12-07 | Download | Matrix multiplication is a fundamental operation in both training of neural networks and inference. To accelerate matrix multiplication, Graphical Processing Units (GPUs) provide it implemented in har... |
| ArchPower: Dataset for Architecture-Level Power Modeling of Modern CPU Design | Qijun Zhang, Yao Lu, Mengming Li, Shang Liu, Zhiyao Xie | 2025-12-07 | Download | Power is the primary design objective of large-scale integrated circuits (ICs), especially for complex modern processors (i.e., CPUs). Accurate CPU power evaluation requires designers to go through th... |
| Formal that "Floats" High: Formal Verification of Floating Point Arithmetic | Hansa Mohanty, Vaisakh Naduvodi Viswambharan, Deepak Narayan Gadde | 2025-12-07 | Download | Formal verification of floating-point arithmetic remains challenging due to non-linear arithmetic behavior and the tight coupling between control and datapath logic. |
| GPU-Accelerated Optimization Solver for Unit Commitment in Large-Scale Power Grids | Hussein Sharadga, Javad Mohammadi | 2025-12-07 | Download | This work presents a GPU-accelerated solver for the unit commitment (UC) problem in large-scale power grids. The solver uses the Primal-Dual Hybrid Gradient (PDHG) algorithm to efficiently solve the r... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Optimizing video analytics inference pipelines: a case study | Saeid Ghafouri, Yuming Ding, Katerine Diaz Chito, Jesús Martinez del Rincón, Niamh O'Connell, Hans Vandierendonck | 2025-12-07 | Download | Cost-effective and scalable video analytics are essential for precision livestock monitoring, where high-resolution footage and near-real-time monitoring needs from commercial farms generates substant... |
| ELANA: A Simple Energy and Latency Analyzer for LLMs | Hung-Yueh Chiang, Bokun Wang, Diana Marculescu | 2025-12-07 | Download | The latency and power consumption of large language models (LLMs) are major constraints when serving them across a wide spectrum of hardware platforms, from mobile edge devices to cloud GPU clusters. |
| A Chunked-Object Pattern for Multi-Region Large Payload Storage in Managed NoSQL Databases | Manideep Reddy Chinthareddy | 2025-12-07 | Download | Many managed key-value and NoSQL databases - such as Amazon DynamoDB, Azure Cosmos DB, and Google Cloud Firestore - enforce strict maximum item sizes (e.g., 400 KB in DynamoDB). |
| Cloud Revolution: Tracing the Origins and Rise of Cloud Computing | Deepa Gurung, S M Zia Ur Rashid, Zain ul Abdeen, Suman Rath | 2025-12-07 | Download | The history behind the development of cloud computing is more than several decades of technological progress in the fields of virtualization, distributed systems, and high-speed networking, but its cu... |
| Stable-MoE: Lyapunov-based Token Routing for Distributed Mixture-of-Experts Training over Edge Networks | Long Shi, Bingyan Ou, Kang Wei, Weihao Zhu, Zhe Wang, Zhiyong Chen | 2025-12-07 | Download | The sparse activation mechanism of mixture of experts (MoE) model empowers edge intelligence with enhanced training efficiency and reduced computational resource consumption. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Managed TLS Under Migration: Authentication Authority Across CDN and Hosting Transitions | Daniyal Ganiuly, Nurzhau Bolatbek, Assel Smaiyl | 2025-12-07 | Download | Managed TLS has become a common approach for deploying HTTPS, with platforms generating and storing private keys and automating certificate issuance on behalf of domain operators. |
| Permission Manifests for Web Agents | Samuele Marro, Alan Chan, Xinxing Ren, Lewis Hammond, Jesse Wright, Gurjyot Wanga, Tiziano Piccardi, Nuno Campos, Tobin South, Jialin Yu, Sunando Sengupta, Eric Sommerlade, Alex Pentland, Philip Torr, Jiaxin Pei | 2025-12-07 | Download | The rise of Large Language Model (LLM)-based web agents represents a significant shift in automated interactions with the web. Unlike traditional crawlers that follow simple conventions, such as robot... |
| AQUILA: A QUIC-Based Link Architecture for Resilient Long-Range UAV Communication | Ximing Huang, Yirui Rao | 2025-12-07 | Download | The proliferation of autonomous Unmanned Aerial Vehicles (UAVs) in Beyond Visual Line of Sight (BVLOS) applications is critically dependent on resilient, high-bandwidth, and low-latency communication ... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Block Sparse Flash Attention | Daniel Ohayon, Itay Lamprecht, Itay Hubara, Israel Cohen, Daniel Soudry, Noam Elata | 2025-12-07 | Download | Modern large language models increasingly require long contexts for reasoning and multi-document tasks, but attention's quadratic complexity creates a severe computational bottleneck. |
| Predictive Modeling of I/O Performance for Machine Learning Training Pipelines: A Data-Driven Approach to Storage Optimization | Karthik Prabhakar, Durgamadhab Mishra | 2025-12-07 | Download | Modern machine learning training is increasingly bottlenecked by data I/O rather than compute. GPUs often sit idle at below 50% utilization waiting for data. |