2025-12-19
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Optimal Software Pipelining and Warp Specialization for Tensor Core GPUs | Rupanshu Soi, Rohan Yadav, Fredrik Kjolstad, Alex Aiken, Maryam Mehri Dehnavi, Michael Garland, Michael Bauer | 2025-12-19 | Download | GPU architectures have continued to grow in complexity, with recent incarnations introducing increasingly powerful fixed-function units for matrix multiplication and data movement to accompany highly ... |
| PermuteV: A Performant Side-channel-Resistant RISC-V Core Securing Edge AI Inference | Nuntipat Narkthong, Xiaolin Xu | 2025-12-19 | Download | Edge AI inference is becoming prevalent thanks to the emergence of small yet high-performance microprocessors. This shift from cloud to edge processing brings several benefits in terms of energy savin... |
| A 14ns-Latency 9Gb/s 0.44mm² 62pJ/b Short-Blocklength LDPC Decoder ASIC in 22FDX | Darja Nonaca, Jérémy Guichemerre, Reinhard Wiesmayr, Nihat Engin Tunali, Christoph Studer | 2025-12-19 | Download | Ultra-reliable low latency communication (URLLC) is a key part of 5G wireless systems. Achieving low latency necessitates codes with short blocklengths for which polar codes with successive cancellati... |
| LLM-based Behaviour Driven Development for Hardware Design | Rolf Drechsler, Qian Liu | 2025-12-19 | Download | Test and verification are essential activities in hardware and system design, but their complexity grows significantly with increasing system sizes. |
| Torrent: A Distributed DMA for Efficient and Flexible Point-to-Multipoint Data Movement | Yunhao Deng, Fanchen Kong, Xiaoling Yi, Ryan Antonio, Marian Verhelst | 2025-12-19 | Download | The growing disparity between computational power and on-chip communication bandwidth is a critical bottleneck in modern Systems-on-Chip (SoCs), especially for data-parallel workloads like AI. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Constrained Cuts, Flows, and Lattice-Linearity | Robert Streit, Vijay K. Garg | 2025-12-19 | Download | In a capacitated directed graph, it is known that the set of all min-cuts forms a distributive lattice [1], [2]. Here, we describe this lattice as a regular predicate whose forbidden elements can be a... |
| ACE-Sync: An Adaptive Cloud-Edge Synchronization Framework for Communication-Efficient Large-Scale Distributed Model Training | Yi Yang, Ziyu Lin, Liesheng Wei | 2025-12-19 | Download | Large-scale deep learning models impose substantial communication overhead in distributed training, particularly in bandwidth-constrained or heterogeneous cloud-edge environments. |
| Asymptotic behaviour of galactic small-scale dynamos at modest magnetic Prandtl number | Frederick A. Gent, Mordecai-Mark Mac Low, Maarit J. Korpi-Lagg, Touko Puro, Matthias Reinhardt | 2025-12-19 | Download | Magnetic fields are critical at many scales to galactic dynamics and structure, including multiphase pressure balance, dust processing, and star formation. |
| Torrent: A Distributed DMA for Efficient and Flexible Point-to-Multipoint Data Movement | Yunhao Deng, Fanchen Kong, Xiaoling Yi, Ryan Antonio, Marian Verhelst | 2025-12-19 | Download | The growing disparity between computational power and on-chip communication bandwidth is a critical bottleneck in modern Systems-on-Chip (SoCs), especially for data-parallel workloads like AI. |
| Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing | Lingxiao Zhao, Haoran Zhou, Yuezhi Che, Dazhao Cheng | 2025-12-19 | Download | Multimodal large language models (MLLMs) extend LLMs with visual understanding through a three-stage pipeline: multimodal preprocessing, vision encoding, and LLM inference. |
| iOS as Acceleration | Alexander K. Chen | 2025-12-19 | Download | Practical utilization of large-scale machine learning requires a powerful compute setup, a necessity which poses a significant barrier to engagement with such artificial intelligence in more restricte... |
| The HEAL Data Platform | Brienna M. Larrick, L. Philip Schumm, Mingfei Shao, Craig Barnes, Anthony Juehne, Hara Prasad Juvvla, Michael B. Kranz, Michael Lukowski, Clint Malson, Jessica N. Mazerik, Christopher G. Meyer, Jawad Qureshi, Erin Spaniol, Andrea Tentner, Alexander VanTol, Peter Vassilatos, Sara Volk de Garcia, Robert L. Grossman | 2025-12-19 | Download | Objective: The objective was to develop a cloud-based, federated system to serve as a single point of search, discovery and analysis for data generated under the NIH Helping to End Addiction Long-term... |
| Democratizing Scalable Cloud Applications: Transactional Stateful Functions on Streaming Dataflows | Kyriakos Psarakis | 2025-12-19 | Download | Web applications underpin much of modern digital life, yet building scalable and consistent cloud applications remains difficult, requiring expertise across cloud computing, distributed systems, datab... |
| Adaptive Graph Pruning with Sudden-Events Evaluation for Traffic Prediction using Online Semi-Decentralized ST-GNNs | Ivan Kralj, Lodovico Giaretta, Gordan Ježić, Ivana Podnar Žarko, Šarūnas Girdzijauskas | 2025-12-19 | Download | Spatio-Temporal Graph Neural Networks (ST-GNNs) are well-suited for processing high-frequency data streams from geographically distributed sensors in smart mobility systems. |
| Scalable Distributed Vector Search via Accuracy Preserving Index Construction | Yuming Xu, Qianxi Zhang, Qi Chen, Baotong Lu, Menghao Li, Philip Adams, Mingqin Li, Zengzhong Li, Jing Liu, Cheng Li, Fan Yang | 2025-12-19 | Download | Scaling Approximate Nearest Neighbor Search (ANNS) to billions of vectors requires distributed indexes that balance accuracy, latency, and throughput. |
| Practical Framework for Privacy-Preserving and Byzantine-robust Federated Learning | Baolei Zhang, Minghong Fang, Zhuqing Liu, Biao Yi, Peizhao Zhou, Yuan Wang, Tong Li, Zheli Liu | 2025-12-19 | Download | Federated Learning (FL) allows multiple clients to collaboratively train a model without sharing their private data. However, FL is vulnerable to Byzantine attacks, where adversaries manipulate client... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| STAR: Semantic-Traffic Alignment and Retrieval for Zero-Shot HTTPS Website Fingerprinting | Yifei Cheng, Yujia Zhu, Baiyang Li, Xinhao Deng, Yitong Cai, Yaochen Ren, Qingyun Liu | 2025-12-19 | Download | Modern HTTPS mechanisms such as Encrypted Client Hello (ECH) and encrypted DNS improve privacy but remain vulnerable to website fingerprinting (WF) attacks, where adversaries infer visited sites from ... |
| Binding Agent ID: Unleashing the Power of AI Agents with accountability and credibility | Zibin Lin, Shengli Zhang, Guofu Liao, Dacheng Tao, Taotao Wang | 2025-12-19 | Download | Autonomous AI agents lack traceable accountability mechanisms, creating a fundamental dilemma where systems must either operate as "downgraded tools" or risk real-world abuse. |
| A decomposition approach for large virtual network embedding | Amal Benhamiche, Pierre Fouilhoux, Lucas Létocart, Nancy Perrot, Alexis Schneider | 2025-12-19 | Download | Virtual Network Embedding (VNE) is the core combinatorial problem of Network Slicing, a 5G technology which enables telecommunication operators to propose diverse service-dedicated virtual networks, e... |
| Timely Information Updating for Mobile Devices Without and With ML Advice | Yu-Pin Hsu, Yi-Hsuan Tseng | 2025-12-19 | Download | This paper investigates an information update system in which a mobile device monitors a physical process and sends status updates to an access point (AP). |
| Quantum-enhanced Information Retrieval from Reflective Intelligent Surfaces | Shiqian Guo, Tingxiang Ji, Jianqing Liu | 2025-12-19 | Download | Information retrieval from passive backscatter systems is widely used in digital applications with tight energy budgets, short communication distances, and low data rates. |
| Enhancing AIGC Service Efficiency with Adaptive Multi-Edge Collaboration in A Distributed System | Changfu Xu, Jianxiong Guo, Jiandian Zeng, Houming Qiu, Tian Wang, Xiaowen Chu, Jiannong Cao | 2025-12-19 | Download | The Artificial Intelligence Generated Content (AIGC) technique has gained significant traction for producing diverse content. However, existing AIGC services typically operate within a centralized fra... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| On General Linearly Implicit Quantized State System Methods | Mariana Bergonzi, Joaquín Fernández, Ernesto Kofman | 2025-12-19 | Download | This work proposes a methodology to develop new numerical integration algorithms for ordinary differential equations based on state quantization, generalizing the notions of Linearly Implicit Quantize... |
| GreedySnake: Accelerating SSD-Offloaded LLM Training with Efficient Scheduling and Optimizer Step Overlapping | Yishu Yin, Xuehai Qian | 2025-12-19 | Download | SSD-offloaded training offers a practical and promising approach to making LLM training cost-effective. Building on gradient accumulation with micro-batches, this paper introduces GreedySnake, a new S... |