2025-11-17
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| NL-DPE: An Analog In-memory Non-Linear Dot Product Engine for Efficient CNN and LLM Inference | Lei Zhao, Luca Buonanno, Archit Gajjar, John Moon, Aishwarya Natarajan, Sergey Serebryakov, Ron M. Roth, Xia Sheng, Youtao Zhang, Paolo Faraboschi, Jim Ignowski, Giacomo Pedretti | 2025-11-17 | Download | Resistive Random Access Memory (RRAM) based in-memory computing (IMC) accelerators offer significant performance and energy advantages for deep neural networks (DNNs), but face three major limitations... |
| QUILL: An Algorithm-Architecture Co-Design for Cache-Local Deformable Attention | Hyunwoo Oh, Hanning Chen, Sanggeon Yun, Yang Ni, Wenjun Huang, Tamoghno Das, Suyeon Jang, Mohsen Imani | 2025-11-17 | Download | Deformable transformers deliver state-of-the-art detection but map poorly to hardware due to irregular memory access and low arithmetic intensity. |
| T-SAR: A Full-Stack Co-design for CPU-Only Ternary LLM Inference via In-Place SIMD ALU Reorganization | Hyunwoo Oh, KyungIn Nam, Rajat Bhattacharjya, Hanning Chen, Tamoghno Das, Sanggeon Yun, Suyeon Jang, Andrew Ding, Nikil Dutt, Mohsen Imani | 2025-11-17 | Download | Recent advances in LLMs have outpaced the computational and memory capacities of edge platforms that primarily employ CPUs, thereby challenging efficient and scalable deployment. |
| Coliseum project: Correlating climate change data with the behavior of heritage materials | A Cormier, David Roqui, Fabrice Surma, Martin Labouré, Jean-Marc Vallet, Odile Guillon, N Grozavu, Ann Bourgès | 2025-11-17 | Download | Heritage materials are already affected by climate change, and increasing climatic variations reduces the lifespan of monuments. As weathering depends on many factors, it is also difficult to link its... |
| Assessing Large Language Models in Generating RTL Design Specifications | Hung-Ming Huang, Yu-Hsin Yang, Fu-Chieh Chang, Yun-Chia Hsu, Yin-Yu Lin, Ming-Fang Tsai, Chun-Chih Yang, Pei-Yuan Wu | 2025-11-17 | Download | As IC design grows more complex, automating comprehension and documentation of RTL code has become increasingly important. Engineers currently should manually interpret existing RTL code and write spe... |
| Think with Self-Decoupling and Self-Verification: Automated RTL Design with Backtrack-ToT | Zhiteng Chao, Yonghao Wang, Xinyu Zhang, Jiaxin Zhou, Tenghui Hua, Husheng Han, Tianmeng Yang, Jianan Mu, Bei Yu, Rui Zhang, Jing Ye, Huawei Li | 2025-11-17 | Download | Large language models (LLMs) hold promise for automating integrated circuit (IC) engineering using register transfer level (RTL) hardware description languages (HDLs) like Verilog. |
| Neo: Real-Time On-Device 3D Gaussian Splatting with Reuse-and-Update Sorting Acceleration | Changhun Oh, Seongryong Oh, Jinwoo Hwang, Yoonsung Kim, Hardik Sharma, Jongse Park | 2025-11-17 | Download | 3D Gaussian Splatting (3DGS) rendering in real-time on resource-constrained devices is essential for delivering immersive augmented and virtual reality (AR/VR) experiences. |
| Dissecting and Re-architecting 3D NAND Flash PIM Arrays for Efficient Single-Batch Token Generation in LLMs | Yongjoo Jang, Sangwoo Hwang, Hojin Lee, Sangwoo Jung, Donghun Lee, Wonbo Shim, Jaeha Kung | 2025-11-17 | Download | The advancement of large language models has led to models with billions of parameters, significantly increasing memory and compute demands. Serving such models on conventional hardware is challenging... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ParallelKittens: Systematic and Practical Simplification of Multi-GPU AI Kernels | Stuart H. Sul, Simran Arora, Benjamin F. Spector, Christopher Ré | 2025-11-17 | Download | Inter-GPU communication has become a major bottleneck for modern AI workloads as models scale and improvements in hardware compute throughput outpace improvements in interconnect bandwidth. |
| Comparative Analysis of Large Language Model Inference Serving Systems: A Performance Study of vLLM and HuggingFace TGI | Saicharan Kolluru | 2025-11-17 | Download | The deployment of Large Language Models (LLMs) in production environments requires efficient inference serving systems that balance throughput, latency, and resource utilization. |
| Asymptotic analysis of cooperative censoring policies in sensor networks | Jesus Fernandez-Bes, Rocío Arroyo-Valles, Jesús Cid-Sueiro | 2025-11-17 | Download | The problem of cooperative data censoring in battery-powered multihop sensor networks is analyzed in this paper. We are interested in scenarios where nodes generate messages (which are related to the ... |
| Do MPI Derived Datatypes Actually Help? A Single-Node Cross-Implementation Study on Shared-Memory Communication | Temitayo Adefemi | 2025-11-17 | Download | MPI's derived datatypes (DDTs) promise easier, copy-free communication of non-contiguous data, yet their practical performance remains debated and is often reported only for a single MPI stack. |
| InfoDecom: Decomposing Information for Defending Against Privacy Leakage in Split Inference | Ruijun Deng, Zhihui Lu, Qiang Duan | 2025-11-17 | Download | Split inference (SI) enables users to access deep learning (DL) services without directly transmitting raw data. However, recent studies reveal that data reconstruction attacks (DRAs) can recover the ... |
| Distributed Hierarchical Machine Learning for Joint Resource Allocation and Slice Selection in In-Network Edge Systems | Sulaiman Muhammad Rashid, Ibrahim Aliyu, Jaehyung Park, Jinsul Kim | 2025-11-17 | Download | The Metaverse promises immersive, real-time experiences; however, meeting its stringent latency and resource demands remains a major challenge. |
| Pico-Cloud: Cloud Infrastructure for Tiny Edge Devices | Mordechai Guri | 2025-11-17 | Download | This paper introduces the Pico-Cloud, a micro-edge cloud architecture built on ultra-minimal hardware platforms such as the Raspberry Pi Zero and comparable single-board computers. |
| Learning Process Energy Profiles from Node-Level Power Data | Jonathan Bader, Julius Irion, Jannis Kappel, Joel Witzke, Niklas Fomin, Diellza Sherifi, Odej Kao | 2025-11-17 | Download | The growing demand for data center capacity, driven by the growth of high-performance computing, cloud computing, and especially artificial intelligence, has led to a sharp increase in data center ene... |
| MACKO: Sparse Matrix-Vector Multiplication for Low Sparsity | Vladimír Macko, Vladimír Boža | 2025-11-17 | Download | Sparse Matrix-Vector Multiplication (SpMV) is a fundamental operation in the inference of sparse Large Language Models (LLMs). Because existing SpMV methods perform poorly under the low and unstructur... |
| On the Fundamental Limits of LLMs at Scale | Muhammad Ahmed Mohsin, Muhammad Umer, Ahsan Bilal, Zeeshan Memon, Muhammad Ibtsaam Qadir, Sagnik Bhattacharya, Hassan Rizwan, Abhiram R. Gorle, Maahe Zehra Kazmi, Nukhba Amir, Ali Subhan, Muhammad Usman Rafique, Zihao He, Pulkit Mehta, Muhammad Ali Jamshed, John M. Cioffi | 2025-11-17 | Download | Large Language Models (LLMs) have benefited enormously from scaling, yet these gains are bounded by five fundamental limitations: (1) hallucination, (2) context compression, (3) reasoning degradation,... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| A Secure Semantic Communication System Based on Knowledge Graph | Qin Guo, Haonan Tong, Sihua Wang, Peiyuan Si, Jun Zhao, Changchuan Yin | 2025-11-17 | Download | This study proposes a novel approach to ensure the security of textual data transmission in a semantic communication system. In the proposed system, a sender transmits textual information to a receive... |
| MM-Telco: Benchmarks and Multimodal Large Language Models for Telecom Applications | Gagan Raj Gupta, Anshul Kumar, Manish Rai, Apu Chakraborty, Ashutosh Modi, Abdelaali Chaoub, Soumajit Pramanik, Moyank Giri, Yashwanth Holla, Sunny Kumar, M. V. Kiran Sooraj | 2025-11-17 | Download | Large Language Models (LLMs) have emerged as powerful tools for automating complex reasoning and decision-making tasks. In telecommunications, they hold the potential to transform network optimization... |
| Sensing and Understanding the World over Air: A Large Multimodal Model for Mobile Networks | Zhuoran Duan, Yuhao Wei, Guoshun Nan, Zijun Wang, Yan Yan, Lihua Xiong, Yuhan Ran, Ji Zhang, Jian Li, Qimei Cui, Xiaofeng Tao, Tony Q. S. Quek | 2025-11-17 | Download | Large models (LMs), such as ChatGPT, have made a significant impact across diverse domains and hold great potential to facilitate the evolution of network intelligence. |
| Distributed Self-allocated Time Slot Reuse: Multi-hop Communication in Rigid UAV Formations | Amelia Samandari, Andreas Willig, Barry Wu, Philippa Martin | 2025-11-17 | Download | Deployment of Unmanned Aerial Vehicles (UAVs) in autonomous formations necessitates accurate and timely communication of safety information. A communication protocol that supports timely and successfu... |
| Indirect Coflow Scheduling | Alexander Lindermayr, Kirk Pruhs, Andréa W. Richa, Tegan Wilson | 2025-11-17 | Download | We consider routing in reconfigurable networks, which is also known as coflow scheduling in the literature. The algorithmic literature generally (perhaps implicitly) assumes that the amount of data to... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Sharpe-Driven Stock Selection and Liquidity-Constrained Portfolio Optimization: Evidence from the Chinese Equity Market | Thanh Nguyen | 2025-11-17 | Download | This paper develops and empirically evaluates a Sharpe-driven stock selection and liquidity-constrained portfolio optimization framework designed for the Chinese equity market. |
| Talyxion: From Speculation to Optimization in Risk Managed Crypto Portfolio Allocation | Thanh Nguyen | 2025-11-17 | Download | Cryptocurrency trading has attracted tremendous attention from both retail and institutional investors. However, most traders fail to scale their assets under management due to fragile strategies that... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Enabling Heterogeneous Performance Analysis for Scientific Workloads | Maksymilian Graczyk, Vincent Desbiolles, Stefan Roiser, Andrea Guerrieri | 2025-11-17 | Download | Heterogeneous computing integrates diverse processing elements, such as CPUs, GPUs, and FPGAs, within a single system, aiming to leverage the strengths of each architecture to optimize performance and... |
| AutoSAGE: Input-Aware CUDA Scheduling for Sparse GNN Aggregation (SpMM/SDDMM) and CSR Attention | Aleksandar Stankovic | 2025-11-17 | Download | Sparse GNN aggregations (CSR SpMM/SDDMM) vary widely in performance with degree skew, feature width, and GPU micro-architecture. We present AutoSAGE, an input-aware CUDA scheduler that chooses tiling ... |
| Comparative Analysis of Large Language Model Inference Serving Systems: A Performance Study of vLLM and HuggingFace TGI | Saicharan Kolluru | 2025-11-17 | Download | The deployment of Large Language Models (LLMs) in production environments requires efficient inference serving systems that balance throughput, latency, and resource utilization. |
| Hardware optimization on Android for inference of AI models | Iulius Gherasim, Carlos García Sánchez | 2025-11-17 | Download | The pervasive integration of Artificial Intelligence models into contemporary mobile computing is notable across numerous use cases, from virtual assistants to advanced image processing. |
| Evaluation of Domain-Specific Architectures for General-Purpose Applications in Apple Silicon | Álvaro Corrochano López, Carlos García Sánchez | 2025-11-17 | Download | The rise of AI and its growing computational demands have driven the integration of domain-specific accelerators (such as GPUs, TPUs, and NPUs) across the entire computing infrastructure. |
| KForge: Program Synthesis for Diverse AI Hardware Accelerators | Taras Sereda, Tom St. John, Burak Bartan, Natalie Serrino, Sachin Katti, Zain Asgar | 2025-11-17 | Download | GPU kernels are critical for ML performance but difficult to optimize across diverse accelerators. We present KForge, a platform-agnostic framework built on two collaborative LLM-based agents: a gener... |
| Large-scale Multigrid with Adaptive Galerkin Coarsening | Fabian Böhm, Nils Kohl, Harald Köstler, Ulrich Rüde | 2025-11-17 | Download | We propose a robust, adaptive coarse-grid correction scheme for matrix-free geometric multigrid targeting PDEs with strongly varying coefficients. |