2025-06-17
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Scaling Intelligence: Designing Data Centers for Next-Gen Language Models | Jesmin Jahan Tithi, Hanjiang Wu, Avishaii Abuhatzera, Fabrizio Petrini | 2025-06-17 | Download | The explosive growth of Large Language Models (LLMs), such as GPT-4 with 1.8 trillion parameters, demands a fundamental rethinking of data center architecture to ensure scalability, efficiency, and co... |
| ASAP-FE: Energy-Efficient Feature Extraction Enabling Multi-Channel Keyword Spotting on Edge Processors | Jongin Choi, Jina Park, Woojoo Lee, Jae-Jin Lee, Massoud Pedram | 2025-06-17 | Download | Multi-channel keyword spotting (KWS) has become crucial for voice-based applications in edge environments. However, its substantial computational and energy requirements pose significant challenges. |
| Guaranteed Guess: A Language Modeling Approach for CISC-to-RISC Transpilation with Testing Guarantees | Ahmed Heakl, Sarim Hashmi, Chaimaa Abi, Celine Lee, Abdulrahman Mahmoud | 2025-06-17 | Download | The hardware ecosystem is rapidly evolving, with increasing interest in translating low-level programs across different instruction set architectures (ISAs) in a quick, flexible, and correct way to en... |
| Empirically-Calibrated H100 Node Power Models for Reducing Uncertainty in AI Training Energy Estimation | Alex C. Newkirk, Jared Fernandez, Jonathan Koomey, Imran Latif, Emma Strubell, Arman Shehabi, Constantine Samaras | 2025-06-17 | Download | As AI's energy demand continues to grow, it is critical to enhance the understanding of characteristics of this demand, to improve grid infrastructure planning and environmental assessment. |
| Tensor Manipulation Unit (TMU): Reconfigurable, Near-Memory Tensor Manipulation for High-Throughput AI SoC | Weiyu Zhou, Zheng Wang, Chao Chen, Yike Li, Yongkui Yang, Zhuoyu Wu, Anupam Chattopadhyay | 2025-06-17 | Download | While recent advances in AI SoC design have focused heavily on accelerating tensor computation, the equally critical task of tensor manipulation, centered on high-volume data movement with minimal com... |
| Comprehensive Verilog Design Problems: A Next-Generation Benchmark Dataset for Evaluating Large Language Models and Agents on RTL Design and Verification | Nathaniel Pinckney, Chenhui Deng, Chia-Tung Ho, Yun-Da Tsai, Mingjie Liu, Wenfei Zhou, Brucek Khailany, Haoxing Ren | 2025-06-17 | Download | We present the Comprehensive Verilog Design Problems (CVDP) benchmark, a new dataset and infrastructure to advance LLM and agent research in hardware design and verification. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Scaling Intelligence: Designing Data Centers for Next-Gen Language Models | Jesmin Jahan Tithi, Hanjiang Wu, Avishaii Abuhatzera, Fabrizio Petrini | 2025-06-17 | Download | The explosive growth of Large Language Models (LLMs), such as GPT-4 with 1.8 trillion parameters, demands a fundamental rethinking of data center architecture to ensure scalability, efficiency, and co... |
| Zarr-Based Chunk-Level Cumulative Sums in Reduced Dimensions | Hailiang Zhang, Dieu My T. Nguyen, Christine Smit, Mahabal Hegde | 2025-06-17 | Download | Data analysis on massive multi-dimensional data, such as high-resolution large-region time averaging or area averaging for geospatial data, often involves calculations over a significant number of dat... |
| Utility-Driven Speculative Decoding for Mixture-of-Experts | Anish Saxena, Po-An Tsai, Hritvik Taneja, Aamer Jaleel, Moinuddin Qureshi | 2025-06-17 | Download | GPU memory bandwidth is the main bottleneck for low-latency Large Language Model (LLM) inference. Speculative decoding leverages idle GPU compute by using a lightweight drafter to propose K tokens, wh... |
| Event-Driven Online Vertical Federated Learning | Ganyu Wang, Boyu Wang, Bin Gu, Charles Ling | 2025-06-17 | Download | Online learning is more adaptable to real-world scenarios in Vertical Federated Learning (VFL) compared to offline learning. However, integrating online learning into VFL presents challenges due to th... |
| Scalable GPU Performance Variability Analysis framework | Ankur Lahiry, Ayush Pokharel, Seth Ockerman, Amal Gueroudji, Line Pouchard, Tanzima Z. Islam | 2025-06-17 | Download | Analyzing large-scale performance logs from GPU profilers often requires terabytes of memory and hours of runtime, even for basic summaries. These constraints prevent timely insight and hinder the int... |
| Resource Optimization with MPI Process Malleability for Dynamic Workloads in HPC Clusters | Sergio Iserte, Iker Martín-Álvarez, Krzysztof Rojek, José I. Aliaga, Maribel Castillo, Weronika Folwarska, Antonio J. Peña | 2025-06-17 | Download | Dynamic resource management is essential for optimizing computational efficiency in modern high-performance computing (HPC) environments, particularly as systems scale. |
| SETI@home: Data Acquisition and Front-End Processing | Eric J. Korpela, David P. Anderson, Jeff Cobb, Matt Lebofsky, Wei Liu, Dan Werthimer | 2025-06-17 | Download | SETI@home is a radio Search for Extraterrestrial Intelligence (SETI) project, looking for technosignatures in data recorded at multiple observatories from 1998 to 2020. |
| ClusterRCA: An End-to-End Approach for Network Fault Localization and Classification for HPC System | Yongqian Sun, Xijie Pan, Xiao Xiong, Lei Tao, Jiaju Wang, Shenglin Zhang, Yuan Yuan, Yuqi Li, Kunlin Jian | 2025-06-17 | Download | Network failure diagnosis is challenging yet critical for high-performance computing (HPC) systems. Existing methods cannot be directly applied to HPC scenarios due to data heterogeneity and lack of a... |
| Keigo: Co-designing Log-Structured Merge Key-Value Stores with a Non-Volatile, Concurrency-aware Storage Hierarchy (Extended Version) | Rúben Adão, Zhongjie Wu, Changjun Zhou, Oana Balmau, João Paulo, Ricardo Macedo | 2025-06-17 | Download | We present Keigo, a concurrency- and workload-aware storage middleware that enhances the performance of log-structured merge key-value stores (LSM KVS) when they are deployed on a hierarchy of storage... |
| Concepts for designing modern C++ interfaces for MPI | C. Nicole Avans, Alfredo A. Correa, Sayan Ghosh, Matthias Schimek, Joseph Schuchart, Anthony Skjellum, Evan D. Suggs, Tim Niklas Uhl | 2025-06-17 | Download | Since the C++ bindings were deleted in 2008, the Message Passing Interface (MPI) community has revived efforts in building high-level modern C++ interfaces. |
| Consensus Power Inequality: A Comparative Study of Blockchain Networks | Kamil Tylinski, Abylay Satybaldy, Paolo Tasca | 2025-06-17 | Download | The distribution of consensus power is a cornerstone of decentralisation, influencing the security, resilience, and fairness of blockchain networks while ensuring equitable impact among participants. |
| Convergence-Privacy-Fairness Trade-Off in Personalized Federated Learning | Xiyu Zhao, Qimei Cui, Weicai Li, Wei Ni, Ekram Hossain, Quan Z. Sheng, Xiaofeng Tao, Ping Zhang | 2025-06-17 | Download | Personalized federated learning (PFL), e.g., the renowned Ditto, strikes a balance between personalization and generalization by conducting federated learning (FL) to guide personalized learning (PL). |
| A Novel Indicator for Quantifying and Minimizing Information Utility Loss of Robot Teams | Xiyu Zhao, Qimei Cui, Wei Ni, Quan Z. Sheng, Abbas Jamalipour, Guoshun Nan, Xiaofeng Tao, Ping Zhang | 2025-06-17 | Download | The timely exchange of information among robots within a team is vital, but it can be constrained by limited wireless capacity. The inability to deliver information promptly can result in estimation e... |
| The Redundancy of Full Nodes in Bitcoin: A Network-Theoretic Demonstration of Miner-Centric Propagation Topologies | Dr Craig S Wright | 2025-06-17 | Download | This paper formally examines the network structure of Bitcoin CORE (BTC) and Bitcoin Satoshi Vision (BSV) using complex graph theory to demonstrate that home-hosted full nodes are incapable of partici... |
| Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents | Qizheng Zhang, Michael Wornow, Gerry Wan, Kunle Olukotun | 2025-06-17 | Download | LLM-based agent applications have shown increasingly remarkable capabilities in complex workflows but incur substantial costs and latency due to extensive planning and reasoning requirements. |
| Efficient Serving of LLM Applications with Probabilistic Demand Modeling | Yifei Liu, Zuo Gan, Zhenghao Gan, Weiye Wang, Chen Chen, Yizhou Shan, Xusheng Chen, Zhenhua Han, Yifei Zhu, Shixuan Sun, Minyi Guo | 2025-06-17 | Download | Applications based on Large Language Models (LLMs) contain a series of tasks to address real-world problems with boosted capability, which have dynamic demand volumes on diverse backends. |
| Déjà Vu: Efficient Video-Language Query Engine with Learning-based Inter-Frame Computation Reuse | Jinwoo Hwang, Daeun Kim, Sangyeop Lee, Yoonsung Kim, Guseul Heo, Hojoon Kim, Yunseok Jeong, Tadiwos Meaza, Eunhyeok Park, Jeongseob Ahn, Jongse Park | 2025-06-17 | Download | Recently, Video-Language Models (VideoLMs) have demonstrated remarkable capabilities, offering significant potential for flexible and powerful video query systems. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| GCN-Driven Reinforcement Learning for Probabilistic Real-Time Guarantees in Industrial URLLC | Eman Alqudah, Ashfaq Khokhar | 2025-06-17 | Download | Ensuring packet-level communication quality is vital for ultra-reliable, low-latency communications (URLLC) in large-scale industrial wireless networks. |
| CNN-Enabled Scheduling for Probabilistic Real-Time Guarantees in Industrial URLLC | Eman Alqudah, Ashfaq Khokhar | 2025-06-17 | Download | Ensuring packet-level communication quality is vital for ultra-reliable, low-latency communications (URLLC) in large-scale industrial wireless networks. |
| Determinação Automática de Limiar de Detecção de Ataques em Redes de Computadores Utilizando Autoencoders | Luan Gonçalves Miranda, Pedro Ivo da Cruz, Murilo Bellezoni Loiola | 2025-06-17 | Download | Currently, digital security mechanisms like Anomaly Detection Systems using Autoencoders (AE) show great potential for bypassing problems intrinsic to the data, such as data imbalance. |
| Vulnerability Disclosure or Notification? Best Practices for Reaching Stakeholders at Scale | Ting-Han Chen, Jeroen van der Ham-de Vos | 2025-06-17 | Download | Security researchers are interested in security vulnerabilities, but these security vulnerabilities create risks for stakeholders. Coordinated Vulnerability Disclosure has been an accepted best practi... |
| A Novel Dynamic Bandwidth Allocation Design for 100G Coherent Passive Optical Network | Rujia Zou, Haipeng Zhang, Karthik Sundaresan, Zhensheng Jia, Suresh Subramaniam | 2025-06-17 | Download | With the rapid advancements in coherent Passive Optical Network (PON) technologies featuring 100G and higher data rates, this paper addresses the urgent requirement for sophisticated simulation and MA... |
| Optimizing System Latency for Blockchain-Encrypted Edge Computing in Internet of Vehicles | Cui Zhang, Maoxin Ji, Qiong Wu, Pingyi Fan, Qiang Fan | 2025-06-17 | Download | As Internet of Vehicles (IoV) technology continues to advance, edge computing has become an important tool for assisting vehicles in handling complex tasks. |
| TraGe: A Generic Packet Representation for Traffic Classification Based on Header-Payload Differences | Chungang Lin, Yilong Jiang, Weiyao Zhang, Xuying Meng, Tianyu Zuo, Yujun Zhang | 2025-06-17 | Download | Traffic classification has a significant impact on maintaining the Quality of Service (QoS) of the network. Since traditional methods heavily rely on feature extraction and large scale labeled data, s... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Scaling Intelligence: Designing Data Centers for Next-Gen Language Models | Jesmin Jahan Tithi, Hanjiang Wu, Avishaii Abuhatzera, Fabrizio Petrini | 2025-06-17 | Download | The explosive growth of Large Language Models (LLMs), such as GPT-4 with 1.8 trillion parameters, demands a fundamental rethinking of data center architecture to ensure scalability, efficiency, and co... |
| Determinação Automática de Limiar de Detecção de Ataques em Redes de Computadores Utilizando Autoencoders | Luan Gonçalves Miranda, Pedro Ivo da Cruz, Murilo Bellezoni Loiola | 2025-06-17 | Download | Currently, digital security mechanisms like Anomaly Detection Systems using Autoencoders (AE) show great potential for bypassing problems intrinsic to the data, such as data imbalance. |
| Scalable GPU Performance Variability Analysis framework | Ankur Lahiry, Ayush Pokharel, Seth Ockerman, Amal Gueroudji, Line Pouchard, Tanzima Z. Islam | 2025-06-17 | Download | Analyzing large-scale performance logs from GPU profilers often requires terabytes of memory and hours of runtime, even for basic summaries. These constraints prevent timely insight and hinder the int... |
| Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents | Qizheng Zhang, Michael Wornow, Gerry Wan, Kunle Olukotun | 2025-06-17 | Download | LLM-based agent applications have shown increasingly remarkable capabilities in complex workflows but incur substantial costs and latency due to extensive planning and reasoning requirements. |