2025-03-14
cs.AR - Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| ARCAS: Adaptive Runtime System for Chiplet-Aware Scheduling | Alessandro Fogli, Bo Zhao, Peter Pietzuch, Jana Giceva | 2025-03-14 | The growing disparity between CPU core counts and available memory bandwidth has intensified memory contention in servers. This particularly affects highly parallelizable applications, which must achi... |
| Cost-effective Deep Learning Infrastructure with NVIDIA GPU | Aatiz Ghimire, Shahnawaz Alam, Siman Giri, Madhav Prasad Ghimire | 2025-03-14 | The growing demand for computational power is driven by advancements in deep learning, the increasing need for big data processing, and the requirements of scientific simulations for academic and rese... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Story of Two GPUs: Characterizing the Resilience of Hopper H100 and Ampere A100 GPUs | Shengkun Cui, Archit Patke, Hung Nguyen, Aditya Ranjan, Ziheng Chen, Phuong Cao, Gregory Bauer, Brett Bode, Catello Di Martino, Saurabh Jha, Chandra Narayanaswami, Daby Sow, Zbigniew T. Kalbarczyk, Ravishankar K. Iyer | 2025-03-14 | This study characterizes GPU resilience in Delta, a large-scale AI system that consists of 1,056 A100 and H100 GPUs, with over 1,300 petaflops of peak throughput. We used 2... |
| Performance Analysis of Decentralized Federated Learning Deployments | Chengyan Jiang, Jiamin Fan, Talal Halabi, Israat Haque | 2025-03-14 | The widespread adoption of smartphones and smart wearable devices has led to the widespread use of Centralized Federated Learning (CFL) for training powerful machine learning models while preserving d... |
| Supervised Distributed Computing | John Augustine, Christian Scheideler, Julian Werthmann | 2025-03-14 | We introduce a new framework for distributed computing that extends and refines the standard master-worker approach of scheduling multi-threaded computations. |
| Finding a Fair Scoring Function for Top-k Selection: From Hardness to Practice | Guangya Cai | 2025-03-14 | Selecting a subset of the "best" items from a dataset of items, based on a scoring function, is a key task in decision-making. Given the rise of automated decision-making software, it is impor... |
| ARCAS: Adaptive Runtime System for Chiplet-Aware Scheduling | Alessandro Fogli, Bo Zhao, Peter Pietzuch, Jana Giceva | 2025-03-14 | The growing disparity between CPU core counts and available memory bandwidth has intensified memory contention in servers. This particularly affects highly parallelizable applications, which must achi... |
| On the Limits of Distributed Quantum Computing | Francesco d'Amore | 2025-03-14 | Quantum advantage is well-established in centralized computing, where quantum algorithms can solve certain problems exponentially faster than classical ones. |
| Efficient Distributed MLLM Training with Cornstarch | Insu Jang, Runyu Lu, Nikhil Bansal, Ang Chen, Mosharaf Chowdhury | 2025-03-14 | Multimodal large language models (MLLMs) extend the capabilities of large language models (LLMs) by combining heterogeneous model architectures to handle diverse modalities like images and audio. |
| Towards Fine-Grained Scalability for Stateful Stream Processing Systems | Yunfan Qing, Wenli Zheng | 2025-03-14 | Dynamic scaling is critical to stream processing engines, as their long-running nature demands adaptive resource management. Existing scaling approaches easily cause performance degradation due to coa... |
| Federated Koopman-Reservoir Learning for Large-Scale Multivariate Time-Series Anomaly Detection | Long Tan Le, Tung-Anh Nguyen, Han Shu, Suranga Seneviratne, Choong Seon Hong, Nguyen H. Tran | 2025-03-14 | The proliferation of edge devices has dramatically increased the generation of multivariate time-series (MVTS) data, essential for applications from healthcare to smart cities. |
| Cost-effective Deep Learning Infrastructure with NVIDIA GPU | Aatiz Ghimire, Shahnawaz Alam, Siman Giri, Madhav Prasad Ghimire | 2025-03-14 | The growing demand for computational power is driven by advancements in deep learning, the increasing need for big data processing, and the requirements of scientific simulations for academic and rese... |
| LLMPerf: GPU Performance Modeling meets Large Language Models | Khoi N. M. Nguyen, Hoang Duy Nguyen Do, Huyen Thao Le, Thanh Tuan Dao | 2025-03-14 | Performance modeling, a pivotal domain in program cost analysis, currently relies on manually crafted models constrained by various program and hardware limitations, especially in the intricate landsc... |
| The Case for ABI Interoperability in a Fault Tolerant MPI | Yao Xu, Grace Nansamba, Anthony Skjellum, Gene Cooperman | 2025-03-14 | There is new momentum behind an interoperable ABI for MPI, which will be a major component of MPI-5. This capability brings true separation of concerns to a running MPI computation. |
| Sustainable Grid through Distributed Data Centers: Spinning AI Demand for Grid Stabilization and Optimization | Scott C Evans, Nathan Dahlin, Ibrahima Ndiaye, Sachini Piyoni Ekanayake, Alexander Duncan, Blake Rose, Hao Huang | 2025-03-14 | We propose a disruptive paradigm to actively place and schedule TWhrs of parallel AI jobs strategically on the grid, at distributed, grid-aware high performance compute data centers (HPC) capable of u... |
| SmartShards: Churn-Tolerant Continuously Available Distributed Ledger | Joseph Oglio, Mikhail Nesterenko, Gokarna Sharma | 2025-03-14 | We present SmartShards: a new sharding algorithm for improving Byzantine tolerance and churn resistance in blockchains. Our algorithm places a peer in multiple shards to create an overlap. |
| Beyond A Single AI Cluster: A Survey of Decentralized LLM Training | Haotian Dong, Jingyan Jiang, Rongwei Lu, Jiajun Luo, Jiajun Song, Bowen Li, Ying Shen, Zhi Wang | 2025-03-14 | The emergence of large language models (LLMs) has revolutionized AI development, yet the resource demands beyond a single cluster or even datacenter, limiting accessibility to well-resourced organizat... |
| Power-Aware Scheduling for Multi-Center HPC Electricity Cost Optimization | Abrar Hossain, Abubeker Abdurahman, Mohammad A. Islam, Kishwar Ahmed | 2025-03-14 | This paper introduces TARDIS (Temporal Allocation for Resource Distribution using Intelligent Scheduling), a novel power-aware job scheduler for High-Performance Computing (HPC) systems that minimizes... |
| FedOSAA: Improving Federated Learning with One-Step Anderson Acceleration | Xue Feng, M. Paul Laiu, Thomas Strohmer | 2025-03-14 | Federated learning (FL) is a distributed machine learning approach that enables multiple local clients and a central server to collaboratively train a model while keeping the data on their own devices... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Performance Analysis of Decentralized Federated Learning Deployments | Chengyan Jiang, Jiamin Fan, Talal Halabi, Israat Haque | 2025-03-14 | The widespread adoption of smartphones and smart wearable devices has led to the widespread use of Centralized Federated Learning (CFL) for training powerful machine learning models while preserving d... |
| Enhancing Resiliency of Sketch-based Security via LSB Sharing-based Dynamic Late Merging | Seungsam Yang, Seyed Mohammad Mehdi Mirnajafizadeh, Sian Kim, Rhongho Jang, DaeHun Nyang | 2025-03-14 | With the exponentially growing Internet traffic, sketch data structure with a probabilistic algorithm has been expected to be an alternative solution for non-compromised (non-selective) security monit... |
| Scalable Video Conferencing Using SDN Principles | Oliver Michel, Satadal Sengupta, Hyojoon Kim, Ravi Netravali, Jennifer Rexford | 2025-03-14 | Video-conferencing applications face an unwavering surge in traffic, stressing their underlying infrastructure in unprecedented ways. This paper rethinks the key building block for conferencing infras... |
| Experimental evaluation of xApp Conflict Mitigation Framework in O-RAN: Insights from Testbed deployment in OTIC | Abida Sultana, Cezary Adamczyk, Mayukh Roy Chowdhury, Adrian Kliks, Aloizio Da Silva | 2025-03-14 | Conflict Mitigation (CM) in Open Radio Access Network (O-RAN) is a topic that is gaining importance as commercial O-RAN deployments become more complex. |
| PassiveBLE: Towards Fully Commodity-Compatible BLE Backscatter | Huixin Dong, Yijie Wu, Feiyu Li, Wei Kuang, Yuan He, Qian Zhang, Wei Wang | 2025-03-14 | Bluetooth Low Energy (BLE) backscatter is a promising candidate for battery-free Internet of Things (IoT) applications. Unlike existing commodity-level BLE backscatter systems that only enable one-sho... |
| Optimizing 6G Dense Network Deployment for the Metaverse Using Deep Reinforcement Learning | Jie Zhang, Swarna Chetty, Qiao Wang, Chenrui Sun, Paul Daniel Mitchell, David Grace, Hamed Ahmadi | 2025-03-14 | As the Metaverse envisions deeply immersive and pervasive connectivity in 6G networks, Integrated Access and Backhaul (IAB) emerges as a critical enabler to meet the demanding requirements of massive ... |
| Sketch Disaggregation Across Time and Space | Jonatan Langlet, Peiqing Chen, Michael Mitzenmacher, Ran Ben Basat, Zaoxing Liu, Gianni Antichi | 2025-03-14 | Streaming analytics are essential in a large range of applications, including databases, networking, and machine learning. To optimize performance, practitioners are increasingly offloading such analy... |
| Non Line-of-Sight Optical Wireless Communication using Neuromorphic Cameras | Abbaas Alif Mohamed Nishar, Alireza Marefat, Ashwin Ashok | 2025-03-14 | Neuromorphic or event cameras, inspired by biological vision systems, capture changes in illumination with high temporal resolution and efficiency, producing streams of events rather than traditional ... |
| Reliable and Cost-Efficient IoT Connectivity for Smart Agriculture: A Comparative Study of LPWAN, 5G, and Hybrid Connectivity Models | Mohamed Shabeer Mohamed Rafi, Mehran Behjati, Ahmad Sahban Rafsanjani | 2025-03-14 | The integration of the Internet of Things (IoT) in smart agriculture has transformed farming practices by enabling real time monitoring, data-driven decision making, and automation. |
cs.OS - Operating Systems
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Cerebrum (AIOS SDK): A Platform for Agent Development, Deployment, Distribution, and Discovery | Balaji Rama, Kai Mei, Yongfeng Zhang | 2025-03-14 | Autonomous LLM-based agents have emerged as a powerful paradigm for complex task execution, yet the field lacks standardized tools for development, deployment, distribution and discovery of agents. |
cs.PF - Performance
| Title | Authors | Published | Abstract |
|---|---|---|---|
| ARCAS: Adaptive Runtime System for Chiplet-Aware Scheduling | Alessandro Fogli, Bo Zhao, Peter Pietzuch, Jana Giceva | 2025-03-14 | The growing disparity between CPU core counts and available memory bandwidth has intensified memory contention in servers. This particularly affects highly parallelizable applications, which must achi... |
| LLMPerf: GPU Performance Modeling meets Large Language Models | Khoi N. M. Nguyen, Hoang Duy Nguyen Do, Huyen Thao Le, Thanh Tuan Dao | 2025-03-14 | Performance modeling, a pivotal domain in program cost analysis, currently relies on manually crafted models constrained by various program and hardware limitations, especially in the intricate landsc... |