2024-06-20
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Exploring DRAM Cache Prefetching for Pooled Memory | Chandrahas Tirumalasetty, Narasimha Annapreddy | 2024-06-20 | Download | Hardware-based memory pooling enabled by interconnect standards like CXL has been gaining popularity amongst cloud providers and system integrators. |
| WAGONN: Weight Bit Agglomeration in Crossbar Arrays for Reduced Impact of Interconnect Resistance on DNN Inference Accuracy | Jeffry Victor, Dong Eun Kim, Chunguang Wang, Kaushik Roy, Sumeet Gupta | 2024-06-20 | Download | Deep neural network (DNN) accelerators employing crossbar arrays capable of in-memory computing (IMC) are highly promising for neural computing platforms. |
| Scalable and RISC-V Programmable Near-Memory Computing Architectures for Edge Nodes | Michele Caon, Clément Choné, Pasquale Davide Schiavone, Alexandre Levisse, Guido Masera, Maurizio Martina, David Atienza | 2024-06-20 | Download | The widespread adoption of data-centric algorithms, particularly Artificial Intelligence (AI) and Machine Learning (ML), has exposed the limitations of centralized processing infrastructures, driving ... |
| COOK Access Control on an embedded Volta GPU | Benjamin Lesage, Frédéric Boniol, Claire Pagetti | 2024-06-20 | Download | The last decade has seen the emergence of a new generation of multi-core in response to advances in machine learning, and in particular Deep Neural Network (DNN) training and inference tasks. |
| AMC: Access to Miss Correlation Prefetcher for Evolving Graph Analytics | Abhishek Singh, Christian Schulte, Xiaochen Guo | 2024-06-20 | Download | Modern memory hierarchies work well with applications that have good spatial locality. Evolving (dynamic) graphs are important applications widely used to model graphs and networks with edge and verte... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Suki: Choreographed Distributed Dataflow in Rust | Shadaj Laddad, Alvin Cheung, Joseph M. Hellerstein | 2024-06-20 | Download | Programming models for distributed dataflow have long focused on analytical workloads that allow the runtime to dynamically place and schedule compute logic. |
| Vahana.jl -- A framework (not only) for large-scale agent-based models | Steffen Fürst, Tim Conrad, Carlo Jaeger, Sarah Wolf | 2024-06-20 | Download | Agent-based models (ABMs) offer a powerful framework for understanding complex systems. However, their computational demands often become a significant barrier as the number of agents and complexity o... |
| CascadeServe: Unlocking Model Cascades for Inference Serving | Ferdi Kossmann, Ziniu Wu, Alex Turk, Nesime Tatbul, Lei Cao, Samuel Madden | 2024-06-20 | Download | Machine learning (ML) models are increasingly deployed to production, calling for efficient inference serving systems. Efficient inference serving is complicated by two challenges: (i) ML models incur... |
| Communication-efficient Vertical Federated Learning via Compressed Error Feedback | Pedro Valdeira, João Xavier, Cláudia Soares, Yuejie Chi | 2024-06-20 | Download | Communication overhead is a known bottleneck in federated learning (FL). To address this, lossy compression is commonly used on the information communicated between the server and clients during train... |
| Safety-Critical Edge Robotics Architecture with Bounded End-to-End Latency | Gautam Gala, Tilmann Unte, Luiz Maia, Johannes Kühbacher, Isser Kadusale, Mohammad Ibrahim Alkoudsi, Gerhard Fohler, Sebastian Altmeyer | 2024-06-20 | Download | Edge computing processes data near its source, reducing latency and enhancing security compared to traditional cloud computing while providing its benefits. |
| AI-coupled HPC Workflow Applications, Middleware and Performance | Wes Brewer, Ana Gainaru, Frédéric Suter, Feiyi Wang, Murali Emani, Shantenu Jha | 2024-06-20 | Download | AI integration is revolutionizing the landscape of HPC simulations, enhancing the importance, use, and performance of AI-driven HPC workflows. |
| NAC-QFL: Noise Aware Clustered Quantum Federated Learning | Himanshu Sahu, Hari Prabhat Gupta | 2024-06-20 | Download | Recent advancements in quantum computing, alongside successful deployments of quantum communication, hold promises for revolutionizing mobile networks. |
| Failure-Resilient Distributed Inference with Model Compression over Heterogeneous Edge Devices | Li Wang, Liang Li, Lianming Xu, Xian Peng, Aiguo Fei | 2024-06-20 | Download | The distributed inference paradigm enables the computation workload to be distributed across multiple devices, facilitating the implementations of deep learning based intelligent services on extremely... |
| ReaL: Efficient RLHF Training of Large Language Models with Parameter Reallocation | Zhiyu Mei, Wei Fu, Kaiwei Li, Guangju Wang, Huanchen Zhang, Yi Wu | 2024-06-20 | Download | Reinforcement Learning from Human Feedback (RLHF) is a pivotal technique for empowering large language model (LLM) applications. Compared with the supervised training process of LLMs, the RLHF trainin... |
| Reducing Memory Contention and I/O Congestion for Disk-based GNN Training | Qisheng Jiang, Lei Jia, Chundong Wang | 2024-06-20 | Download | Graph neural networks (GNNs) gain wide popularity. Large graphs with high-dimensional features become common and training GNNs on them is non-trivial on an ordinary machine. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Adaptive Compression of Massive MIMO Channel State Information with Deep Learning | Faris B. Mismar, Aliye Özge Kaya | 2024-06-20 | Download | This paper proposes the use of deep autoencoders to compress the channel information in a massive multiple input and multiple output (MIMO) system. |
| QuIP: A P4 Quantum Internet Protocol Prototyping Framework | Wojciech Kozlowski, Fernando A. Kuipers, Rob Smets, Belma Turkovic | 2024-06-20 | Download | Quantum entanglement is so fundamentally different from a network packet that several quantum network stacks have been proposed; one of which has even been experimentally demonstrated. |
| Age of Information Versions: a Semantic View of Markov Source Monitoring | Mehrdad Salimnejad, Marios Kountouris, Anthony Ephremides, Nikolaos Pappas | 2024-06-20 | Download | We consider the problem of real-time remote monitoring of a two-state Markov process, where a sensor observes the state of the source and makes a decision on whether to transmit the status updates ove... |
| Online Learning of Weakly Coupled MDP Policies for Load Balancing and Auto Scaling | S. R. Eshwar, Lucas Lopes Felipe, Alexandre Reiffers-Masson, Daniel Sadoc Menasché, Gugan Thoppe | 2024-06-20 | Download | Load balancing and auto scaling are at the core of scalable, contemporary systems, addressing dynamic resource allocation and service rate adjustments in response to workload changes. |
| Leveraging eBPF and AI for Ransomware Nose Out | Arjun Sekar, Sameer G. Kulkarni, Joy Kuri | 2024-06-20 | Download | In this work, we propose a two-phased approach for real-time detection and deterrence of ransomware. To achieve this, we leverage the capabilities of eBPF (Extended Berkeley Packet Filter) and artific... |
| Hierarchical Micro-Segmentations for Zero-Trust Services via Large Language Model (LLM)-enhanced Graph Diffusion | Yinqiu Liu, Guangyuan Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim, Xuemin Shen | 2024-06-20 | Download | In the rapidly evolving Next-Generation Networking (NGN) era, the adoption of zero-trust architectures has become increasingly crucial to protect security. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| CEBench: A Benchmarking Toolkit for the Cost-Effectiveness of LLM Pipelines | Wenbo Sun, Jiaqi Wang, Qiming Guo, Ziyu Li, Wenlu Wang, Rihan Hai | 2024-06-20 | Download | Online Large Language Model (LLM) services such as ChatGPT and Claude 3 have transformed business operations and academic research by effortlessly enabling new opportunities. |
| Queen: A quick, scalable, and comprehensive quantum circuit simulation for supercomputing | Chuan-Chi Wang, Yu-Cheng Lin, Yan-Jie Wang, Chia-Heng Tu, Shih-Hao Hung | 2024-06-20 | Download | The state vector-based simulation offers a convenient approach to developing and validating quantum algorithms with noise-free results. However, limited by the absence of cache-aware implementations a... |
| TurboSpec: Closed-loop Speculation Control System for Optimizing LLM Serving Goodput | Xiaoxuan Liu, Jongseok Park, Langxiang Hu, Woosuk Kwon, Zhuohan Li, Chen Zhang, Kuntai Du, Xiangxi Mo, Kaichao You, Alvin Cheung, Zhijie Deng, Ion Stoica, Hao Zhang | 2024-06-20 | Download | Large Language Model (LLM) serving systems batch concurrent user requests to achieve efficient serving. However, in real-world deployments, such inter-request parallelism from batching is often limite... |