2025-06-12
cs.AR - Hardware Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| A4: Microarchitecture-Aware LLC Management for Datacenter Servers with Emerging I/O Devices | Haneul Park, Jiaqi Lou, Sangjin Lee, Yifan Yuan, Kyoung Soo Park, Yongseok Son, Ipoom Jeong, Nam Sung Kim | 2025-06-12 | In modern server CPUs, the Last-Level Cache (LLC) serves not only as a victim cache for higher-level private caches but also as a buffer for low-latency DMA transfers between CPU cores and I/O devices... |
| Design and Implementation of Washing Machine HUD Using FPGAs | Norman Stites, D. G. Perera | 2025-06-12 | In contemporary digital design education, practical field programmable gate array (FPGA) projects are indispensable for bridging theoretical concepts with real-world applications. |
| Controlling quantum chaos via Parrondo strategies on noisy intermediate-scale quantum hardware | Aditi Rath, Dinesh Kumar Panda, Colin Benjamin | 2025-06-12 | Advancements in Noisy Intermediate-Scale Quantum (NISQ) computing are steadily pushing these systems toward outperforming classical supercomputers on specific, well-defined computational tasks. |
| MARS: Processing-In-Memory Acceleration of Raw Signal Genome Analysis Inside the Storage Subsystem | Melina Soysal, Konstantina Koliogeorgi, Can Firtina, Nika Mansouri Ghiasi, Rakesh Nadig, Haiyu Mao, Geraldo F. Oliveira, Yu Liang, Klea Zambaku, Mohammad Sadrosadati, Onur Mutlu | 2025-06-12 | Raw signal genome analysis (RSGA) has emerged as a promising approach to enable real-time genome analysis by directly analyzing raw electrical signals. |
| Towards Zero-Stall Matrix Multiplication on Energy-Efficient RISC-V Clusters for Machine Learning Acceleration | Luca Colagrande, Lorenzo Leone, Maximilian Coco, Andrei Deaconeasa, Luca Benini | 2025-06-12 | The growing computational demands of machine learning (ML) workloads have driven the design of ML accelerators aiming at an optimal tradeoff between efficiency and flexibility. |
| Scalable Software Testing in Fast Virtual Platforms: Leveraging SystemC, QEMU and Containerization | Lukas Jünger, Jan Henrik Weinstock, Tim Kraus | 2025-06-12 | The ever-increasing complexity of HW/SW systems presents a persistent challenge, particularly in safety-critical domains like automotive, where extensive testing is imperative. |
| EasyDRAM: An FPGA-based Infrastructure for Fast and Accurate End-to-End Evaluation of Emerging DRAM Techniques | Oğuzhan Canpolat, Ataberk Olgun, David Novo, Oğuz Ergin, Onur Mutlu | 2025-06-12 | DRAM is a critical component of modern computing systems. Recent works propose numerous techniques (that we call DRAM techniques) to enhance DRAM-based computing systems' throughput, reliability, and ... |
| CarbonSet: A Dataset to Analyze Trends and Benchmark the Sustainability of CPUs and GPUs | Jiajun Hu, Chetan Choppali Sudarshan, Vidya A. Chhabria, Aman Arora | 2025-06-12 | Over the years, the chip industry has consistently developed high-performance processors to address the increasing demands across diverse applications. |
| FloorPlan-DeepSeek (FPDS): A multimodal approach to floorplan generation using vector-based next room prediction | Jun Yin, Pengyu Zeng, Jing Zhong, Peilin Li, Miao Zhang, Ran Luo, Shuai Lu | 2025-06-12 | In the architectural design process, floor plan generation is inherently progressive and iterative. However, existing generative models for floor plans are predominantly end-to-end generation that pro... |
| Synchronization for Fault-Tolerant Quantum Computers | Satvik Maurya, Swamit Tannu | 2025-06-12 | Quantum Error Correction (QEC) codes store information reliably in logical qubits by encoding them in a larger number of less reliable qubits. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Abstract |
|---|---|---|---|
| SwiftSpec: Ultra-Low Latency LLM Decoding by Scaling Asynchronous Speculative Decoding | Ziyi Zhang, Ziheng Jiang, Chengquan Jiang, Menghan Yu, Size Zheng, Haibin Lin, Henry Hoffmann, Xin Liu | 2025-06-12 | Low-latency decoding for large language models (LLMs) is crucial for applications like chatbots and code assistants, yet generating long outputs remains slow in single-query settings. |
| Adaptive Job Scheduling in Quantum Clouds Using Reinforcement Learning | Waylon Luo, Jiapeng Zhao, Tong Zhan, Qiang Guan | 2025-06-12 | Present-day quantum systems face critical bottlenecks, including limited qubit counts, brief coherence intervals, and high susceptibility to errors, all of which obstruct the execution of large and com... |
| The Impact of Partial Computations on the Red-Blue Pebble Game | Pál András Papp, Aleksandros Sobczyk, A. N. Yzelman | 2025-06-12 | We study an extension of the well-known red-blue pebble game (RBP) with partial computation steps, inspired by the recent work of Sobczyk. While the original RBP assumes that we need to have all the i... |
| Faster CONGEST Approximation Algorithms for Maximum Weighted Independent Set in Sparse Graphs | Salwa Faour, Fabian Kuhn | 2025-06-12 | The maximum independent set problem is a classic optimization problem that has also been studied quite intensively in the distributed setting. |
| Towards Sustainable Computing: Exploring Energy Consumption Efficiency of Alternative Configurations and Workloads in an Open Source Messaging System | Maria Voreakou, George Kousiouris, Mara Nikolaidou | 2025-06-12 | Energy consumption in current large scale computing infrastructures is becoming a critical issue, especially with the growing demand for centralized systems such as cloud environments. |
| Deployment of Containerized Simulations in an API-Driven Distributed Infrastructure | Tim Kraus, Axel Sauer, Ingo Feldner | 2025-06-12 | The increasingly dynamic market for embedded systems makes virtual prototypes an indispensable tool for hardware/software codesign. The broad acceptance of the methodology has led to a diverse range o... |
| Graph-based Gossiping for Communication Efficiency in Decentralized Federated Learning | Huong Nguyen, Hong-Tri Nguyen, Praveen Kumar Donta, Susanna Pirttikangas, Lauri Lovén | 2025-06-12 | Federated learning has emerged as a privacy-preserving technique for collaborative model training across heterogeneously distributed silos. Yet, its reliance on a single central server introduces pote... |
| Model Discovery and Graph Simulation: A Lightweight Gateway to Chaos Engineering | Anatoly A. Krasnovsky | 2025-06-12 | Chaos engineering reveals resilience risks but is expensive and operationally risky to run broadly and often. Model-based analyses can estimate dependability, yet in practice they are tricky to build ... |
| 6G Infrastructures for Edge AI: An Analytical Perspective | Kurt Horvath, Shpresa Tuda, Blerta Idrizi, Stojan Kitanov, Fisnik Doko, Dragi Kimovski | 2025-06-12 | The convergence of Artificial Intelligence (AI) and the Internet of Things has accelerated the development of distributed, network-sensitive applications, necessitating ultra-low latency, high through... |
| GPU-Accelerated Distributed QAOA on Large-scale HPC Ecosystems | Zhihao Xu, Srikar Chundury, Seongmin Kim, Amir Shehata, Xinyi Li, Ang Li, Tengfei Luo, Frank Mueller, In-Saeng Suh | 2025-06-12 | Quantum computing holds great potential to accelerate the process of solving complex combinatorial optimization problems. The Distributed Quantum Approximate Optimization Algorithm (DQAOA) addresses h... |
| HP2C-DT: High-Precision High-Performance Computer-enabled Digital Twin | E. Iraola, M. García-Lorenzo, F. Lordan-Gomis, F. Rossi, E. Prieto-Araujo, R. M. Badia | 2025-06-12 | Digital twins are transforming the way we monitor, analyze, and control physical systems, but designing architectures that balance real-time responsiveness with heavy computational demands remains a c... |
| TD-Pipe: Temporally-Disaggregated Pipeline Parallelism Architecture for High-Throughput LLM Inference | Hongbin Zhang, Taosheng Wei, Zhenyi Zheng, Jiangsu Du, Zhiguang Chen, Yutong Lu | 2025-06-12 | As the model size continuously increases, pipeline parallelism shows great promise in throughput-oriented LLM inference due to its low demand on communications. |
| Automating Multi-Tenancy Performance Evaluation on Edge Compute Nodes | Joanna Georgiou, Moysis Symeonides, George Pallis, Marios D. Dikaiakos | 2025-06-12 | Edge Computing emerges as a promising alternative of Cloud Computing, with scalable compute resources and services deployed in the path between IoT devices and Cloud. |
| A Hybrid Heuristic Framework for Resource-Efficient Querying of Scientific Experiments Data | Mayank Patel, Minal Bhise | 2025-06-12 | Scientific experiments and modern applications are generating large amounts of data every day. Most organizations utilize In-house servers or Cloud resources to manage application data and workload. |
| Multi-dimensional Autoscaling of Processing Services: A Comparison of Agent-based Methods | Boris Sedlak, Alireza Furutanpey, Zihang Wang, Víctor Casamayor Pujol, Schahram Dustdar | 2025-06-12 | Edge computing breaks with traditional autoscaling due to strict resource constraints, thus, motivating more flexible scaling behaviors using multiple elasticity dimensions. |
| Federated Learning within Global Energy Budget over Heterogeneous Edge Accelerators | Roopkatha Banerjee, Tejus Chandrashekar, Ananth Eswar, Yogesh Simmhan | 2025-06-12 | Federated Learning (FL) enables collaborative model training across distributed clients while preserving data privacy. However, optimizing both energy efficiency and model accuracy remains a challenge... |
| HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration | Jiaqi Lv, Xufeng He, Yanchen Liu, Xu Dai, Aocheng Shen, Yinghao Li, Jiachen Hao, Jianrong Ding, Yang Hu, Shouyi Yin | 2025-06-12 | The rapid growth of deep learning has driven exponential increases in model parameters and computational demands. NVIDIA GPUs and their CUDA-based software ecosystem provide robust support for paralle... |
| Bug Classification in Quantum Software: A Rule-Based Framework and Its Evaluation | Mir Mohammad Yousuf, Shabir Ahmad Sofi | 2025-06-12 | Accurate classification of software bugs is essential for improving software quality. This paper presents a rule-based automated framework for classifying issues in quantum software repositories by bu... |
| Is Sparse Matrix Reordering Effective for Sparse Matrix-Vector Multiplication? | Omid Asudeh, Sina Mahdipour Saravani, Gerald Sabin, Fabrice Rastello, P Sadayappan | 2025-06-12 | This work evaluates the impact of sparse matrix reordering on the performance of sparse matrix-vector multiplication across different multicore CPU platforms. |
| Resilience through Automated Adaptive Configuration for Distribution and Replication | Scott D. Stoller, Balaji Jayasankar, Yanhong A. Liu | 2025-06-12 | This paper presents a powerful automated framework for making complex systems resilient under failures, by optimized adaptive distribution and replication of interdependent software components across ... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Abstract |
|---|---|---|---|
| Hardware-Aware Neural Architecture Search for Encrypted Traffic Classification on Resource-Constrained Devices | Adel Chehade, Edoardo Ragusa, Paolo Gastaldo, Rodolfo Zunino | 2025-06-12 | This paper presents a hardware-efficient deep neural network (DNN), optimized through hardware-aware neural architecture search (HW-NAS); the DNN supports the classification of session-level encrypted... |
| Jelly: a Fast and Convenient RDF Serialization Format | Piotr Sowinski, Karolina Bogacka, Anastasiya Danilenka, Nikita Kozlov | 2025-06-12 | Existing RDF serialization formats such as Turtle, N-Quads, and JSON-LD are widely used for communication and storage in knowledge graph and Semantic Web applications. |
| Decentralized Uplink Adaptive Compression for Cell-Free MIMO with Limited Fronthaul | Zehua Li, Jingjie Wei, Raviraj Adve | 2025-06-12 | We study the problem of uplink compression for cell-free multi-input multi-output networks with limited fronthaul capacity. In compress-forward mode, remote radio heads (RRHs) compress the received si... |
| Agentic Semantic Control for Autonomous Wireless Space Networks: Extending Space-O-RAN with MCP-Driven Distributed Intelligence | Eduardo Baena, Paolo Testolina, Michele Polese, Sergi Aliaga, Andrew Benincasa, Dimitrios Koutsonikolas, Josep Jornet, Tommaso Melodia | 2025-06-12 | Lunar surface operations impose stringent requirements on wireless communication systems, including autonomy, robustness to disruption, and the ability to adapt to environmental and mission-driven con... |
| Dynamic Beyond 5G and 6G Connectivity: Leveraging NTN and RIS Synergies for Optimized Coverage and Capacity in High-Density Environments | Valdemar Farré, Juan Estrada, David Vega, Luis F Urquiza-Aguiar, Juan A. Vásquez Peralvo, Symeon Chatzinotas | 2025-06-12 | The increasing demand for reliable, high-capacity communication during large-scale outdoor events poses significant challenges for traditional Terrestrial Networks (TNs), which often struggle to provi... |
| Energy-Efficient Deep Learning for Traffic Classification on Microcontrollers | Adel Chehade, Edoardo Ragusa, Paolo Gastaldo, Rodolfo Zunino | 2025-06-12 | In this paper, we present a practical deep learning (DL) approach for energy-efficient traffic classification (TC) on resource-limited microcontrollers, which are widely used in IoT-based smart system... |
| Large Language Models-Empowered Wireless Networks: Fundamentals, Architecture, and Challenges | Latif U. Khan, Maher Guizani, Sami Muhaidat, Choong Seon Hong | 2025-06-12 | The rapid advancement of wireless networks has resulted in numerous challenges stemming from their extensive demands for quality of service towards innovative quality of experience metrics (e.g. ... |
cs.PF - Performance
| Title | Authors | Published | Abstract |
|---|---|---|---|
| The Gittins Index: A Design Principle for Decision-Making Under Uncertainty | Ziv Scully, Alexander Terenin | 2025-06-12 | The Gittins index is a tool that optimally solves a variety of decision-making problems involving uncertainty, including multi-armed bandit problems, minimizing mean latency in queues, and search prob... |
| LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs | Yanan Cai, Ahmed Salem, Besmira Nushi, Mark Russinovich | 2025-06-12 | We introduce LogiPlan, a novel benchmark designed to evaluate the capabilities of large language models (LLMs) in logical planning and reasoning over complex relational structures. |
| A Hybrid Heuristic Framework for Resource-Efficient Querying of Scientific Experiments Data | Mayank Patel, Minal Bhise | 2025-06-12 | Scientific experiments and modern applications are generating large amounts of data every day. Most organizations utilize In-house servers or Cloud resources to manage application data and workload. |
| Is Sparse Matrix Reordering Effective for Sparse Matrix-Vector Multiplication? | Omid Asudeh, Sina Mahdipour Saravani, Gerald Sabin, Fabrice Rastello, P Sadayappan | 2025-06-12 | This work evaluates the impact of sparse matrix reordering on the performance of sparse matrix-vector multiplication across different multicore CPU platforms. |