2025-06-12

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| A4: Microarchitecture-Aware LLC Management for Datacenter Servers with Emerging I/O Devices | Haneul Park, Jiaqi Lou, Sangjin Lee, Yifan Yuan, Kyoung Soo Park, Yongseok Son, Ipoom Jeong, Nam Sung Kim | 2025-06-12 | Download | In modern server CPUs, the Last-Level Cache (LLC) serves not only as a victim cache for higher-level private caches but also as a buffer for low-latency DMA transfers between CPU cores and I/O devices... |
| Design and Implementation of Washing Machine HUD Using FPGAs | Norman Stites, D. G. Perera | 2025-06-12 | Download | In contemporary digital design education, practical field programmable gate array (FPGA) projects are indispensable for bridging theoretical concepts with real-world applications. |
| Controlling quantum chaos via Parrondo strategies on noisy intermediate-scale quantum hardware | Aditi Rath, Dinesh Kumar Panda, Colin Benjamin | 2025-06-12 | Download | Advancements in Noisy Intermediate-Scale Quantum (NISQ) computing are steadily pushing these systems toward outperforming classical supercomputers on specific, well-defined computational tasks. |
| MARS: Processing-In-Memory Acceleration of Raw Signal Genome Analysis Inside the Storage Subsystem | Melina Soysal, Konstantina Koliogeorgi, Can Firtina, Nika Mansouri Ghiasi, Rakesh Nadig, Haiyu Mao, Geraldo F. Oliveira, Yu Liang, Klea Zambaku, Mohammad Sadrosadati, Onur Mutlu | 2025-06-12 | Download | Raw signal genome analysis (RSGA) has emerged as a promising approach to enable real-time genome analysis by directly analyzing raw electrical signals. |
| Towards Zero-Stall Matrix Multiplication on Energy-Efficient RISC-V Clusters for Machine Learning Acceleration | Luca Colagrande, Lorenzo Leone, Maximilian Coco, Andrei Deaconeasa, Luca Benini | 2025-06-12 | Download | The growing computational demands of machine learning (ML) workloads have driven the design of ML accelerators aiming at an optimal tradeoff between efficiency and flexibility. |
| Scalable Software Testing in Fast Virtual Platforms: Leveraging SystemC, QEMU and Containerization | Lukas Jünger, Jan Henrik Weinstock, Tim Kraus | 2025-06-12 | Download | The ever-increasing complexity of HW/SW systems presents a persistent challenge, particularly in safety-critical domains like automotive, where extensive testing is imperative. |
| EasyDRAM: An FPGA-based Infrastructure for Fast and Accurate End-to-End Evaluation of Emerging DRAM Techniques | Oğuzhan Canpolat, Ataberk Olgun, David Novo, Oğuz Ergin, Onur Mutlu | 2025-06-12 | Download | DRAM is a critical component of modern computing systems. Recent works propose numerous techniques (that we call DRAM techniques) to enhance DRAM-based computing systems' throughput, reliability, and ... |
| CarbonSet: A Dataset to Analyze Trends and Benchmark the Sustainability of CPUs and GPUs | Jiajun Hu, Chetan Choppali Sudarshan, Vidya A. Chhabria, Aman Arora | 2025-06-12 | Download | Over the years, the chip industry has consistently developed high-performance processors to address the increasing demands across diverse applications. |
| FloorPlan-DeepSeek (FPDS): A multimodal approach to floorplan generation using vector-based next room prediction | Jun Yin, Pengyu Zeng, Jing Zhong, Peilin Li, Miao Zhang, Ran Luo, Shuai Lu | 2025-06-12 | Download | In the architectural design process, floor plan generation is inherently progressive and iterative. However, existing generative models for floor plans are predominantly end-to-end generation that pro... |
| Synchronization for Fault-Tolerant Quantum Computers | Satvik Maurya, Swamit Tannu | 2025-06-12 | Download | Quantum Error Correction (QEC) codes store information reliably in logical qubits by encoding them in a larger number of less reliable qubits. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| SwiftSpec: Ultra-Low Latency LLM Decoding by Scaling Asynchronous Speculative Decoding | Ziyi Zhang, Ziheng Jiang, Chengquan Jiang, Menghan Yu, Size Zheng, Haibin Lin, Henry Hoffmann, Xin Liu | 2025-06-12 | Download | Low-latency decoding for large language models (LLMs) is crucial for applications like chatbots and code assistants, yet generating long outputs remains slow in single-query settings. |
| Adaptive Job Scheduling in Quantum Clouds Using Reinforcement Learning | Waylon Luo, Jiapeng Zhao, Tong Zhan, Qiang Guan | 2025-06-12 | Download | Present-day quantum systems face critical bottlenecks, including limited qubit counts, brief coherence intervals, and high susceptibility to errors, all of which obstruct the execution of large and com... |
| The Impact of Partial Computations on the Red-Blue Pebble Game | Pál András Papp, Aleksandros Sobczyk, A. N. Yzelman | 2025-06-12 | Download | We study an extension of the well-known red-blue pebble game (RBP) with partial computation steps, inspired by the recent work of Sobczyk. While the original RBP assumes that we need to have all the i... |
| Faster CONGEST Approximation Algorithms for Maximum Weighted Independent Set in Sparse Graphs | Salwa Faour, Fabian Kuhn | 2025-06-12 | Download | The maximum independent set problem is a classic optimization problem that has also been studied quite intensively in the distributed setting. |
| Towards Sustainable Computing: Exploring Energy Consumption Efficiency of Alternative Configurations and Workloads in an Open Source Messaging System | Maria Voreakou, George Kousiouris, Mara Nikolaidou | 2025-06-12 | Download | Energy consumption in current large scale computing infrastructures is becoming a critical issue, especially with the growing demand for centralized systems such as cloud environments. |
| Deployment of Containerized Simulations in an API-Driven Distributed Infrastructure | Tim Kraus, Axel Sauer, Ingo Feldner | 2025-06-12 | Download | The increasingly dynamic market for embedded systems makes virtual prototypes an indispensable tool for hardware/software codesign. The broad acceptance of the methodology has led to a diverse range o... |
| Graph-based Gossiping for Communication Efficiency in Decentralized Federated Learning | Huong Nguyen, Hong-Tri Nguyen, Praveen Kumar Donta, Susanna Pirttikangas, Lauri Lovén | 2025-06-12 | Download | Federated learning has emerged as a privacy-preserving technique for collaborative model training across heterogeneously distributed silos. Yet, its reliance on a single central server introduces pote... |
| Model Discovery and Graph Simulation: A Lightweight Gateway to Chaos Engineering | Anatoly A. Krasnovsky | 2025-06-12 | Download | Chaos engineering reveals resilience risks but is expensive and operationally risky to run broadly and often. Model-based analyses can estimate dependability, yet in practice they are tricky to build ... |
| 6G Infrastructures for Edge AI: An Analytical Perspective | Kurt Horvath, Shpresa Tuda, Blerta Idrizi, Stojan Kitanov, Fisnik Doko, Dragi Kimovski | 2025-06-12 | Download | The convergence of Artificial Intelligence (AI) and the Internet of Things has accelerated the development of distributed, network-sensitive applications, necessitating ultra-low latency, high through... |
| GPU-Accelerated Distributed QAOA on Large-scale HPC Ecosystems | Zhihao Xu, Srikar Chundury, Seongmin Kim, Amir Shehata, Xinyi Li, Ang Li, Tengfei Luo, Frank Mueller, In-Saeng Suh | 2025-06-12 | Download | Quantum computing holds great potential to accelerate the process of solving complex combinatorial optimization problems. The Distributed Quantum Approximate Optimization Algorithm (DQAOA) addresses h... |
| HP2C-DT: High-Precision High-Performance Computer-enabled Digital Twin | E. Iraola, M. García-Lorenzo, F. Lordan-Gomis, F. Rossi, E. Prieto-Araujo, R. M. Badia | 2025-06-12 | Download | Digital twins are transforming the way we monitor, analyze, and control physical systems, but designing architectures that balance real-time responsiveness with heavy computational demands remains a c... |
| TD-Pipe: Temporally-Disaggregated Pipeline Parallelism Architecture for High-Throughput LLM Inference | Hongbin Zhang, Taosheng Wei, Zhenyi Zheng, Jiangsu Du, Zhiguang Chen, Yutong Lu | 2025-06-12 | Download | As the model size continuously increases, pipeline parallelism shows great promise in throughput-oriented LLM inference due to its low demand on communications. |
| Automating Multi-Tenancy Performance Evaluation on Edge Compute Nodes | Joanna Georgiou, Moysis Symeonides, George Pallis, Marios D. Dikaiakos | 2025-06-12 | Download | Edge Computing emerges as a promising alternative of Cloud Computing, with scalable compute resources and services deployed in the path between IoT devices and Cloud. |
| A Hybrid Heuristic Framework for Resource-Efficient Querying of Scientific Experiments Data | Mayank Patel, Minal Bhise | 2025-06-12 | Download | Scientific experiments and modern applications are generating large amounts of data every day. Most organizations utilize In-house servers or Cloud resources to manage application data and workload. |
| Multi-dimensional Autoscaling of Processing Services: A Comparison of Agent-based Methods | Boris Sedlak, Alireza Furutanpey, Zihang Wang, Víctor Casamayor Pujol, Schahram Dustdar | 2025-06-12 | Download | Edge computing breaks with traditional autoscaling due to strict resource constraints, thus, motivating more flexible scaling behaviors using multiple elasticity dimensions. |
| Federated Learning within Global Energy Budget over Heterogeneous Edge Accelerators | Roopkatha Banerjee, Tejus Chandrashekar, Ananth Eswar, Yogesh Simmhan | 2025-06-12 | Download | Federated Learning (FL) enables collaborative model training across distributed clients while preserving data privacy. However, optimizing both energy efficiency and model accuracy remains a challenge... |
| HPCTransCompile: An AI Compiler Generated Dataset for High-Performance CUDA Transpilation and LLM Preliminary Exploration | Jiaqi Lv, Xufeng He, Yanchen Liu, Xu Dai, Aocheng Shen, Yinghao Li, Jiachen Hao, Jianrong Ding, Yang Hu, Shouyi Yin | 2025-06-12 | Download | The rapid growth of deep learning has driven exponential increases in model parameters and computational demands. NVIDIA GPUs and their CUDA-based software ecosystem provide robust support for paralle... |
| Bug Classification in Quantum Software: A Rule-Based Framework and Its Evaluation | Mir Mohammad Yousuf, Shabir Ahmad Sofi | 2025-06-12 | Download | Accurate classification of software bugs is essential for improving software quality. This paper presents a rule-based automated framework for classifying issues in quantum software repositories by bu... |
| Is Sparse Matrix Reordering Effective for Sparse Matrix-Vector Multiplication? | Omid Asudeh, Sina Mahdipour Saravani, Gerald Sabin, Fabrice Rastello, P Sadayappan | 2025-06-12 | Download | This work evaluates the impact of sparse matrix reordering on the performance of sparse matrix-vector multiplication across different multicore CPU platforms. |
| Resilience through Automated Adaptive Configuration for Distribution and Replication | Scott D. Stoller, Balaji Jayasankar, Yanhong A. Liu | 2025-06-12 | Download | This paper presents a powerful automated framework for making complex systems resilient under failures, by optimized adaptive distribution and replication of interdependent software components across ... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Hardware-Aware Neural Architecture Search for Encrypted Traffic Classification on Resource-Constrained Devices | Adel Chehade, Edoardo Ragusa, Paolo Gastaldo, Rodolfo Zunino | 2025-06-12 | Download | This paper presents a hardware-efficient deep neural network (DNN), optimized through hardware-aware neural architecture search (HW-NAS); the DNN supports the classification of session-level encrypted... |
| Jelly: a Fast and Convenient RDF Serialization Format | Piotr Sowinski, Karolina Bogacka, Anastasiya Danilenka, Nikita Kozlov | 2025-06-12 | Download | Existing RDF serialization formats such as Turtle, N-Quads, and JSON-LD are widely used for communication and storage in knowledge graph and Semantic Web applications. |
| Decentralized Uplink Adaptive Compression for Cell-Free MIMO with Limited Fronthaul | Zehua Li, Jingjie Wei, Raviraj Adve | 2025-06-12 | Download | We study the problem of uplink compression for cell-free multi-input multi-output networks with limited fronthaul capacity. In compress-forward mode, remote radio heads (RRHs) compress the received si... |
| Agentic Semantic Control for Autonomous Wireless Space Networks: Extending Space-O-RAN with MCP-Driven Distributed Intelligence | Eduardo Baena, Paolo Testolina, Michele Polese, Sergi Aliaga, Andrew Benincasa, Dimitrios Koutsonikolas, Josep Jornet, Tommaso Melodia | 2025-06-12 | Download | Lunar surface operations impose stringent requirements on wireless communication systems, including autonomy, robustness to disruption, and the ability to adapt to environmental and mission-driven con... |
| Dynamic Beyond 5G and 6G Connectivity: Leveraging NTN and RIS Synergies for Optimized Coverage and Capacity in High-Density Environments | Valdemar Farré, Juan Estrada, David Vega, Luis F Urquiza-Aguiar, Juan A. Vásquez Peralvo, Symeon Chatzinotas | 2025-06-12 | Download | The increasing demand for reliable, high-capacity communication during large-scale outdoor events poses significant challenges for traditional Terrestrial Networks (TNs), which often struggle to provi... |
| Energy-Efficient Deep Learning for Traffic Classification on Microcontrollers | Adel Chehade, Edoardo Ragusa, Paolo Gastaldo, Rodolfo Zunino | 2025-06-12 | Download | In this paper, we present a practical deep learning (DL) approach for energy-efficient traffic classification (TC) on resource-limited microcontrollers, which are widely used in IoT-based smart system... |
| Large Language Models-Empowered Wireless Networks: Fundamentals, Architecture, and Challenges | Latif U. Khan, Maher Guizani, Sami Muhaidat, Choong Seon Hong | 2025-06-12 | Download | The rapid advancement of wireless networks has resulted in numerous challenges stemming from their extensive demands for quality of service towards innovative quality of experience metrics (e.g. |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
| --- | --- | --- | --- | --- |
| The Gittins Index: A Design Principle for Decision-Making Under Uncertainty | Ziv Scully, Alexander Terenin | 2025-06-12 | Download | The Gittins index is a tool that optimally solves a variety of decision-making problems involving uncertainty, including multi-armed bandit problems, minimizing mean latency in queues, and search prob... |
| LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs | Yanan Cai, Ahmed Salem, Besmira Nushi, Mark Russinovich | 2025-06-12 | Download | We introduce LogiPlan, a novel benchmark designed to evaluate the capabilities of large language models (LLMs) in logical planning and reasoning over complex relational structures. |
| A Hybrid Heuristic Framework for Resource-Efficient Querying of Scientific Experiments Data | Mayank Patel, Minal Bhise | 2025-06-12 | Download | Scientific experiments and modern applications are generating large amounts of data every day. Most organizations utilize In-house servers or Cloud resources to manage application data and workload. |
| Is Sparse Matrix Reordering Effective for Sparse Matrix-Vector Multiplication? | Omid Asudeh, Sina Mahdipour Saravani, Gerald Sabin, Fabrice Rastello, P Sadayappan | 2025-06-12 | Download | This work evaluates the impact of sparse matrix reordering on the performance of sparse matrix-vector multiplication across different multicore CPU platforms. |