2025-02-28

cs.AR - Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies | Jian Gao, Weidong Cao, Junyi Yang, Xuan Zhang | 2025-02-28 | Download | The massive and large-scale design of foundational semiconductor integrated circuits (ICs) is crucial to sustaining the advancement of many emerging and future technologies, such as generative AI, 5G/... |
| AMuLeT: Automated Design-Time Testing of Secure Speculation Countermeasures | Bo Fu, Leo Tenenbaum, David Adler, Assaf Klein, Arpit Gogia, Alaa R. Alameldeen, Marco Guarnieri, Mark Silberstein, Oleksii Oleksenko, Gururaj Saileshwar | 2025-02-28 | Download | In recent years, several hardware-based countermeasures proposed to mitigate Spectre attacks have been shown to be insecure. To enable the development of effective secure speculation countermeasures, ... |
| Optimizing and Exploring System Performance in Compact Processing-in-Memory-based Chips | Peilin Chen, Xiaoxuan Yang | 2025-02-28 | Download | Processing-in-memory (PIM) is a promising computing paradigm to tackle the "memory wall" challenge. However, PIM system-level benefits over traditional von Neumann architecture can be reduced when the... |
| Recurrent CircuitSAT Sampling for Sequential Circuits | Arash Ardakani, Kevin He, John Wawrzynek | 2025-02-28 | Download | In this work, we introduce a novel GPU-accelerated circuit satisfiability (CircuitSAT) sampling technique for sequential circuits. This work is motivated by the requirement in constrained random verif... |
| AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks | Pedro Gimenes, Yiren Zhao, George Constantinides | 2025-02-28 | Download | Graph Neural Networks (GNNs) have recently gained attention due to their performance on non-Euclidean data. The use of custom hardware architectures proves particularly beneficial for GNNs due to thei... |
| A RISC-V Multicore and GPU SoC Platform with a Qualifiable Software Stack for Safety Critical Systems | Marc Solé i Bonet, Jannis Wolf, Leonidas Kosmidis | 2025-02-28 | Download | In the context of the Horizon Europe project, METASAT, a hardware platform was developed as a prototype of future space systems. The platform is based on a multiprocessor NOEL-V, an established space-... |
| LarQucut: A New Cutting and Mapping Approach for Large-sized Quantum Circuits in Distributed Quantum Computing (DQC) Environments | Xinglei Dou, Lei Liu, Zhuohao Wang, Pengyu Li | 2025-02-28 | Download | Distributed quantum computing (DQC) is a promising way to achieve large-scale quantum computing. However, mapping large-sized quantum circuits in DQC is a challenging job; for example, it is difficult... |
| Timing-Driven Global Placement by Efficient Critical Path Extraction | Yunqi Shi, Siyuan Xu, Shixiong Kai, Xi Lin, Ke Xue, Mingxuan Yuan, Chao Qian | 2025-02-28 | Download | Timing optimization during the global placement of integrated circuits has been a significant focus for decades, yet it remains a complex, unresolved issue. |
| On the Impact of Intra-node Communication in the Performance of Supercomputer and Data Center Interconnection Networks | Joaquin Tarraga-Moreno, Jesus Escudero-Sahuquillo, Pedro Javier Garcia, Francisco J. Quiles | 2025-02-28 | Download | In the last decade, specific-purpose computing and storage devices, such as GPUs, TPUs, or high-speed storage, have been incorporated into server nodes of Supercomputers and Data centers. |
| Enhancing software-hardware co-design for HEP by low-overhead profiling of single- and multi-threaded programs on diverse architectures with Adaptyst | Maksymilian Graczyk, Stefan Roiser | 2025-02-28 | Download | Given the recent technological trends and novel computing paradigms spanning both software and hardware, physicists and software developers can no longer just rely on computers becoming faster to meet... |
| R^4: A Racetrack Register File with Runtime Software Reconfiguration | Christian Hakert, Shuo-Han Chen, Kay Heider, Roland Kühn, Yun-Chih Chen, Jens Teubner, Jian-Jia Chen | 2025-02-28 | Download | Arising disruptive memory technologies continuously make their way into the memory hierarchy at various levels. Racetrack memory is one promising candidate for future memory due to the overall low ene... |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Distributed Variational Quantum Algorithm with Many-qubit for Optimization Challenges | Seongmin Kim, In-Saeng Suh | 2025-02-28 | Download | Optimization problems are critical across various domains, yet existing quantum algorithms, despite their great potential, struggle with scalability and accuracy due to excessive reliance on entanglem... |
| Elastic Restaking Networks | Roi Bar-Zur, Ittay Eyal | 2025-02-28 | Download | Many blockchain-based decentralized services require their validators (operators) to deposit stake (collateral), which is forfeited (slashed) if they misbehave. |
| Supporting the development of Machine Learning for fundamental science in a federated Cloud with the AI_INFN platform | Lucio Anderlini, Matteo Barbetti, Giulio Bianchini, Diego Ciangottini, Stefano Dal Pra, Diego Michelotto, Carmelo Pellegrino, Rosa Petrini, Alessandro Pascolini, Daniele Spiga | 2025-02-28 | Download | Machine Learning (ML) is driving a revolution in the way scientists design, develop, and deploy data-intensive software. However, the adoption of ML presents new challenges for the computing infrastru... |
| A quantum walk inspired model for distributed computing on arbitrary graphs | Mathieu Roget, Giuseppe Di Molfetta | 2025-02-28 | Download | A discrete time quantum walk is known to be the single-particle sector of a quantum cellular automaton. For a long time, these models have interested the community for their nice properties such as lo... |
| ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs | Hao Ge, Junda Feng, Qi Huang, Fangcheng Fu, Xiaonan Nie, Lei Zuo, Haibin Lin, Bin Cui, Xin Liu | 2025-02-28 | Download | Scaling long-context ability is essential for Large Language Models (LLMs). To amortize the memory consumption across multiple devices in long-context training, inter-data partitioning (a.k.a. |
| Flora: Efficient Cloud Resource Selection for Big Data Processing via Job Classification | Jonathan Will, Lauritz Thamsen, Jonathan Bader, Odej Kao | 2025-02-28 | Download | Distributed dataflow systems like Spark and Flink enable data-parallel processing of large datasets on clusters of cloud resources. Yet, selecting appropriate computational resources for dataflow jobs... |
| When MIS and Maximal Matching are Easy in the Congested Clique | Keren Censor-Hillel, Tomer Even, Maxime Flin, Magnús M. Halldórsson | 2025-02-28 | Download | Two of the most fundamental distributed symmetry-breaking problems are that of finding a maximal independent set (MIS) and a maximal matching (MM) in a graph. |
| FedDyMem: Efficient Federated Learning with Dynamic Memory and Memory-Reduce for Unsupervised Image Anomaly Detection | Silin Chen, Andy Liu, Kangjian Di, Yichu Xu, Han-Jia Ye, Wenhan Luo, Ningmu Zou | 2025-02-28 | Download | Unsupervised image anomaly detection (UAD) has become a critical process in industrial and medical applications, but it faces growing challenges due to increasing concerns over data privacy. |
| TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval | Chien-Yu Lin, Keisuke Kamahori, Yiyu Liu, Xiaoxiang Shi, Madhav Kashyap, Yile Gu, Rulin Shao, Zihao Ye, Kan Zhu, Rohan Kadekodi, Stephanie Wang, Arvind Krishnamurthy, Luis Ceze, Baris Kasikci | 2025-02-28 | Download | Retrieval-augmented generation (RAG) extends large language models (LLMs) with external data sources to enhance factual correctness and domain coverage. |
| Cicada: A Pipeline-Efficient Approach to Serverless Inference with Decoupled Management | Z. Wu, Y. Deng, J. Hu, L. Cui, Z. Zhang, L. Zeng, G. Min | 2025-02-28 | Download | Serverless computing has emerged as a pivotal paradigm for deploying Deep Learning (DL) models, offering automatic scaling and cost efficiency. |
| Managing Federated Learning on Decentralized Infrastructures as a Reputation-based Collaborative Workflow | Yuandou Wang, Zhiming Zhao | 2025-02-28 | Download | Federated Learning (FL) has recently emerged as a collaborative learning paradigm that can train a global model among distributed participants without raw data exchange to satisfy varying requirements... |
| AARC: Automated Affinity-aware Resource Configuration for Serverless Workflows | Lingxiao Jin, Zinuo Cai, Zebin Chen, Hongyu Zhao, Ruhui Ma | 2025-02-28 | Download | Serverless computing is increasingly adopted for its ability to manage complex, event-driven workloads without the need for infrastructure provisioning. |
| LADs: Leveraging LLMs for AI-Driven DevOps | Ahmad Faraz Khan, Azal Ahmad Khan, Anas Mohamed, Haider Ali, Suchithra Moolinti, Sabaat Haroon, Usman Tahir, Mattia Fazzini, Ali R. Butt, Ali Anwar | 2025-02-28 | Download | Automating cloud configuration and deployment remains a critical challenge due to evolving infrastructures, heterogeneous hardware, and fluctuating workloads. |
| SkyStore: Cost-Optimized Object Storage Across Regions and Clouds | Shu Liu, Xiangxi Mo, Moshik Hershcovitch, Henric Zhang, Audrey Cheng, Guy Girmonsky, Gil Vernik, Michael Factor, Tiemo Bang, Soujanya Ponnapalli, Natacha Crooks, Joseph E. Gonzalez, Danny Harnik, Ion Stoica | 2025-02-28 | Download | Modern applications span multiple clouds to reduce costs, avoid vendor lock-in, and leverage low-availability resources in another cloud. However, standard object stores operate within a single cloud,... |
| SPD: Sync-Point Drop for Efficient Tensor Parallelism of Large Language Models | Han-Byul Kim, Duc Hoang, Arnav Kundu, Mohammad Samragh, Minsik Cho | 2025-02-28 | Download | With the rapid expansion in the scale of large language models (LLMs), enabling efficient distributed inference across multiple computing units has become increasingly critical. |
| Deep RC: A Scalable Data Engineering and Deep Learning Pipeline | Arup Kumar Sarker, Aymen Alsaadi, Alexander James Halpern, Prabhath Tangella, Mikhail Titov, Niranda Perera, Mills Staylor, Gregor von Laszewski, Shantenu Jha, Geoffrey Fox | 2025-02-28 | Download | Significant obstacles exist in scientific domains including genetics, climate modeling, and astronomy due to the management, preprocess, and training on complicated data for deep learning. |
| MonadBFT: Fast, Responsive, Fork-Resistant Streamlined Consensus | Mohammad Mussadiq Jalalzai, Kushal Babel, Jovan Komatovic, Tobias Klenze, Sourav Das, Fatima Elsheimy, Mike Setrin, John Bergschneider, Babak Gilkalaye | 2025-02-28 | Download | This paper introduces MonadBFT, a novel Byzantine Fault Tolerant (BFT) consensus protocol that advances both performance and robustness. MonadBFT is implemented as the consensus protocol in the Monad ... |

cs.NI - Networking and Internet Architecture

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Fed-KAN: Federated Learning with Kolmogorov-Arnold Networks for Traffic Prediction | Engin Zeydan, Cristian J. Vaca-Rubio, Luis Blanco, Roberto Pereira, Marius Caus, Kapal Dev | 2025-02-28 | Download | Non-Terrestrial Networks (NTNs) are becoming a critical component of modern communication infrastructures, especially with the advent of Low Earth Orbit (LEO) satellite systems. |
| Enabling AutoML for Zero-Touch Network Security: Use-Case Driven Analysis | Li Yang, Mirna El Rajab, Abdallah Shami, Sami Muhaidat | 2025-02-28 | Download | Zero-Touch Networks (ZTNs) represent a state-of-the-art paradigm shift towards fully automated and intelligent network management, enabling the automation and intelligence required to manage the compl... |
| Channel, Mode and Power Optimization for non-Orthogonal D2D Communications: a Hybrid Approach | Federico Librino, Giorgio Quer | 2025-02-28 | Download | The increasing traffic demand in cellular networks has recently led to the investigation of new strategies to save precious resources like spectrum and energy. |
| The Complexity-Performance Tradeoff in Resource Allocation for URLLC Exploiting Dynamic CSI | Federico Librino, Paolo Santi | 2025-02-28 | Download | The challenging applications envisioned for the future Internet of Things networks are making it urgent to develop fast and scalable resource allocation algorithms able to meet the stringent reliabili... |
| Distributed Data Access in Industrial Edge Networks | Theofanis P. Raptis, Andrea Passarella, Marco Conti | 2025-02-28 | Download | Wireless edge networks in smart industrial environments increasingly operate using advanced sensors and autonomous machines interacting with each other and generating huge amounts of data. |
| Resource Allocation and Sharing in URLLC for IoT Applications using Shareability Graphs | Federico Librino, Paolo Santi | 2025-02-28 | Download | The current development trend of wireless communications aims at coping with the very stringent reliability and latency requirements posed by several emerging Internet of Things (IoT) application scen... |
| Including Follower Dynamics in Beaconing for Platooning Safety | Hassan Laghbi, Nigel Thomas | 2025-02-28 | Download | In this paper, we propose procedures to address platoon follower dynamics within adaptive beaconing. We implement them in a known adaptive beaconing scheme which is Jerk Beaconing (JB) to improve its ... |
| Towards Specialized Wireless Networks Using an ML-Driven Radio Interface | Kamil Szczech, Maksymilian Wojnar, Katarzyna Kosek-Szott, Krzysztof Rusek, Szymon Szott, Dileepa Marasinghe, Nandana Rajatheva, Richard Combes, Francesc Wilhelmi, Anders Jonsson, Boris Bellalta | 2025-02-28 | Download | Future wireless networks will need to support diverse applications (such as extended reality), scenarios (such as fully automated industries), and technological advances (such as terahertz communicati... |
| The Effect of Hop-count Modification Attack on Random Walk-based SLP Schemes Developed for WSNs: a Study | Manjula Raja, Anirban Ghosh, Chukkapalli Praveen Kumar, Suleiman Samba, C N Shariff | 2025-02-28 | Download | Source location privacy (SLP) has been of great concern in WSNs when deployed for habitat monitoring applications. The issue is taken care of by employing privacy-preserving routing schemes. |
| Towards Zero Touch Networks: Cross-Layer Automated Security Solutions for 6G Wireless Networks | Li Yang, Shimaa Naser, Abdallah Shami, Sami Muhaidat, Lyndon Ong, Mérouane Debbah | 2025-02-28 | Download | The transition from 5G to 6G mobile networks necessitates network automation to meet the escalating demands for high data rates, ultra-low latency, and integrated technology. |

cs.PF - Performance

| Title | Authors | Date | PDF | Abstract |
| --- | --- | --- | --- | --- |
| Recurrent CircuitSAT Sampling for Sequential Circuits | Arash Ardakani, Kevin He, John Wawrzynek | 2025-02-28 | Download | In this work, we introduce a novel GPU-accelerated circuit satisfiability (CircuitSAT) sampling technique for sequential circuits. This work is motivated by the requirement in constrained random verif... |
| Enhancing software-hardware co-design for HEP by low-overhead profiling of single- and multi-threaded programs on diverse architectures with Adaptyst | Maksymilian Graczyk, Stefan Roiser | 2025-02-28 | Download | Given the recent technological trends and novel computing paradigms spanning both software and hardware, physicists and software developers can no longer just rely on computers becoming faster to meet... |
| AARC: Automated Affinity-aware Resource Configuration for Serverless Workflows | Lingxiao Jin, Zinuo Cai, Zebin Chen, Hongyu Zhao, Ruhui Ma | 2025-02-28 | Download | Serverless computing is increasingly adopted for its ability to manage complex, event-driven workloads without the need for infrastructure provisioning. |
