2025-07-14

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
OpenGCRAM: An Open-Source Gain Cell Compiler Enabling Design-Space Exploration for AI Workloads	Xinxin Wang, Lixian Yan, Shuhan Liu, Luke Upton, Zhuoqi Cai, Yiming Tan, Shengman Li, Koustav Jana, Peijing Li, Jesse Cirimelli-Low, Thierry Tambe, Matthew Guthaus, H. -S. Philip Wong	2025-07-14	Download	Gain Cell memory (GCRAM) offers higher density and lower power than SRAM, making it a promising candidate for on-chip memory in domain-specific accelerators.
LASANA: Large-Scale Surrogate Modeling for Analog Neuromorphic Architecture Exploration	Jason Ho, James A. Boyle, Linshen Liu, Andreas Gerstlauer	2025-07-14	Download	Neuromorphic systems using in-memory or event-driven computing are motivated by the need for more energy-efficient processing of artificial intelligence workloads.
Solving the compute crisis with physics-based ASICs	Maxwell Aifer, Zach Belateche, Suraj Bramhavar, Kerem Y. Camsari, Patrick J. Coles, Gavin Crooks, Douglas J. Durian, Andrea J. Liu, Anastasia Marchenkova, Antonio J. Martinez, Peter L. McMahon, Faris Sbahi, Benjamin Weiner, Logan G. Wright	2025-07-14	Download	Escalating artificial intelligence (AI) demands expose a critical "compute crisis" characterized by unsustainable energy consumption, prohibitive training costs, and the approaching limits of conventi...
AssertCoder: LLM-Based Assertion Generation via Multimodal Specification Extraction	Enyuan Tian, Yiwei Ci, Qiusong Yang, Yufeng Li, Zhichao Lyu	2025-07-14	Download	Assertion-Based Verification (ABV) is critical for ensuring functional correctness in modern hardware systems. However, manually writing high-quality SVAs remains labor-intensive and error-prone.
Evaluating LLM-based Workflows for Switched-Mode Power Supply Design	Simon Nau, Jan Krummenauer, André Zimmermann	2025-07-14	Download	Large language models (LLMs) have great potential to enhance productivity in many disciplines, such as software engineering. However, it is unclear to what extent they can assist in the design process...
Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving	Wonung Kim, Yubin Lee, Yoonsung Kim, Jinwoo Hwang, Seongryong Oh, Jiyong Jung, Aziz Huseynov, Woong Gyu Park, Chang Hyun Park, Divya Mahajan, Jongse Park	2025-07-14	Download	Transformers are the driving force behind today's Large Language Models (LLMs), serving as the foundation for their performance and versatility.
AnalogTester: A Large Language Model-Based Framework for Automatic Testbench Generation in Analog Circuit Design	Weiyu Chen, Chengjie Liu, Wenhao Huang, Jinyang Lyu, Mingqian Yang, Yuan Du, Li Du, Jun Yang	2025-07-14	Download	Recent advancements have demonstrated the significant potential of large language models (LLMs) in analog circuit design. Nevertheless, testbench construction for analog circuits remains manual, creat...
Iceberg: Enhancing HLS Modeling with Synthetic Data	Zijian Ding, Tung Nguyen, Weikai Li, Aditya Grover, Yizhou Sun, Jason Cong	2025-07-14	Download	Deep learning-based prediction models for High-Level Synthesis (HLS) of hardware designs often struggle to generalize. In this paper, we study how to close the generalizability gap of these models thr...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
Stream programs are monoid homomorphisms with state	Tyler Hou, Michael Arntzenius, Max Willsey	2025-07-14	Download	We define a broad class of deterministic stream functions and show they can be implemented as homomorphisms into a "state" monoid. The homomorphism laws are simpler than the conditions of previous sem...
Dissecting the NVIDIA Blackwell Architecture with Microbenchmarks	Aaron Jarmusch, Nathan Graddon, Sunita Chandrasekaran	2025-07-14	Download	The rapid development in scientific research provides a need for more compute power, which is partly being solved by GPUs. This paper presents a microarchitectural analysis of the modern NVIDIA Blackw...
FAFO: Over 1 million TPS on a single node running EVM while still Merkleizing every block	Ryan Zarick, Isaac Zhang, Daniel Wong, Thomas Kim, Bryan Pellegrino, Mignon Li, Kelvin Wong	2025-07-14	Download	Current blockchain execution throughput is limited by data contention, reducing execution layer parallelism. Fast Ahead-of-Formation Optimization (FAFO) is the first blockchain transaction scheduler t...
Access Control for Information-Theoretically Secure Key-Document Stores	Yin Li, Sharad Mehrota, Shantanu Sharma, Komal Kumari	2025-07-14	Download	This paper presents a novel key-based access control technique for secure outsourcing key-value stores where values correspond to documents that are indexed and accessed using keys.
Environmentally-Conscious Cloud Orchestration Considering Geo-Distributed Data Centers	Giulio Attenni, Novella Bartolini	2025-07-14	Download	This paper presents a theoretical discussion for environmentally-conscious job deployment and migration in cloud environments, aiming to minimize the environmental impact of resource provisioning whil...
Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout	Ji Liu, Beichen Ma, Qiaolin Yu, Ruoming Jin, Jingbo Zhou, Yang Zhou, Huaiyu Dai, Haixun Wang, Dejing Dou, Patrick Valduriez	2025-07-14	Download	Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices.
Consensus, Inconsistency, Emergence: what's paraconsistency got to do with it?	Gabriel Rocha	2025-07-14	Download	The consensus problem, briefly stated, consists of having processes in an asynchronous distributed system agree on a value. It is widely known that the consensus problem does not have a deterministic ...
Zorse: Optimizing LLM Training Efficiency on Heterogeneous GPU Clusters	Runsheng Benson Guo, Utkarsh Anand, Khuzaima Daudjee, Rathijit Sen	2025-07-14	Download	Large language models (LLMs) require vast amounts of GPU compute to train, but limited availability and high costs of GPUs make homogeneous clusters impractical for many organizations.
FalconFS: Distributed File System for Large-Scale Deep Learning Pipeline	Jingwei Xu, Junbin Kang, Mingkai Dong, Mingyu Liu, Lu Zhang, Shaohong Guo, Ziyan Qiu, Mingzhen You, Ziyi Tian, Anqi Yu, Tianhong Ding, Xinwei Hu, Haibo Chen	2025-07-14	Download	Client-side metadata caching has long been considered an effective method for accelerating metadata operations in distributed file systems (DFSs). However, we have found that client-side state (e.g. ...
Convergence of Agnostic Federated Averaging	Herlock Rahimi, Dionysis Kalogerias	2025-07-14	Download	Federated learning (FL) enables decentralized model training without centralizing raw data. However, practical FL deployments often face a key realistic challenge: Clients participate intermittently i...
Temporal-Aware GPU Resource Allocation for Distributed LLM Inference via Reinforcement Learning	Chengze Du, Zhiwei Yu, Heng Xu, Haojie Wang, Bo Liu, Jialong Li	2025-07-14	Download	The rapid growth of large language model (LLM) services imposes increasing demands on distributed GPU inference infrastructure. Most existing scheduling systems follow a reactive paradigm, relying sol...
Domain Borders Are There to Be Crossed With Federated Few-Shot Adaptation	Manuel Röder, Christoph Raab, Frank-Michael Schleif	2025-07-14	Download	Federated Learning has emerged as a leading paradigm for decentralized, privacy-preserving learning, particularly relevant in the era of interconnected edge devices equipped with sensors.
Past-Future Scheduler for LLM Serving under SLA Guarantees	Ruihao Gong, Shihao Bai, Siyu Wu, Yunqian Fan, Zaijun Wang, Xiuhong Li, Hailong Yang, Xianglong Liu	2025-07-14	Download	The exploration and application of Large Language Models (LLMs) is thriving. To reduce deployment costs, continuous batching has become an essential feature in current service frameworks.
Large-Scale Graph Building in Dynamic Environments: Low Latency and High Quality	Filipe Miguel Gonçalves de Almeida, CJ Carey, Hendrik Fichtenberger, Jonathan Halcrow, Silvio Lattanzi, André Linhares, Tao Meng, Ashkan Norouzi-Fard, Nikos Parotsidis, Bryan Perozzi, David Simcha	2025-07-14	Download	Learning and constructing large-scale graphs has attracted attention in recent decades, resulting in a rich literature that introduced various systems, tools, and algorithms.
A Model Aware AIGC Task Offloading Algorithm in IIoT Edge Computing	Xin Wang, Xiao Huan Li, Xun Wang	2025-07-14	Download	The integration of the Industrial Internet of Things (IIoT) with Artificial Intelligence-Generated Content (AIGC) offers new opportunities for smart manufacturing, but it also introduces challenges re...
ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism	Zedong Liu, Shenggan Cheng, Guangming Tan, Yang You, Dingwen Tao	2025-07-14	Download	Multimodal large language models (MLLMs) extend LLMs to handle images, videos, and audio by incorporating feature extractors and projection modules.
EAT: QoS-Aware Edge-Collaborative AIGC Task Scheduling via Attention-Guided Diffusion Reinforcement Learning	Zhifei Xu, Zhiqing Tang, Jiong Lou, Zhi Yao, Xuan Xie, Tian Wang, Yinglong Wang, Weijia Jia	2025-07-14	Download	The growth of Artificial Intelligence (AI) and large language models has enabled the use of Generative AI (GenAI) in cloud data centers for diverse AI-Generated Content (AIGC) tasks.
Green-LLM: Optimal Workload Allocation for Environmentally-Aware Distributed Inference	Jiaming Cheng, Duong Tung Nguyen	2025-07-14	Download	This letter investigates the optimal allocation of large language model (LLM) inference workloads across heterogeneous edge data centers (DCs) over time.
Intelligent Task Management via Dynamic Multi-region Division in LEO Satellite Networks	Zixuan Song, Zhishu Shen, Xiaoyu Zheng, Qiushi Zheng, Zheng Lei, Jiong Jin	2025-07-14	Download	As a key complement to terrestrial networks and a fundamental component of future 6G systems, Low Earth Orbit (LEO) satellite networks are expected to provide high-quality communication services when ...

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
A Fault-Tolerant Architecture for Urban and Rural Digital Connectivity: Synergizing SDWMN, Direct-to-Mobile Broadcasting, and Hybrid Cloud Streaming	Pavel Malinovskiy	2025-07-14	Download	We propose an integrated architecture combining Software-Defined Wireless Mesh Networks (SDWMN), Direct-to-Mobile (D2M) broadcasting, and Kafka-based hybrid cloud streaming to improve wireless network...
FAFO: Over 1 million TPS on a single node running EVM while still Merkleizing every block	Ryan Zarick, Isaac Zhang, Daniel Wong, Thomas Kim, Bryan Pellegrino, Mignon Li, Kelvin Wong	2025-07-14	Download	Current blockchain execution throughput is limited by data contention, reducing execution layer parallelism. Fast Ahead-of-Formation Optimization (FAFO) is the first blockchain transaction scheduler t...
Chat with AI: The Surprising Turn of Real-time Video Communication from Human to AI	Jiangkai Wu, Zhiyuan Ren, Liming Liu, Xinggong Zhang	2025-07-14	Download	AI Video Chat emerges as a new paradigm for Real-time Communication (RTC), where one peer is not a human, but a Multimodal Large Language Model (MLLM).
On Splitting Lightweight Semantic Image Segmentation for Wireless Communications	Ebrahim Abu-Helalah, Jordi Serra, Jordi Perez-Romero	2025-07-14	Download	Semantic communication represents a promising technique towards reducing communication costs, especially when dealing with image segmentation, but it still lacks a balance between computational effici...
DNS Tunneling: Threat Landscape and Improved Detection Solutions	Novruz Amirov, Baran Isik, Bilal Ihsan Tuncer, Serif Bahtiyar	2025-07-14	Download	Detecting Domain Name System (DNS) tunneling is a significant challenge in security due to its capacity to hide harmful actions within DNS traffic that appears to be normal and legitimate.
Temporal-Aware GPU Resource Allocation for Distributed LLM Inference via Reinforcement Learning	Chengze Du, Zhiwei Yu, Heng Xu, Haojie Wang, Bo Liu, Jialong Li	2025-07-14	Download	The rapid growth of large language model (LLM) services imposes increasing demands on distributed GPU inference infrastructure. Most existing scheduling systems follow a reactive paradigm, relying sol...
Fine-Grained Coordinated OFDMA With Fiber Backhaul Enabled by openwifi and White Rabbit	Thijs Havinga, Xianjun Jiao, Wei Liu, Baiheng Chen, Robbe Gaeremynck, Ingrid Moerman	2025-07-14	Download	Proper coordination is needed to guarantee the performance of wireless networks in dense deployments. Contention-based systems suffer badly in terms of latency when multiple devices compete for the sa...
Green-LLM: Optimal Workload Allocation for Environmentally-Aware Distributed Inference	Jiaming Cheng, Duong Tung Nguyen	2025-07-14	Download	This letter investigates the optimal allocation of large language model (LLM) inference workloads across heterogeneous edge data centers (DCs) over time.
Endorsement-Driven Blockchain SSI Framework for Dynamic IoT Ecosystems	Guntur Dharma Putra, Bagus Rakadyanto Oktavianto Putra	2025-07-14	Download	Self-Sovereign Identity (SSI) offers significant potential for managing identities in the Internet of Things (IoT), enabling decentralized authentication and credential management without reliance on ...
UavNetSim-v1: A Python-based Simulation Platform for UAV Communication Networks	Zihao Zhou, Zipeng Dai, Linyi Huang, Cui Yang, Youjun Xiang, Jie Tang, Kai-kit Wong	2025-07-14	Download	In unmanned aerial vehicle (UAV) networks, communication protocols and algorithms are essential for cooperation and collaboration between UAVs.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
FalconFS: Distributed File System for Large-Scale Deep Learning Pipeline	Jingwei Xu, Junbin Kang, Mingkai Dong, Mingyu Liu, Lu Zhang, Shaohong Guo, Ziyan Qiu, Mingzhen You, Ziyi Tian, Anqi Yu, Tianhong Ding, Xinwei Hu, Haibo Chen	2025-07-14	Download	Client-side metadata caching has long been considered an effective method for accelerating metadata operations in distributed file systems (DFSs). However, we have found that client-side state (e.g. ...