2025-05-14

cs.AR - Architecture

Title	Authors	Published	PDF	Abstract
Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors	Nicolas Dupuis, Ravi Nair, Shyam Ramji, Sean McClintock, Nishant Chauhan, Priyanka Nagpal, Bart Blaner, Ken Valk, Leon Stok, Ruchir Puri	2025-05-14	Download	The use of Large Language Models (LLMs) in hardware design has taken off in recent years, principally through its incorporation in tools that increase chip designer productivity.
SEGA-DCIM: Design Space Exploration-Guided Automatic Digital CIM Compiler with Multiple Precision Support	Haikang Diao, Haoyi Zhang, Jiahao Song, Haoyang Luo, Yibo Lin, Runsheng Wang, Yuan Wang, Xiyuan Tang	2025-05-14	Download	Digital computing-in-memory (DCIM) has been a popular solution for addressing the memory wall problem in recent years. However, the DCIM design still heavily relies on manual efforts, and the optimiza...
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures	Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Huazuo Gao, Jiashi Li, Liyue Zhang, Panpan Huang, Shangyan Zhou, Shirong Ma, Wenfeng Liang, Ying He, Yuqing Wang, Yuxuan Liu, Y. X. Wei	2025-05-14	Download	The rapid scaling of large language models (LLMs) has unveiled critical limitations in current hardware architectures, including constraints in memory capacity, computational efficiency, and interconn...
Automated SAR ADC Sizing Using Analytical Equations	Zhongyi Li, Zhuofu Tao, Yanze Zhou, Yichen Shi, Zhiping Yu, Ting-Jung Lin, Lei He	2025-05-14	Download	Conventional analog and mixed-signal (AMS) circuit designs heavily rely on manual effort, which is time-consuming and labor-intensive. This paper presents a fully automated design methodology for Succ...

cs.DC - Distributed, Parallel, and Cluster Computing

Title	Authors	Published	PDF	Abstract
FAST: An Efficient Scheduler for All-to-All GPU Communication	Yiran Lei, Dongjoo Lee, Liangyu Zhao, Daniar Kurniawan, Chanmyeong Kim, Heetaek Jeong, Changsu Kim, Hyeonseong Choi, Liangcheng Yu, Arvind Krishnamurthy, Justine Sherry, Eriko Nurvitadhi	2025-05-14	Download	All-to-All(v) communication is a critical primitive in modern machine learning workloads, particularly mixture-of-experts (MoE) models. Unfortunately, efficient scheduling is challenging due to worklo...
MDTP -- An Adaptive Multi-Source Data Transfer Protocol	Sepideh Abdollah, Craig Partridge, Susmit Shannigrahi	2025-05-14	Download	Scientific data volume is growing in size, and as a direct result, the need for faster transfers is also increasing. The scientific community has sought to leverage parallel transfer methods using mul...
IoT-Enabled Hemodynamic Surveillance System: AD8232 Bioelectric Signal Processing with ESP32	Hemalatha R J, Shubham Malhotra, Shivapanchakshari T G, Lokesh K, Dev Anand D, Samson Jebakumar S	2025-05-14	Download	This dissertation proposes an electrocardiogram (ECG) tracking device that diagnoses cardiopulmonary problems using the Internet of Things (IoT) desired results.
ARM SVE Unleashed: Performance and Insights Across HPC Applications on Nvidia Grace	Ruimin Shi, Gabin Schieffer, Maya Gokhale, Pei-Hung Lin, Hiren Patel, Ivy Peng	2025-05-14	Download	Vector architectures are essential for boosting computing throughput. ARM provides SVE as the next-generation length-agnostic vector extension beyond traditional fixed-length SIMD.
Strategies to Measure Energy Consumption Using RAPL During Workflow Execution on Commodity Clusters	Philipp Thamm, Ulf Leser	2025-05-14	Download	In science, problems in many fields can be solved by processing datasets using a series of computationally expensive algorithms, sometimes referred to as workflows.
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures	Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Huazuo Gao, Jiashi Li, Liyue Zhang, Panpan Huang, Shangyan Zhou, Shirong Ma, Wenfeng Liang, Ying He, Yuqing Wang, Yuxuan Liu, Y. X. Wei	2025-05-14	Download	The rapid scaling of large language models (LLMs) has unveiled critical limitations in current hardware architectures, including constraints in memory capacity, computational efficiency, and interconn...
Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration	Zhonggen Li, Xiangyu Ke, Yifan Zhu, Yunjun Gao, Feifei Li	2025-05-14	Download	Graph embeddings map graph nodes to continuous vectors and are foundational to community detection, recommendation, and many scientific applications.
Birch SGD: A Tree Graph Framework for Local and Asynchronous SGD Methods	Alexander Tyurin, Danil Sivtsov	2025-05-14	Download	We propose a new unifying framework, Birch SGD, for analyzing and designing distributed SGD methods. The central idea is to represent each method as a weighted directed tree, referred to as a computat...
Towards Efficient Verification of Parallel Applications with Mc SimGrid	Matthieu Laurent, Thierry Jéron, Martin Quinson	2025-05-14	Download	Assessing the correctness of distributed and parallel applications is notoriously difficult due to the complexity of the concurrent behaviors and the difficulty of reproducing bugs.
ELIS: Efficient LLM Iterative Scheduling System with Response Length Predictor	Seungbeom Choi, Jeonghoe Goo, Eunjoo Jeon, Mingyu Yang, Minsung Jang	2025-05-14	Download	We propose ELIS, a serving system for Large Language Models (LLMs) featuring an Iterative Shortest Remaining Time First (ISRTF) scheduler designed to efficiently manage inference tasks with the shorte...
Toward Malicious Clients Detection in Federated Learning	Zhihao Dou, Jiaqi Wang, Wei Sun, Zhuqing Liu, Minghong Fang	2025-05-14	Download	Federated learning (FL) enables multiple clients to collaboratively train a global machine learning model without sharing their raw data. However, the decentralized nature of FL introduces vulnerabili...
Architecture of Tianyu Software: Relative Photometry as a Case Study	Yicheng Rui, Yifan Xuan, Shuyue Zheng, Kexin Li, Kaiming Cui, Kai Xiao, Jie Zheng, Jun Kai Ng, Hongxuan Jiang, Fabo Feng, Qinghui Sun	2025-05-14	Download	Tianyu telescope, a one-meter robotic optical survey instrument to be constructed in Lenghu, Qinghai, China, is designed for detecting transiting exoplanets, variable stars and transients.
The Adaptive Complexity of Finding a Stationary Point	Huanjian Zhou, Andi Han, Akiko Takeda, Masashi Sugiyama	2025-05-14	Download	In large-scale applications, such as machine learning, it is desirable to design non-convex optimization algorithms with a high degree of parallelization.

cs.NI - Networking and Internet Architecture

Title	Authors	Published	PDF	Abstract
FAST: An Efficient Scheduler for All-to-All GPU Communication	Yiran Lei, Dongjoo Lee, Liangyu Zhao, Daniar Kurniawan, Chanmyeong Kim, Heetaek Jeong, Changsu Kim, Hyeonseong Choi, Liangcheng Yu, Arvind Krishnamurthy, Justine Sherry, Eriko Nurvitadhi	2025-05-14	Download	All-to-All(v) communication is a critical primitive in modern machine learning workloads, particularly mixture-of-experts (MoE) models. Unfortunately, efficient scheduling is challenging due to worklo...
The Power of Alternatives in Network Embedding	Oleg Kolosov, Gala Yadgar, David Breitgand, Dean H. Lorenz	2025-05-14	Download	In the virtual network embedding problem, the goal is to map (embed) a set of virtual network instances to a given physical network substrate at minimal cost, while respecting the capacity constraints o...
MDTP -- An Adaptive Multi-Source Data Transfer Protocol	Sepideh Abdollah, Craig Partridge, Susmit Shannigrahi	2025-05-14	Download	Scientific data volume is growing in size, and as a direct result, the need for faster transfers is also increasing. The scientific community has sought to leverage parallel transfer methods using mul...
Dimensioning and Optimization of Reliability Coverage in Local 6G Networks	Jacek Kibiłda, Dian Echevarría Pérez, André Gomes, Onel L. Alcaraz López, Arthur S. de Sena, Nurul Huda Mahmood, Hirley Alves	2025-05-14	Download	Enabling vertical use cases for the sixth generation (6G) wireless networks, such as automated manufacturing, immersive extended reality (XR), and self-driving fleets, will require network designs tha...
Wormhole Detection Based on Z-Score And Neighbor Table Comparison	Zezhi Zeng	2025-05-14	Download	Wormhole attacks can cause serious disruptions to the network topology in disaster rescue opportunity networks. By establishing false Wormhole (WH) links, malicious nodes can mislead legitimate paths...
Instant AoI Optimization through Relay Location Selection in Disaster Multi-hop Communication	Yang Gao, Zezhi Zeng	2025-05-14	Download	Meteorological disasters such as typhoons, forest fires, and floods can damage the communication infrastructures, which will further disable the communication capabilities of cellular networks.
DNS Query Forgery: A Client-Side Defense Against Mobile App Traffic Profiling	Andrea Jimenez-Berenguel, César Gil, Carlos Garcia-Rubio, Jordi Forné, Celeste Campo	2025-05-14	Download	Mobile applications continuously generate DNS queries that can reveal sensitive user behavioral patterns even when communications are encrypted.
RAG-Enabled Intent Reasoning for Application-Network Interaction	Salwa Mostafa, Mohamed K. Abdel-Aziz, Mohammed S. Elbamby, Mehdi Bennis	2025-05-14	Download	Intent-based network (IBN) is a promising solution to automate network operation and management. IBN aims to offer human-tailored network interaction, allowing the network to communicate in a way that...
Interplay Between AI and Space-Air-Ground Integrated Network: The Road Ahead	Chenyu Wu, Xi Wang, Yi Hu, Shuai Han, Dusit Niyato	2025-05-14	Download	Space-air-ground integrated network (SAGIN) is envisioned as a key network architecture for achieving ubiquitous coverage in the next-generation communication system.
QUIC Steps: Evaluating Pacing Strategies in QUIC Implementations	Marcel Kempf, Simon Tietz, Benedikt Jaeger, Johannes Späth, Georg Carle, Johannes Zirngibl	2025-05-14	Download	Pacing is a key mechanism in modern transport protocols, used to regulate packet transmission timing to minimize traffic burstiness, lower latency, and reduce packet loss.
A Multi-Task Foundation Model for Wireless Channel Representation Using Contrastive and Masked Autoencoder Learning	Berkay Guler, Giovanni Geraci, Hamid Jafarkhani	2025-05-14	Download	Current applications of self-supervised learning to wireless channel representation often borrow paradigms developed for text and image processing, without fully addressing the unique characteristics ...

cs.OS - Operating Systems

Title	Authors	Published	PDF	Abstract
Adaptive Migration Decision for Multi-Tenant Memory Systems	Hyungjun Cho, Igjae Kim, Kwanghoon Choi, Hongjin Kim, Wonjae Lee, Junhyeok Im, Jinin So, Jaehyuk Huh	2025-05-14	Download	Tiered memory systems consisting of fast small memory and slow large memory have emerged to provide high capacity memory in a cost-effective way.

cs.PF - Performance

Title	Authors	Published	PDF	Abstract
Statistical Modeling and Uncertainty Estimation of LLM Inference Systems	Kaustabha Ray, Nelson Mimura Gonzalez, Bruno Wassermann, Rachel Tzoref-Brill, Dean H. Lorenz	2025-05-14	Download	Large Language Model (LLM) inference systems present significant challenges in statistical performance characterization due to dynamic workload variations, diverse hardware architectures, and complex ...