2025-05-14
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors | Nicolas Dupuis, Ravi Nair, Shyam Ramji, Sean McClintock, Nishant Chauhan, Priyanka Nagpal, Bart Blaner, Ken Valk, Leon Stok, Ruchir Puri | 2025-05-14 | Download | The use of Large Language Models (LLMs) in hardware design has taken off in recent years, principally through its incorporation in tools that increase chip designer productivity. |
| SEGA-DCIM: Design Space Exploration-Guided Automatic Digital CIM Compiler with Multiple Precision Support | Haikang Diao, Haoyi Zhang, Jiahao Song, Haoyang Luo, Yibo Lin, Runsheng Wang, Yuan Wang, Xiyuan Tang | 2025-05-14 | Download | Digital computing-in-memory (DCIM) has been a popular solution for addressing the memory wall problem in recent years. However, the DCIM design still heavily relies on manual efforts, and the optimiza... |
| Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures | Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Huazuo Gao, Jiashi Li, Liyue Zhang, Panpan Huang, Shangyan Zhou, Shirong Ma, Wenfeng Liang, Ying He, Yuqing Wang, Yuxuan Liu, Y. X. Wei | 2025-05-14 | Download | The rapid scaling of large language models (LLMs) has unveiled critical limitations in current hardware architectures, including constraints in memory capacity, computational efficiency, and interconn... |
| Automated SAR ADC Sizing Using Analytical Equations | Zhongyi Li, Zhuofu Tao, Yanze Zhou, Yichen Shi, Zhiping Yu, Ting-Jung Lin, Lei He | 2025-05-14 | Download | Conventional analog and mixed-signal (AMS) circuit designs heavily rely on manual effort, which is time-consuming and labor-intensive. This paper presents a fully automated design methodology for Succ... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| FAST: An Efficient Scheduler for All-to-All GPU Communication | Yiran Lei, Dongjoo Lee, Liangyu Zhao, Daniar Kurniawan, Chanmyeong Kim, Heetaek Jeong, Changsu Kim, Hyeonseong Choi, Liangcheng Yu, Arvind Krishnamurthy, Justine Sherry, Eriko Nurvitadhi | 2025-05-14 | Download | All-to-All(v) communication is a critical primitive in modern machine learning workloads, particularly mixture-of-experts (MoE) models. Unfortunately, efficient scheduling is challenging due to worklo... |
| MDTP -- An Adaptive Multi-Source Data Transfer Protocol | Sepideh Abdollah, Craig Partridge, Susmit Shannigrahi | 2025-05-14 | Download | Scientific data volume is growing in size, and as a direct result, the need for faster transfers is also increasing. The scientific community has sought to leverage parallel transfer methods using mul... |
| IoT-Enabled Hemodynamic Surveillance System: AD8232 Bioelectric Signal Processing with ESP32 | Hemalatha R J, Shubham Malhotra, Shivapanchakshari T G, Lokesh K, Dev Anand D, Samson Jebakumar S | 2025-05-14 | Download | This dissertation proposes an electrocardiogram (ECG) tracking device that diagnoses cardiopulmonary problems using the Internet of Things (IoT) desired results. |
| ARM SVE Unleashed: Performance and Insights Across HPC Applications on Nvidia Grace | Ruimin Shi, Gabin Schieffer, Maya Gokhale, Pei-Hung Lin, Hiren Patel, Ivy Peng | 2025-05-14 | Download | Vector architectures are essential for boosting computing throughput. ARM provides SVE as the next-generation length-agnostic vector extension beyond traditional fixed-length SIMD. |
| Strategies to Measure Energy Consumption Using RAPL During Workflow Execution on Commodity Clusters | Philipp Thamm, Ulf Leser | 2025-05-14 | Download | In science, problems in many fields can be solved by processing datasets using a series of computationally expensive algorithms, sometimes referred to as workflows. |
| Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures | Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Huazuo Gao, Jiashi Li, Liyue Zhang, Panpan Huang, Shangyan Zhou, Shirong Ma, Wenfeng Liang, Ying He, Yuqing Wang, Yuxuan Liu, Y. X. Wei | 2025-05-14 | Download | The rapid scaling of large language models (LLMs) has unveiled critical limitations in current hardware architectures, including constraints in memory capacity, computational efficiency, and interconn... |
| Efficient Graph Embedding at Scale: Optimizing CPU-GPU-SSD Integration | Zhonggen Li, Xiangyu Ke, Yifan Zhu, Yunjun Gao, Feifei Li | 2025-05-14 | Download | Graph embeddings map graph nodes to continuous vectors and are foundational to community detection, recommendation, and many scientific applications. |
| Birch SGD: A Tree Graph Framework for Local and Asynchronous SGD Methods | Alexander Tyurin, Danil Sivtsov | 2025-05-14 | Download | We propose a new unifying framework, Birch SGD, for analyzing and designing distributed SGD methods. The central idea is to represent each method as a weighted directed tree, referred to as a computat... |
| Towards Efficient Verification of Parallel Applications with Mc SimGrid | Matthieu Laurent, Thierry Jéron, Martin Quinson | 2025-05-14 | Download | Assessing the correctness of distributed and parallel applications is notoriously difficult due to the complexity of the concurrent behaviors and the difficulty to reproduce bugs. |
| ELIS: Efficient LLM Iterative Scheduling System with Response Length Predictor | Seungbeom Choi, Jeonghoe Goo, Eunjoo Jeon, Mingyu Yang, Minsung Jang | 2025-05-14 | Download | We propose ELIS, a serving system for Large Language Models (LLMs) featuring an Iterative Shortest Remaining Time First (ISRTF) scheduler designed to efficiently manage inference tasks with the shorte... |
| Toward Malicious Clients Detection in Federated Learning | Zhihao Dou, Jiaqi Wang, Wei Sun, Zhuqing Liu, Minghong Fang | 2025-05-14 | Download | Federated learning (FL) enables multiple clients to collaboratively train a global machine learning model without sharing their raw data. However, the decentralized nature of FL introduces vulnerabili... |
| Architecture of Tianyu Software: Relative Photometry as a Case Study | Yicheng Rui, Yifan Xuan, Shuyue Zheng, Kexin Li, Kaiming Cui, Kai Xiao, Jie Zheng, Jun Kai Ng, Hongxuan Jiang, Fabo Feng, Qinghui Sun | 2025-05-14 | Download | Tianyu telescope, a one-meter robotic optical survey instrument to be constructed in Lenghu, Qinghai, China, is designed for detecting transiting exoplanets, variable stars and transients. |
| The Adaptive Complexity of Finding a Stationary Point | Huanjian Zhou, Andi Han, Akiko Takeda, Masashi Sugiyama | 2025-05-14 | Download | In large-scale applications, such as machine learning, it is desirable to design non-convex optimization algorithms with a high degree of parallelization. |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| FAST: An Efficient Scheduler for All-to-All GPU Communication | Yiran Lei, Dongjoo Lee, Liangyu Zhao, Daniar Kurniawan, Chanmyeong Kim, Heetaek Jeong, Changsu Kim, Hyeonseong Choi, Liangcheng Yu, Arvind Krishnamurthy, Justine Sherry, Eriko Nurvitadhi | 2025-05-14 | Download | All-to-All(v) communication is a critical primitive in modern machine learning workloads, particularly mixture-of-experts (MoE) models. Unfortunately, efficient scheduling is challenging due to worklo... |
| The Power of Alternatives in Network Embedding | Oleg Kolosov, Gala Yadgar, David Breitgand, Dean H. Lorenz | 2025-05-14 | Download | In the virtual network embedding problem, the goal is to embed a set of virtual network instances to a given physical network substrate at minimal cost, while respecting the capacity constraints o... |
| MDTP -- An Adaptive Multi-Source Data Transfer Protocol | Sepideh Abdollah, Craig Partridge, Susmit Shannigrahi | 2025-05-14 | Download | Scientific data volume is growing in size, and as a direct result, the need for faster transfers is also increasing. The scientific community has sought to leverage parallel transfer methods using mul... |
| Dimensioning and Optimization of Reliability Coverage in Local 6G Networks | Jacek Kibiłda, Dian Echevarría Pérez, André Gomes, Onel L. Alcaraz López, Arthur S. de Sena, Nurul Huda Mahmood, Hirley Alves | 2025-05-14 | Download | Enabling vertical use cases for the sixth generation (6G) wireless networks, such as automated manufacturing, immersive extended reality (XR), and self-driving fleets, will require network designs tha... |
| Wormhole Detection Based on Z-Score And Neighbor Table Comparison | Zezhi Zeng | 2025-05-14 | Download | Wormhole attacks can cause serious disruptions to the network topology in disaster rescue opportunity networks. By establishing false Wormhole (WH) links, malicious nodes can mislead legitimate paths... |
| Instant AoI Optimization through Relay Location Selection in Disaster Multi-hop Communication | Yang Gao, Zezhi Zeng | 2025-05-14 | Download | Meteorological disasters such as typhoons, forest fires, and floods can damage the communication infrastructures, which will further disable the communication capabilities of cellular networks. |
| DNS Query Forgery: A Client-Side Defense Against Mobile App Traffic Profiling | Andrea Jimenez-Berenguel, César Gil, Carlos Garcia-Rubio, Jordi Forné, Celeste Campo | 2025-05-14 | Download | Mobile applications continuously generate DNS queries that can reveal sensitive user behavioral patterns even when communications are encrypted. |
| RAG-Enabled Intent Reasoning for Application-Network Interaction | Salwa Mostafa, Mohamed K. Abdel-Aziz, Mohammed S. Elbamby, Mehdi Bennis | 2025-05-14 | Download | Intent-based network (IBN) is a promising solution to automate network operation and management. IBN aims to offer human-tailored network interaction, allowing the network to communicate in a way that... |
| Interplay Between AI and Space-Air-Ground Integrated Network: The Road Ahead | Chenyu Wu, Xi Wang, Yi Hu, Shuai Han, Dusit Niyato | 2025-05-14 | Download | Space-air-ground integrated network (SAGIN) is envisioned as a key network architecture for achieving ubiquitous coverage in the next-generation communication system. |
| QUIC Steps: Evaluating Pacing Strategies in QUIC Implementations | Marcel Kempf, Simon Tietz, Benedikt Jaeger, Johannes Späth, Georg Carle, Johannes Zirngibl | 2025-05-14 | Download | Pacing is a key mechanism in modern transport protocols, used to regulate packet transmission timing to minimize traffic burstiness, lower latency, and reduce packet loss. |
| A Multi-Task Foundation Model for Wireless Channel Representation Using Contrastive and Masked Autoencoder Learning | Berkay Guler, Giovanni Geraci, Hamid Jafarkhani | 2025-05-14 | Download | Current applications of self-supervised learning to wireless channel representation often borrow paradigms developed for text and image processing, without fully addressing the unique characteristics ... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Adaptive Migration Decision for Multi-Tenant Memory Systems | Hyungjun Cho, Igjae Kim, Kwanghoon Choi, Hongjin Kim, Wonjae Lee, Junhyeok Im, Jinin So, Jaehyuk Huh | 2025-05-14 | Download | Tiered memory systems consisting of fast small memory and slow large memory have emerged to provide high capacity memory in a cost-effective way. |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Statistical Modeling and Uncertainty Estimation of LLM Inference Systems | Kaustabha Ray, Nelson Mimura Gonzalez, Bruno Wassermann, Rachel Tzoref-Brill, Dean H. Lorenz | 2025-05-14 | Download | Large Language Model (LLM) inference systems present significant challenges in statistical performance characterization due to dynamic workload variations, diverse hardware architectures, and complex ... |