2025-03-31
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| SAGe: A Lightweight Algorithm-Architecture Co-Design for Mitigating the Data Preparation Bottleneck in Large-Scale Genome Sequence Analysis | Nika Mansouri Ghiasi, Talu Güloglu, Harun Mustafa, Can Firtina, Konstantina Koliogeorgi, Konstantinos Kanellopoulos, Haiyu Mao, Rakesh Nadig, Mohammad Sadrosadati, Jisung Park, Onur Mutlu | 2025-03-31 | Download | Genome sequence analysis, which examines the DNA sequences of organisms, drives advances in many critical medical and biotechnological fields. |
| PIM-LLM: A High-Throughput Hybrid PIM Architecture for 1-bit LLMs | Jinendra Malekar, Peyton Chandarana, Md Hasibul Amin, Mohammed E. Elbtity, Ramtin Zand | 2025-03-31 | Download | In this paper, we propose PIM-LLM, a hybrid architecture developed to accelerate 1-bit large language models (LLMs). PIM-LLM leverages analog processing-in-memory (PIM) architectures and digital systo... |
| SPRING: Systematic Profiling of Randomly Interconnected Neural Networks Generated by HLS | Rui Shi, Seda Ogrenci | 2025-03-31 | Download | Profiling is important for performance optimization by providing real-time observations and measurements of important parameters of hardware execution. |
| Banked Memories for Soft SIMT Processors | Martin Langhammer, George A. Constantinides | 2025-03-31 | Download | Recent advances in soft GPGPU architectures have shown that a small (<10K LUT), high performance (770 MHz) processor is possible in modern FPGAs. |
| ReaLM: Reliable and Efficient Large Language Model Inference with Statistical Algorithm-Based Fault Tolerance | Tong Xie, Jiawang Zhao, Zishen Wan, Zuodong Zhang, Yuan Wang, Runsheng Wang, Ru Huang, Meng Li | 2025-03-31 | Download | The demand for efficient large language model (LLM) inference has propelled the development of dedicated accelerators. As accelerators are vulnerable to hardware faults due to aging, variation, etc, e... |
| DiffuSE: Cross-Layer Design Space Exploration of DNN Accelerator via Diffusion-Driven Optimization | Yi Ren, Chenhao Xue, Jiaxing Zhang, Chen Zhang, Qiang Xu, Yibo Lin, Lining Zhang, Guangyu Sun | 2025-03-31 | Download | The proliferation of deep learning accelerators calls for efficient and cost-effective hardware design solutions, where parameterized modular hardware generator and electronic design automation (EDA) ... |
| DOMAC: Differentiable Optimization for High-Speed Multipliers and Multiply-Accumulators | Chenhao Xue, Yi Ren, Jinwei Zhou, Kezhi Li, Chen Zhang, Yibo Lin, Lining Zhang, Qiang Xu, Guangyu Sun | 2025-03-31 | Download | Multipliers and multiply-accumulators (MACs) are fundamental building blocks for compute-intensive applications such as artificial intelligence. |
| MVDRAM: Enabling GeMV Execution in Unmodified DRAM for Low-Bit LLM Acceleration | Tatsuya Kubo, Daichi Tokuda, Tomoya Nagatani, Masayuki Usui, Lei Qu, Ting Cao, Shinya Takamaeda-Yamazaki | 2025-03-31 | Download | General matrix-vector multiplication (GeMV) remains a critical latency bottleneck in large language model (LLM) inference, even with quantized low-bit models. |
| TuRTLe: A Unified Evaluation of LLMs for RTL Generation | Dario Garcia-Gasulla, Gokcen Kestor, Emanuele Parisi, Miquel Albertí-Binimelis, Cristian Gutierrez, Razine Moundir Ghorab, Orlando Montenegro, Bernat Homs, Miquel Moreto | 2025-03-31 | Download | The rapid advancements in LLMs have driven the adoption of generative AI in various domains, including Electronic Design Automation (EDA). Unlike traditional software development, EDA presents unique ... |
| Uni-Render: A Unified Accelerator for Real-Time Rendering Across Diverse Neural Renderers | Chaojian Li, Sixu Li, Linrui Jiang, Jingqun Zhang, Yingyan Celine Lin | 2025-03-31 | Download | Recent advancements in neural rendering technologies and their supporting devices have paved the way for immersive 3D experiences, significantly transforming human interaction with intelligent devices... |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| SAGe: A Lightweight Algorithm-Architecture Co-Design for Mitigating the Data Preparation Bottleneck in Large-Scale Genome Sequence Analysis | Nika Mansouri Ghiasi, Talu Güloglu, Harun Mustafa, Can Firtina, Konstantina Koliogeorgi, Konstantinos Kanellopoulos, Haiyu Mao, Rakesh Nadig, Mohammad Sadrosadati, Jisung Park, Onur Mutlu | 2025-03-31 | Download | Genome sequence analysis, which examines the DNA sequences of organisms, drives advances in many critical medical and biotechnological fields. |
| Rack Position Optimization in Large-Scale Heterogeneous Data Centers | Chang-Lin Chen, Jiayu Chen, Tian Lan, Zhaoxia Zhao, Hongbo Dong, Vaneet Aggarwal | 2025-03-31 | Download | As rapidly growing AI computational demands accelerate the need for new hardware installation and maintenance, this work explores optimal data center resource management by balancing operational effic... |
| GPU-centric Communication Schemes for HPC and ML Applications | Naveen Namashivayam | 2025-03-31 | Download | Compute nodes on modern heterogeneous supercomputing systems comprise CPUs, GPUs, and high-speed network interconnects (NICs). Parallelization is identified as a technique for effectively utilizing th... |
| Fermilab's Transition to Token Authentication | Dave Dykstra, Mine Altunay, Shreyas Bhat, Dmitry Litvintsev, Marco Mambelli, Marc Mengel, Stephen White | 2025-03-31 | Download | Fermilab is the first High Energy Physics institution to transition from X.509 user certificates to authentication tokens in production systems. |
| Enhancing Traffic Safety with AI and 6G: Latency Requirements and Real-Time Threat Detection | Kurt Horvath, Dragi Kimovski, Stojan Kitanov, Radu Prodan | 2025-03-31 | Download | The rapid digitalization of urban infrastructure opens the path to smart cities, where IoT-enabled infrastructure enhances public safety and efficiency. |
| Deep Learning Model Deployment in Multiple Cloud Providers: an Exploratory Study Using Low Computing Power Environments | Elayne Lemos, Rodrigo Oliveira, Jairson Rodrigues, Rosalvo F. Oliveira Neto | 2025-03-31 | Download | The deployment of Machine Learning models in the cloud has grown among tech companies. Hardware requirements are higher when these models involve Deep Learning techniques, and the cloud providers' cos... |
| A Practical Rollup Escape Hatch Design | Francisco Gomes Figueira, Martin Derka, Ching Lun Chiu, Jan Gorzny | 2025-03-31 | Download | A rollup network is a type of popular "Layer 2" scaling solution for general purpose "Layer 1" blockchains like Ethereum. Rollups networks separate execution of transactions from other aspects like co... |
| OrchMLLM: Orchestrate Multimodal Data with Batch Post-Balancing to Accelerate Multimodal Large Language Model Training | Yijie Zheng, Bangjun Xiao, Lei Shi, Xiaoyang Li, Faming Wu, Tianyu Li, Xuefeng Xiao, Yang Zhang, Yuxuan Wang, Shouda Liu | 2025-03-31 | Download | Multimodal large language models (MLLMs), such as GPT-4o, are garnering significant attention. During the exploration of MLLM training, we identified Modality Composition Incoherence, a phenomenon tha... |
| MVDRAM: Enabling GeMV Execution in Unmodified DRAM for Low-Bit LLM Acceleration | Tatsuya Kubo, Daichi Tokuda, Tomoya Nagatani, Masayuki Usui, Lei Qu, Ting Cao, Shinya Takamaeda-Yamazaki | 2025-03-31 | Download | General matrix-vector multiplication (GeMV) remains a critical latency bottleneck in large language model (LLM) inference, even with quantized low-bit models. |
| Who is in Charge here? Understanding How Runtime Configuration Affects Software along with Variables&Constants | Chaopeng Luo, Yuanliang Zhang, Haochen He, Zhouyang Jia, Teng Wang, Shulin Zhou, Si Zheng, Shanshan Li | 2025-03-31 | Download | Runtime misconfiguration can lead to software performance degradation and even cause failure. Developers typically perform sanity checks during the configuration parsing stage to prevent invalid param... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Rack Position Optimization in Large-Scale Heterogeneous Data Centers | Chang-Lin Chen, Jiayu Chen, Tian Lan, Zhaoxia Zhao, Hongbo Dong, Vaneet Aggarwal | 2025-03-31 | Download | As rapidly growing AI computational demands accelerate the need for new hardware installation and maintenance, this work explores optimal data center resource management by balancing operational effic... |
| Fair Dynamic Spectrum Access via Fully Decentralized Multi-Agent Reinforcement Learning | Yubo Zhang, Pedro Botelho, Trevor Gordon, Gil Zussman, Igor Kadota | 2025-03-31 | Download | We consider a decentralized wireless network with several source-destination pairs sharing a limited number of orthogonal frequency bands. Sources learn to adapt their transmissions (specifically, the... |
| Moving Edge for On-Demand Edge Computing: An Uncertainty-aware Approach | Fangtong Zhou, Ruozhou Yu | 2025-03-31 | Download | We study an edge demand response problem where, based on historical edge workload demands, an edge provider needs to dispatch moving computing units, e.g. |
| Traffic Engineering in Large-scale Networks with Generalizable Graph Neural Networks | Fangtong Zhou, Xiaorui Liu, Ruozhou Yu, Guoliang Xue | 2025-03-31 | Download | Traffic Engineering (TE) in large-scale networks like cloud Wide Area Networks (WANs) and Low Earth Orbit (LEO) satellite constellations is a critical challenge. |
| Trident: Interference Avoidance in Multi-reader Backscatter Network via Frequency-space Division | Yang Zou, Xin Na, Yimiao Sun, Yuan He | 2025-03-31 | Download | Backscatter is a key technology for battery-free sensing in industrial IoT applications. To fully cover numerous tags in the deployment area, one often needs to deploy multiple readers, each of which ... |
| Cell-Free Massive MIMO Under Mobility: A Fairness-Differentiated Handover Scheme | Yunlu Xiao, Marina Petrova, Ljiljana Simić | 2025-03-31 | Download | While cell-free massive MIMO (CF-mMIMO) offers high network-wide throughput in static networks, especially for the worst-served users, its performance in mobile networks is not yet fully addressed. |
| Robust Predictive Routing for Internet of Vehicles Leveraging Both V2I and V2V Links | Yawen Chang, Xudong Wang | 2025-03-31 | Download | With the developments of the Internet of Vehicles (IoV) from 4G to 5G, vehicle-to-infrastructure (V2I) communications are becoming attractive for vehicle users (VUEs) to obtain diverse cloud service t... |
| Blockchain for Federated Learning in the Internet of Things: Trustworthy Adaptation, Standards, and the Road Ahead | Farhana Javed, Engin Zeydan, Josep Mangues-Bafalluy, Kapal Dev, Luis Blanco | 2025-03-31 | Download | As edge computing gains prominence in Internet of Things (IoTs), smart cities, and autonomous systems, the demand for real-time machine intelligence with low latency and model reliability continues to... |
| Multi-Agent Deep Reinforcement Learning for Optimized Multi-UAV Coverage and Power-Efficient UE Connectivity | Xuli Cai, Poonam Lohan, Burak Kantarci | 2025-03-31 | Download | In critical situations such as natural disasters, network outages, battlefield communication, or large-scale public events, Unmanned Aerial Vehicles (UAVs) offer a promising approach to maximize wirel... |
| Optimizing Age of Information in Networks with Large and Small Updates | Zhuoyi Zhao, Vishrant Tripathi, Igor Kadota | 2025-03-31 | Download | Modern sensing and monitoring applications typically consist of sources transmitting updates of different sizes, ranging from a few bytes (position, temperature, etc. |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| HeteroPod: XPU-Accelerated Infrastructure Offloading for Commodity Cloud-Native Applications | Bicheng Yang, Jingkai He, Dong Du, Yubin Xia, Haibo Chen | 2025-03-31 | Download | Cloud-native systems increasingly rely on infrastructure services (e.g., service meshes, monitoring agents), which compete for resources with user applications, degrading performance and scalability. |
| Who is in Charge here? Understanding How Runtime Configuration Affects Software along with Variables&Constants | Chaopeng Luo, Yuanliang Zhang, Haochen He, Zhouyang Jia, Teng Wang, Shulin Zhou, Si Zheng, Shanshan Li | 2025-03-31 | Download | Runtime misconfiguration can lead to software performance degradation and even cause failure. Developers typically perform sanity checks during the configuration parsing stage to prevent invalid param... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Deep Learning Model Deployment in Multiple Cloud Providers: an Exploratory Study Using Low Computing Power Environments | Elayne Lemos, Rodrigo Oliveira, Jairson Rodrigues, Rosalvo F. Oliveira Neto | 2025-03-31 | Download | The deployment of Machine Learning models in the cloud has grown among tech companies. Hardware requirements are higher when these models involve Deep Learning techniques, and the cloud providers' cos... |