
2025-11-27

cs.AR - Architecture

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| 3RSeT: Read Disturbance Rate Reduction in STT-MRAM Caches by Selective Tag Comparison | Elham Cheshmikhani, Hamed Farbeh, Hossein Asad | 2025-11-27 | Download | Recent development in memory technologies has introduced Spin-Transfer Torque Magnetic RAM (STT-MRAM) as the most promising replacement for SRAMs in on-chip cache memories. |
| From RISC-V Cores to Neuromorphic Arrays: A Tutorial on Building Scalable Digital Neuromorphic Processors | Amirreza Yousefzadeh | 2025-11-27 | Download | Digital neuromorphic processors are emerging as a promising computing substrate for low-power, always-on EdgeAI applications. In this tutorial paper, we outline the main architectural design principle... |
| An Analytical and Empirical Investigation of Tag Partitioning for Energy-Efficient Reliable Cache | Elham Cheshmikhani, Hamed Farbeh | 2025-11-27 | Download | Associative cache memory significantly influences processor performance and energy consumption. Because it occupies over half of the chip area, cache memory is highly susceptible to transient and perm... |
| FADiff: Fusion-Aware Differentiable Optimization for DNN Scheduling on Tensor Accelerators | Shuao Jia, Zichao Ling, Chen Bai, Kang Zhao, Jianwang Zhai | 2025-11-27 | Download | Efficient deployment of Deep Neural Networks (DNNs), such as Large Language Models (LLMs), on tensor accelerators is essential for maximizing computational efficiency in modern AI systems. |
| Aquas: Enhancing Domain Specialization through Holistic Hardware-Software Co-Optimization based on MLIR | Yuyang Zou, Youwei Xiao, Yansong Xu, Chenyun Yin, Yuhao Luo, Yitian Sun, Ruifan Xu, Renze Chen, Yun Liang | 2025-11-27 | Download | Application-Specific Instruction-Set Processors (ASIPs) built on the RISC-V architecture offer specialization opportunities for various applications. |
| CADC: Crossbar-Aware Dendritic Convolution for Efficient In-memory Computing | Shuai Dong, Junyi Yang, Ye Ke, Hongyang Shang, Arindam Basu | 2025-11-27 | Download | Convolutional neural networks (CNNs) are computationally intensive and often accelerated using crossbar-based in-memory computing (IMC) architectures. |

cs.DC - Distributed, Parallel, and Cluster Computing

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Accelerating mesh-based Monte Carlo simulations using contemporary graphics ray-tracing hardware | Shijie Yan, Douglas Dwyer, David R. Kaeli, Qianqian Fang | 2025-11-27 | Download | Significance: Monte Carlo (MC) methods are the gold-standard for modeling light-tissue interactions due to their accuracy. Mesh-based MC (MMC) offers enhanced precision for complex tissue structures u... |
| A lasso-alternative to Dijkstra's algorithm for identifying short paths in networks | Anqi Dong, Amirhossein Taghvaei, Tryphon T. Georgiou | 2025-11-27 | Download | We revisit the problem of finding the shortest path between two selected vertices of a graph and formulate this as an $\ell_1$-regularized regression -- Least Absolute Shrinkage and Selection Operator... |
| Federated Learning Survey: A Multi-Level Taxonomy of Aggregation Techniques, Experimental Insights, and Future Frontiers | Meriem Arbaoui, Mohamed-el-Amine Brahmia, Abdellatif Rahmoun, Mourad Zghal | 2025-11-27 | Download | The integration of IoT and AI has unlocked innovation across industries, but growing privacy concerns and data isolation hinder progress. Traditional centralized ML struggles to overcome these challen... |
| DisCEdge: Distributed Context Management for Large Language Models at the Edge | Mohammadreza Malekabbasi, Minghe Wang, David Bermbach | 2025-11-27 | Download | Deploying Large Language Model (LLM) services at the edge benefits latency-sensitive and privacy-aware applications. However, the stateless nature of LLMs makes managing user context (e.g. |
| OmniInfer: System-Wide Acceleration Techniques for Optimizing LLM Serving Throughput and Latency | Jun Wang, Yunxiang Yao, Wenwei Kuang, Runze Mao, Zhenhao Sun, Zhuang Tao, Ziyang Zhang, Dengyu Li, Jiajun Chen, Zhili Wang, Kai Cui, Congzhi Cai, Longwen Lan, Ken Zhang | 2025-11-27 | Download | Large Language Models drive a wide range of modern AI applications but impose substantial challenges on large-scale serving systems due to intensive computation, strict latency constraints, and throug... |
| Post-Quantum-Resilient Audit Evidence for Long-Lived Regulated Systems: Security Models, Migration Patterns, and Case Study | Leo Kao | 2025-11-27 | Download | Constant-size cryptographic evidence records are increasingly used to build audit trails for regulated AI workloads in clinical, pharmaceutical, and financial settings, where each execution is summari... |
| Optimality of Simultaneous Consensus with Limited Information Exchange (Extended Abstract) | Kaya Alpturer, Ron van der Meyden, Sushmita Ruj, Godfrey Wong | 2025-11-27 | Download | Work on the development of optimal fault-tolerant Agreement protocols using the logic of knowledge has concentrated on the "full information" approach to information exchange, which is costly with res... |
| PAT: Accelerating LLM Decoding via Prefix-Aware Attention with Resource Efficient Multi-Tile Kernel | Jinjun Yi, Zhixin Zhao, Yitao Hu, Ke Yan, Weiwei Sun, Hao Wang, Laiping Zhao, Yuhao Zhang, Wenxin Li, Keqiu Li | 2025-11-27 | Download | LLM serving is increasingly dominated by decode attention, which is a memory-bound operation due to massive KV cache loading from global memory. |
| When AI Bends Metal: AI-Assisted Optimization of Design Parameters in Sheet Metal Forming | Ahmad Tarraf, Koutaiba Kassem-Manthey, Seyed Ali Mohammadi, Philipp Martin, Lukas Moj, Semih Burak, Enju Park, Christian Terboven, Felix Wolf | 2025-11-27 | Download | Numerical simulations have revolutionized the industrial design process by reducing prototyping costs, design iterations, and enabling product engineers to explore the design space more efficiently. |
| Silence Speaks Volumes: A New Paradigm for Covert Communication via History Timing Patterns | Christoph Weissenborn, Steffen Wendzel | 2025-11-27 | Download | A Covert Channel (CC) exploits legitimate communication mechanisms to stealthily transmit information, often bypassing traditional security controls. |
| A Fast and Flat Federated Learning Method via Weighted Momentum and Sharpness-Aware Minimization | Tianle Li, Yongzhi Huang, Linshan Jiang, Chang Liu, Qipeng Xie, Wenfeng Du, Lu Wang, Kaishun Wu | 2025-11-27 | Download | In federated learning (FL), models must converge quickly under tight communication budgets while generalizing across non-IID client distributions. |
| Byzantine Fault-Tolerant Multi-Agent System for Healthcare: A Gossip Protocol Approach to Secure Medical Message Propagation | Nihir Chadderwala | 2025-11-27 | Download | Recent advances in generative AI have enabled sophisticated multi-agent architectures for healthcare, where large language models power collaborative clinical decision-making. |
| An Empirical Study of Cross-Language Interoperability in Replicated Data Systems | Provakar Mondal, Eli Tilevich | 2025-11-27 | Download | BACKGROUND: Modern distributed systems replicate data across multiple execution sites. Business requirements and resource constraints often necessitate mixing different languages across replica sites. |

cs.NI - Networking and Internet Architecture

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| Smoothing Rough Edges of IPv6 in VPNs | Yejin Cho, John Heidemann | 2025-11-27 | Download | How do commercial VPNs interact with IPv6? We show two "rough edges" in how commercial VPNs handle IPv6. First, we show that many IPv4-only VPNs leak IPv6 traffic to the ISP. |
| Day in the Life of RIPE Atlas: Operational Insights and Applications in Network Measurements | Yevheniya Nosyk, Malte Tashiro, Qasim Lone, Robert Kisteleki, Andrzej Duda, Maciej Korczyński | 2025-11-27 | Download | Network measurement platforms are increasingly popular among researchers and operators alike due to their distributed nature, simplifying measuring the remote parts of the Internet. |
| Impact of Sociality Regimes on Quality of Service and Energy Efficiency in Cell-Free MIMO Networks | Ala Eddine Nouali, Mohamed Sana, Jean-Paul Jamont | 2025-11-27 | Download | The cell-free architecture represents a significant advancement in network design, where each User Equipment (UE) is served by a group of distributed Access Points (APs), aimed at delivering uniformly... |
| Semantic-Aware Caching for Efficient Image Generation in Edge Computing | Hanshuai Cui, Zhiqing Tang, Zhi Yao, Weijia Jia, Wei Zhao | 2025-11-27 | Download | Text-to-image generation employing diffusion models has attained significant popularity due to its capability to produce high-quality images that adhere to textual prompts. |
| RELiQ: Scalable Entanglement Routing via Reinforcement Learning in Quantum Networks | Tobias Meuser, Jannis Weil, Aninda Lahiri, Marius Paraschiv | 2025-11-27 | Download | Quantum networks are becoming increasingly important because of advancements in quantum computing and quantum sensing, such as recent developments in distributed quantum computing and federated quantu... |
| Silence Speaks Volumes: A New Paradigm for Covert Communication via History Timing Patterns | Christoph Weissenborn, Steffen Wendzel | 2025-11-27 | Download | A Covert Channel (CC) exploits legitimate communication mechanisms to stealthily transmit information, often bypassing traditional security controls. |
| Optimizing NetGPT via Routing-Based Synergy and Reinforcement Learning | Yuxuan Chen, Rongpeng Li, Xianfu Chen, Celimuge Wu, Chenghui Peng, Zhifeng Zhao, Honggang Zhang | 2025-11-27 | Download | Large language model (LLM) agents at the network edge offer low-latency execution for routine queries. In contrast, complex requests often require the superior capability of cloud models, incurring hi... |
| AutoRec: Accelerating Loss Recovery for Live Streaming in a Multi-Supplier Market | Tong Li, Xu Yan, Bo Wu, Cheng Luo, Fuyu Wang, Jiuxiang Zhu, Haoyi Fang, Xinle Du, Ke Xu | 2025-11-27 | Download | Due to the limited permissions for upgrading dualside (i.e., server-side and client-side) loss tolerance schemes from the perspective of CDN vendors in a multi-supplier market, modern large-scale live... |

cs.PF - Performance

| Title | Authors | Published | PDF | Abstract |
|---|---|---|---|---|
| PET Rapid Image Reconstruction Challenge (PETRIC) | Casper da Costa-Luis, Matthias J. Ehrhardt, Christoph Kolbitsch, Evgueni Ovtchinnikov, Edoardo Pasca, Kris Thielemans, Charalampos Tsoumpas | 2025-11-27 | Download | Introduction: We describe the foundation of PETRIC, an image reconstruction challenge to minimise the computational runtime of related algorithms for Positron Emission Tomography (PET). |
| 3RSeT: Read Disturbance Rate Reduction in STT-MRAM Caches by Selective Tag Comparison | Elham Cheshmikhani, Hamed Farbeh, Hossein Asad | 2025-11-27 | Download | Recent development in memory technologies has introduced Spin-Transfer Torque Magnetic RAM (STT-MRAM) as the most promising replacement for SRAMs in on-chip cache memories. |
| Motion-to-Motion Latency Measurement Framework for Connected and Autonomous Vehicle Teleoperation | François Provost, Faisal Hawlader, Mehdi Testouri, Raphaël Frank | 2025-11-27 | Download | Latency is a key performance factor for the teleoperation of Connected and Autonomous Vehicles (CAVs). It affects how quickly an operator can perceive changes in the driving environment and apply corr... |
| What Is the Optimal Ranking Score Between Precision and Recall? We Can Always Find It and It Is Rarely $F_1$ | Sébastien Piérard, Adrien Deliège, Marc Van Droogenbroeck | 2025-11-27 | Download | Ranking methods or models based on their performance is of prime importance but is tricky because performance is fundamentally multidimensional. |
| Edge Deployment of Small Language Models, a comprehensive comparison of CPU, GPU and NPU backends | Pablo Prieto, Pablo Abad | 2025-11-27 | Download | Edge computing processes data where it is generated, enabling faster decisions, lower bandwidth usage, and improved privacy. However, edge devices typically operate under strict constraints on process... |
| When AI Bends Metal: AI-Assisted Optimization of Design Parameters in Sheet Metal Forming | Ahmad Tarraf, Koutaiba Kassem-Manthey, Seyed Ali Mohammadi, Philipp Martin, Lukas Moj, Semih Burak, Enju Park, Christian Terboven, Felix Wolf | 2025-11-27 | Download | Numerical simulations have revolutionized the industrial design process by reducing prototyping costs, design iterations, and enabling product engineers to explore the design space more efficiently. |
