2026-01-31
cs.AR - Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ENFOR-SA: End-to-end Cross-layer Transient Fault Injector for Efficient and Accurate DNN Reliability Assessment on Systolic Arrays | Rafael Billig Tonetto, Marcello Traiola, Fernando Fernandes dos Santos, Angeliki Kritikakou | 2026-01-31 | Download | Recent advances in deep learning have produced highly accurate but increasingly large and complex DNNs, making traditional fault-injection techniques impractical. |
| Exploration of Unary Arithmetic-Based Matrix Multiply Units for Low Precision DL Accelerators | Prabhu Vellaisamy, Harideep Nair, Di Wu, Shawn Blanton, John Paul Shen | 2026-01-31 | Download | General matrix multiplication (GEMM) is a fundamental operation in deep learning (DL). With DL moving increasingly toward low precision, recent works have proposed novel unary GEMM designs as an alter... |
| AutoGNN: End-to-End Hardware-Driven Graph Preprocessing for Enhanced GNN Performance | Seungkwan Kang, Seungjun Lee, Donghyun Gouk, Miryeong Kwon, Hyunkyu Choi, Junhyeok Jang, Sangwon Lee, Huiwon Choi, Jie Zhang, Wonil Choi, Mahmut Taylan Kandemir, Myoungsoo Jung | 2026-01-31 | Download | Graph neural network (GNN) inference faces significant bottlenecks in preprocessing, which often dominate overall inference latency. We introduce AutoGNN, an FPGA-based accelerator designed to address... |
| HyperOffload: Graph-Driven Hierarchical Memory Management for Large Language Models on SuperNode Architectures | Fangxin Liu, Qinghua Zhang, Hanjing Shen, Zhibo Liang, Li Jiang, Haibing Guan, Chong Bao, Xuefeng Jin | 2026-01-31 | Download | The rapid evolution of Large Language Models (LLMs) towards long-context reasoning and sparse architectures has pushed memory requirements far beyond the capacity of individual device HBM. |
cs.DC - Distributed, Parallel, and Cluster Computing
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| Fast Sparse Matrix Permutation for Mesh-Based Direct Solvers | Behrooz Zarebavami, Ahmed H. Mahmoud, Ana Dodik, Changcheng Yuan, Serban D. Porumbescu, John D. Owens, Maryam Mehri Dehnavi, Justin Solomon | 2026-01-31 | Download | We present a fast sparse matrix permutation algorithm tailored to linear systems arising from triangle meshes. Our approach produces nested-dissection-style permutations while significantly reducing p... |
| System-Level Performance Modeling of Photonic In-Memory Computing | Jebacyril Arockiaraj, Sasindu Wijeratne, Sugeet Sunder, Md Abdullah-Al Kaiser, Akhilesh Jaiswal, Ajey P. Jacob, Viktor Prasanna | 2026-01-31 | Download | Photonic in-memory computing is a high-speed, low-energy alternative to traditional transistor-based digital computing that utilizes high photonic operating frequencies and bandwidths. |
| HyperOffload: Graph-Driven Hierarchical Memory Management for Large Language Models on SuperNode Architectures | Fangxin Liu, Qinghua Zhang, Hanjing Shen, Zhibo Liang, Li Jiang, Haibing Guan, Chong Bao, Xuefeng Jin | 2026-01-31 | Download | The rapid evolution of Large Language Models (LLMs) towards long-context reasoning and sparse architectures has pushed memory requirements far beyond the capacity of individual device HBM. |
| Forecasting Energy Availability in Local Energy Communities via LSTM Federated Learning | Fabio Turazza, Marcello Pietri, Natalia Selini Hadjidimitriou, Marco Mamei | 2026-01-31 | Download | Local Energy Communities are emerging as crucial players in the landscape of sustainable development. A significant challenge for these communities is achieving self-sufficiency through effective mana... |
| PROBE: Co-Balancing Computation and Communication in MoE Inference via Real-Time Predictive Prefetching | Qianchao Zhu, Xucheng Ye, Yuliang Liu, Haodong Ouyang, Chengru Song | 2026-01-31 | Download | Mixture-of-Experts models have become a dominant architecture for scaling Large Language Models by activating only a sparse subset of experts per token. |
| FedMOA: Federated GRPO for Personalized Reasoning LLMs under Heterogeneous Rewards | Ziyao Wang, Daeun Jung, Yexiao He, Guoheng Sun, Zheyu Shen, Myungjin Lee, Ang Li | 2026-01-31 | Download | Group Relative Policy Optimization (GRPO) has recently emerged as an effective approach for improving the reasoning capabilities of large language models through online multi-objective reinforcement l... |
| Stabilizing Decentralized Federated Fine-Tuning via Topology-Aware Alternating LoRA | Xiaoyu Wang, Xiaotian Li, Zhixiang Zhou, Chen Li, Yong Liu | 2026-01-31 | Download | Decentralized federated learning (DFL), a serverless variant of federated learning, poses unique challenges for parameter-efficient fine-tuning due to the factorized structure of low-rank adaptation (... |
cs.NI - Networking and Internet Architecture
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| LMTE: Putting the "Reasoning" into WAN Traffic Engineering with Language Models | Xinyu Yuan, Yan Qiao, Zonghui Wang, Meng Li, Wenzhi Chen | 2026-01-31 | Download | The rapid expansion of modern wide-area networks (WANs) has made traffic engineering (TE) increasingly challenging, as traditional solvers struggle to keep pace. |
| The Syntactic-Semantic Internet: Engineering Infrastructures for Autonomous Systems | Mallik Tatipamula, Xuesong Liu, Yao Sun, Muhammad Ali Imran | 2026-01-31 | Download | The Internet has evolved through successive architectural abstractions that enabled unprecedented scale, interoperability, and innovation. Packet-based networking enabled the reliable transport of bit... |
| How segmented is my network? | Rohit Dube | 2026-01-31 | Download | Network segmentation is a popular security practice for limiting lateral movement, yet practitioners lack a metric to measure how segmented a network actually is. |
| NetWorld: Communication-Based Diffusion World Model for Multi-Agent Reinforcement Learning in Wireless Networks | Kechen Meng, Rongpeng Li, Yansha Deng, Zhifeng Zhao, Honggang Zhang | 2026-01-31 | Download | As wireless communication networks grow in scale and complexity, diverse resource allocation tasks become increasingly critical. Multi-Agent Reinforcement Learning (MARL) provides a promising solution... |
cs.OS - Operating Systems
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| ProphetKV: User-Query-Driven Selective Recomputation for Efficient KV Cache Reuse in Retrieval-Augmented Generation | Shihao Wang, Jiahao Chen, Yanqi Pan, Hao Huang, Yichen Hao, Xiangyu Zou, Wen Xia, Wentao Zhang, Chongyang Qiu, Pengfei Wang | 2026-01-31 | Download | The prefill stage of long-context Retrieval-Augmented Generation (RAG) is severely bottlenecked by computational overhead. To mitigate this, recent methods assemble pre-calculated KV caches of retriev... |
cs.PF - Performance
| Title | Authors | Published | Download | Abstract |
|---|---|---|---|---|
| WritePolicyBench: Benchmarking Memory Write Policies under Byte Budgets | Edgard El Cham | 2026-01-31 | Download | We introduce WritePolicyBench, a benchmark for evaluating memory write policies: decision rules that choose what to store, merge, and evict under a strict byte budget while processing a stream with do... |