About me

Welcome to my homepage! I am Weihang Su (苏炜航), a third-year PhD student at the Department of Computer Science and Technology, Tsinghua University, under the supervision of Prof. Yiqun Liu.

My research focuses on leveraging AI technology (LLMs) to better meet users’ information needs, specifically in the following areas:

Retrieval Augmented Generation
Knowledge Injection and Editing for LLMs
Hallucination Detection and Mitigation for LLMs
AI for Legal Applications

I am currently exploring an interesting direction: Parametric Retrieval-Augmented Generation (Parametric RAG; Paper, Code). We propose a new RAG paradigm that injects external knowledge directly into the parameters of large language models (LLMs), rather than relying on traditional in-context knowledge injection, which appends retrieved documents to the LLM’s input. By parameterizing documents and integrating them into the model at inference time, Parametric RAG improves both the overall performance and the online efficiency of the RAG system while maintaining flexibility.

I am also passionate about mentoring undergraduate students in research. I’ve collaborated with undergraduates including Changyue Wang, Yichen Tang, Anzhe Xie, Baoqing Yue, Jiaqing Wu, and Junxi Yan, co-authoring high-quality papers at top-tier conferences and journals such as ACL, SIGIR, EMNLP, TOIS, and SIGIR-AP.

If you are an undergraduate interested in my research areas and aim to publish high-quality papers, you can apply for an internship with the THUIR group through official channels or contact me (WeChat ID: rdfzswh) directly to embark on meaningful research together!

News

🎉 Our SIGIR 2025 tutorial on Dynamic and Parametric RAG has been accepted! Join us in 📍 Padua on July 13 to explore the next generation of Retrieval-Augmented Generation! More information at: https://sites.google.com/view/sigir2025-tutorial-dprag/home-page

Our paper DecKER has been accepted at ACL 2025! Congratulations to Changyue!
My first-authored Tutorial on RAG has been accepted at SIGIR 2025!
Our Paper RbFT has been accepted at SIGIR 2025! Congratulations to Yiteng!
My first-authored Long Paper JuDGE has been accepted at SIGIR 2025!
My first-authored Long Paper Parametric RAG has been accepted at SIGIR 2025!
My first-authored Long Paper Caseformer has been accepted at TOIS 2025!
My first-authored Long Paper DRAD has been accepted at SIGIR-AP 2024!
My first-authored Long Paper STARD has been accepted at EMNLP 2024!
Our paper “Scaling Laws for Dense Retrieval” received the SIGIR Best Paper Award!
My first-authored paper “DRAGIN” has been selected for an Oral presentation at ACL! (Top 2.6% of submissions, top 6.8% of accepted papers)
My first-authored Long Paper DRAGIN has been accepted at ACL 2024 Main Conference!
My first-authored Long Paper MIND has been accepted at Findings of ACL 2024!
My first-authored Long Paper Wikiformer has been accepted at AAAI 2024!
Our team participated in COLIEE 2023 and won the championship! Here is the link to our Technical Report: https://arxiv.org/abs/2304.12650
We participated in the WSDM Cup 2023 and won silver medals in two tasks! Here is the news report: https://www.cs.tsinghua.edu.cn/info/1088/5286.htm

Selected Awards

ICLR Notable Reviewer
China Association for Science and Technology’s Young Talents Project, Ph.D. Program (中国科协青年人才托举工程)
National Scholarship (国家奖学金), Top 4 in the Department of CST, Tsinghua University, 2024
SIGIR 2024 Best Paper Award
ACL 2024 Oral
Winner of the Language and Intelligence Challenge (LIC) Contest (one of 4 winners worldwide; ranked in the top 0.3% of all teams)
Beijing Outstanding Undergraduate (2022)

Publications

Link to Google Scholar

The titles of my first-author papers are in bold (excluding co-first-author papers where I am not listed first).

Papers Under Submission

Efficient Parametric Knowledge Injection On-the-Fly for Dynamic Retrieval Augmented Generation
Weihang Su, Baoqing Yue, Qingyao Ai, Yichen Tang, Changyue Wang, Jiacheng Kang, Jingtao Zhan, Lin Fen, Liu Qin, Yiqun Liu
(Long Paper)
Plug-in Parameter Generation for Test-Time Parametric Knowledge Injection
Weihang Su, Jiaqing Wu, Qingyao Ai, Hanwen Zhang, Jiaxin Mao, Yiqun Liu
(Long Paper)
Benchmarking Computer Science Survey Generation
Weihang Su, Anzhe Xie, Qingyao Ai, Jianming Long, Jiaxin Mao, Yiqun Liu
(Long Paper) Code and Dataset
Joint Evaluation of Answer and Reasoning Consistency for Hallucination Detection in Large Reasoning Models
Changyue Wang, Weihang Su, Qingyao Ai, Yiqun Liu
(Long Paper) Paper Code
Augmenting Multi-Agent Communication with State Delta Trajectory
Yichen Tang, Weihang Su, Yujia Zhou, Yiqun Liu, Min Zhang, Shaoping Ma, Qingyao Ai
(Long Paper) Code
Knowledge Editing through Chain-of-Thought
Changyue Wang, Weihang Su, Qingyao Ai, Yiqun Liu
(Long Paper) Paper Code
Equity vs. Equality: Optimizing Ranking Fairness for Tailored Provider Needs
Yiteng Tu, Weihang Su, Shuguang Han, Yiqun Liu, Min Zhang, Shaoping Ma, Qingyao Ai
(Long Paper)

Year 2025

Decoupling Reasoning and Knowledge Injection for In-Context Knowledge Editing
Changyue Wang, Weihang Su, Qingyao Ai, Yujia Zhou, Yiqun Liu
ACL 2025 Findings (Long, CCF-A, THU-A) Paper Code
Dynamic and Parametric Retrieval-Augmented Generation
Weihang Su, Qingyao Ai, Jingtao Zhan, Qian Dong, Yiqun Liu
SIGIR 2025 (Tutorial, CCF-A, THU-A) Official Website Tutorial Proposal Paper
Parametric Retrieval Augmented Generation
Weihang Su, Yichen Tang, Qingyao Ai, Junxi Yan, Changyue Wang, Hongning Wang, Ziyi Ye, Yujia Zhou, Yiqun Liu
SIGIR 2025 (Long Paper, CCF-A, THU-A) Paper Code
JuDGE: Benchmarking Judgment Document Generation for Chinese Legal System
Weihang Su, Baoqing Yue, Qingyao Ai, Yiran Hu, Jiaqi Li, Changyue Wang, Kaiyuan Zhang, Yueyue Wu, Yiqun Liu
SIGIR 2025 (Long Paper, CCF-A, THU-A) Paper Code and Dataset
RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects
Yiteng Tu, Weihang Su, Yujia Zhou, Yiqun Liu, Qingyao Ai
SIGIR 2025 (Long Paper, CCF-A, THU-A) Paper Code
Caseformer: Pre-training for Legal Case Retrieval Based on Inter-Case Distinctions.
Weihang Su, Qingyao Ai, Yueyue Wu, Anzhe Xie, Changyue Wang, Yixiao Ma, Haitao Li, Zhijing Wu, Yiqun Liu, Min Zhang.
ACM Transactions on Information Systems
TOIS 2025 (Long Paper, CCF-A, THU-A)
DecoupledRAG: An Efficient and Effective Retrieval Augmented Generation Framework via Cross Attention.
Qian Dong, Qingyao Ai, Hongning Wang, Yiding Liu, Haitao Li, Weihang Su, Yiqun Liu, Tat-Seng Chua, Shaoping Ma.
The ACM Web Conference 2025
WWW 2025 (Long Paper, CCF-A, THU-A)

Year 2024

Mitigating Entity-Level Hallucinations in Large Language Models
Weihang Su, Yichen Tang, Qingyao Ai, Zhijing Wu, Yiqun Liu
International ACM SIGIR Conference on Information Retrieval in the Asia Pacific
SIGIR-AP 2024 (Long Paper) Paper Code
LeKUBE: A Legal Knowledge Update BEnchmark
Changyue Wang, Weihang Su, Yiran Hu, Qingyao Ai, Yueyue Wu, Cheng Luo, Yiqun Liu, Min Zhang, Shaoping Ma
International ACM SIGIR Conference on Information Retrieval in the Asia Pacific
SIGIR-AP 2024 (Long Paper) Paper Code
STARD: A Chinese Statute Retrieval Dataset with Real Queries Issued by Non-professionals
Weihang Su, Yiran Hu, Anzhe Xie, Qingyao Ai, Zibing Que, Yun Liu, Weixing Shen, Yiqun Liu
The 2024 Conference on Empirical Methods in Natural Language Processing
EMNLP 2024 Findings (Long Paper, CCF-B, THU-A) Paper Code
DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language Models
Weihang Su, Yichen Tang, Qingyao Ai, Zhijing Wu, Yiqun Liu.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics.
ACL 2024 Main Oral (Long Paper, CCF-A, THU-A)
Paper Code
Unsupervised Real-Time Hallucination Detection Based on the Internal States of Large Language Models
Weihang Su, Changyue Wang, Qingyao Ai, Yiran Hu, Zhijing Wu, Yujia Zhou, Yiqun Liu.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics.
ACL 2024 Findings (Long Paper, CCF-A, THU-A)
Paper Code
Scaling Laws For Dense Retrieval.
Yan Fang, Jingtao Zhan, Qingyao Ai, Jiaxin Mao, Weihang Su, Jia Chen and Yiqun Liu.
The 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
SIGIR 2024 Best Paper Award (Long Paper, CCF-A, THU-A)
Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-hoc Retrieval.
Weihang Su, Qingyao Ai, Xiangsheng Li, Jia Chen, Yiqun Liu, Xiaolong Wu and Shengluan Hou.
The 38th Annual AAAI Conference on Artificial Intelligence
AAAI 2024 (Long Paper, CCF-A, THU-A)
Paper Code
Relevance Feedback with Brain Signals.
Ziyi Ye, Xiaohui Xie, Qingyao Ai, Yiqun Liu, Zhihong Wang, Weihang Su and Min Zhang.
ACM Transactions on Information Systems
TOIS 2024 (Long Paper, CCF-A, THU-A)

Before 2024

CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case Encoding.
Yixiao Ma, Yueyue Wu, Weihang Su, Qingyao Ai, Yiqun Liu.
The 2023 Conference on Empirical Methods in Natural Language Processing
EMNLP 2023 Main (Long Paper, CCF-B, THU-A)
THUIR2 at NTCIR-16 Session Search (SS) Task
Weihang Su, Xiangsheng Li, Yiqun Liu, Min Zhang, Shaoping Ma
NII Testbeds and Community for Information access Research Project
NTCIR 2022
Web Search via an Efficient and Effective Brain-Machine Interface.
Xuesong Chen, Ziyi Ye, Xiaohui Xie, Yiqun Liu, Xiaorong Gao, Weihang Su, Shuqi Zhu, Yike Sun, Min Zhang, and Shaoping Ma.
The 15th ACM International Conference on Web Search and Data Mining.
WSDM 2022 (Demo Paper, CCF-B, THU-A)
Trade or Trick? Detecting and Characterizing Scam Tokens on Uniswap Decentralized Exchange
Pengcheng Xia, Haoyu Wang, Bingyu Gao, Weihang Su, Zhou Yu, Xiapu Luo, Chao Zhang, Xusheng Xiao, Guoai Xu
International Conference on Measurement and Modeling of Computer Systems
SIGMETRICS 2022 (Long Paper, CCF-B, THU-A)