Hongru (Merlin) Wang


I am currently a final-year Ph.D. candidate under the supervision of Prof. Kam-Fai Wong in the Department of Systems Engineering and Engineering Management at The Chinese University of Hong Kong.

I received my Bachelor’s and Master’s degrees from Communication University of China and The Chinese University of Hong Kong, respectively. During my Ph.D. study, I spent a wonderful time at EdinburghNLP at the University of Edinburgh and at the BLENDER Lab at the University of Illinois Urbana-Champaign, and I work closely with Prof. Jeff Z. Pan, Prof. Heng Ji, and Prof. Mengdi Wang. I am a co-founder and organizer of the NLP Academic Exchange Platform (NICE), which provides a platform to share and discuss recent progress in AI & NLP.

My research revolves around the Theory of Agent (ToA), which unifies an agent’s internal reasoning and external acting (its two major behaviors) as two epistemically equivalent tools for modeling the internal world stored in the parametric space and the external physical world. Whereas Theory of Mind (ToM) refers to the ability to attribute mental states (e.g., beliefs, intentions, knowledge) to oneself and others, enabling the prediction and interpretation of behavior, ToA characterizes an agent’s capacity to model not only external environments but also its own internal knowledge state in order to make decisions and complete its goal. My long-term objective is to achieve the impossible triangle of safety, personalization, and autonomy for language agents that learn from interactions, whether internal or external. For further information, please see my CV (last update: 2025.05.30).

I will be on the job market starting in Fall 2025 and am open to both academic faculty positions and industrial research roles. If you believe I might be a good fit for your institution or organization, I’d love to connect!

news

May 25, 2025 We have 9 papers accepted at ACL 2025 (3 Main and 6 Findings), including 1 corresponding-author paper, 2 first-author papers, and 3 (co-)first-author papers.
Apr 22, 2025 We are excited to introduce OTC-PO and ToolRL. We believe OTC-PO will be a foundation for agentic RL, much as ReAct is for agents.
Jan 20, 2025 We have 3 papers accepted at NAACL 2025, including one first-author work, Self-DC, which empowers language agents to decide when to rely on internal knowledge and when to call external tools.
Dec 30, 2024 Started my visit to the BLENDER Lab at the University of Illinois Urbana-Champaign, hosted by Prof. Heng Ji. It is also the final period of my Ph.D. study.
Sep 30, 2024 We have 1 paper accepted at NeurIPS 2024 and 1 paper accepted at the MINT Workshop @ NeurIPS 2024, on knowledge conflicts and process reward models. Congratulations to all co-authors. This is my first paper at a top ML conference.

selected preprints

  1. Arxiv
    Acting Less is Reasoning More! Teaching Model to Act Efficiently
    Hongru Wang, Cheng Qian, Wanjun Zhong, Xiusi Chen, Jiahao Qiu, Shijue Huang, Bowen Jin, Mengdi Wang, Kam-Fai Wong, and Heng Ji
    2025
  2. Arxiv
    Toward a Theory of Agents as Tool-Use Decision-Makers
    Hongru Wang, Cheng Qian, Manling Li, Jiahao Qiu, Boyang Xue, Mengdi Wang, Heng Ji, and Kam-Fai Wong
    2025
  3. Arxiv
    Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution
    Jiahao Qiu, Xuan Qi, Tongcheng Zhang, Xinzhe Juan, Jiacheng Guo, Yifu Lu, Yimin Wang, Zixin Yao, Qihan Ren, Xun Jiang, Xing Zhou, Dongrui Liu, Ling Yang, Yue Wu, Kaixuan Huang, Shilong Liu, Hongru Wang, and Mengdi Wang
    2025
  4. Arxiv
    RM-R1: Reward Modeling as Reasoning
    Xiusi Chen, Gaotang Li, Ziqi Wang, Bowen Jin, Cheng Qian, Yu Wang, Hongru Wang, Yu Zhang, Denghui Zhang, Tong Zhang, Hanghang Tong, and Heng Ji
    2025
  5. Arxiv
    ToolRL: Reward is All Tool Learning Needs
    Cheng Qian, Emre Can Acikgoz, Qi He, Hongru Wang, Xiusi Chen, Dilek Hakkani-Tür, Gokhan Tur, and Heng Ji
    2025
  6. Arxiv
    AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting
    Shijue Huang*, Hongru Wang*, Wanjun Zhong, Zhaochen Su, Jiazhan Feng, Bowen Cao, and Yi R. Fung
    2025
  7. Arxiv
    Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models
    Rui Wang*, Hongru Wang*, Boyang Xue*, Jianhui Pang, Shudong Liu, Yi Chen, Jiahao Qiu, Derek Fai Wong, Heng Ji, and Kam-Fai Wong
    2025

selected publications

  1. UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models
    Boyang Xue, Fei Mi, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Erxin Yu, Xuming Hu, and Kam-Fai Wong
    In ACL, 2025
  2. Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst
    Hongru Wang, Deng Cai, Wanjun Zhong, Shijue Huang, Jeff Z. Pan, Zeming Liu, and Kam-Fai Wong
    In ACL Findings, 2025
  3. Rethinking Stateful Tool Use in Multi-Turn Dialogues: Benchmarks and Challenges
    Hongru Wang, Wenyu Huang, Yufei Wang, Yuanhao Xi, Jianqiao Lu, Huan Zhang, Nan Hu, Zeming Liu, Jeff Z. Pan, and Kam-Fai Wong
    In ACL Findings, 2025
  4. Oral
    Self-DC: When to Reason and When to Act? Self Divide-and-Conquer for Compositional Unknown Questions
    Hongru Wang, Boyang Xue, Baohang Zhou, Tianhua Zhang, Cunxiang Wang, Huimin Wang, Guanhua Chen, and Kam-Fai Wong
    In NAACL, 2025
  5. Oral
    Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
    Yu Zhao, Alessio Devoto, Giwon Hong, Xiaotang Du, Aryo Pradipta Gema, Hongru Wang, Xuanli He, Kam-Fai Wong, and Pasquale Minervini
    In NAACL, 2025
  6. Tutorial
    Empowering Large Language Models: Tool Learning for Real-World Interaction
    Hongru Wang, Yujia Qin, Yankai Lin, Jeff Z. Pan, and Kam-Fai Wong
    In SIGIR, Washington DC, USA, 2024
  7. Knowledge Conflicts for LLMs: A Survey
    Rongwu Xu, Zehan Qi, Zhijiang Guo, Cunxiang Wang, Hongru Wang, Yue Zhang, and Wei Xu
    In EMNLP, 2024
  8. AutoPSV: Automated Process-Supervised Verifier
    Jianqiao Lu, Zhiyang Dou, Hongru Wang, Zeyu Cao, Jianbo Dai, Yunlong Feng, and Zhijiang Guo
    In NeurIPS, 2024
  9. AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction
    Hongru Wang, Rui Wang, Boyang Xue, Heming Xia, Jingtao Cao, Zeming Liu, Jeff Z. Pan, and Kam-Fai Wong
    In EMNLP, 2024
  10. Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs
    Hongru Wang, Rui Wang, Fei Mi, Yang Deng, Zezhong Wang, Bin Liang, Ruifeng Xu, and Kam-Fai Wong
    In EMNLP Findings, 2023
  11. NILLI Best Paper @ IDF
    Large Language Models as Source Planner for Personalized Knowledge-grounded Dialogues
    Hongru Wang, Minda Hu, Yang Deng, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Wai-Chung Kwan, Irwin King, and Kam-Fai Wong
    In EMNLP Findings, 2023