Hongru (Merlin) Wang
Research Associate @ EdinburghNLP and EdinburghAI
I am currently a research associate (postdoc) at the University of Edinburgh, working closely with Prof. Amos Storkey and Prof. Jeff Z. Pan on the theory of agents (mainly planning, memory, and self-evolution) and world-model-based RL.
I received my PhD from The Chinese University of Hong Kong under the supervision of Prof. Kam-Fai Wong (ACL Fellow). During my PhD studies, I spent a wonderful time at EdinburghNLP at the University of Edinburgh and at BlenderLab at the University of Illinois Urbana-Champaign. I work closely with Prof. Jeff Z. Pan, Prof. Heng Ji, and Prof. Mengdi Wang. I am a co-founder and organizer of the NLP Academic Exchange Platform (NICE), which provides a platform to share and discuss recent progress in AI & NLP.
My research revolves around the Theory of Agent (ToA), which unifies an agent's internal reasoning and external acting (its two major behaviors) as two epistemically equivalent tools for modeling the internal world stored in parametric space and the external physical world. Whereas Theory of Mind (ToM) refers to the ability to attribute mental states (e.g., beliefs, intentions, knowledge) to oneself and others, enabling the prediction and interpretation of behavior, ToA characterizes an agent’s capacity to model not only external environments but also its own internal knowledge state in order to make decisions and achieve its goals. My long-term objective is to achieve the impossible triangle of safety, personalization, and autonomy for language agents that learn from interactions, whether internal or external. For further information, please see my CV (last update: 2026.01.30).
Mentorship: I have very close connections with CUHK, UoE, UIUC, and Princeton. If you like my research or would like to cooperate or visit, you can contact me directly via X or WeChat. My mentees always publish better papers than I do :)
I will be on the job market starting in Mar 2026 and am open to both academic faculty positions and industrial research roles. If you believe I might be a good fit for your institution or organization, I’d love to connect!
news
| Jan 25, 2026 | We have RM-R1 and PAPO accepted by ICLR 2026. Congratulations to all co-authors! It has been a truly memorable time at UIUC. |
|---|---|
| Dec 30, 2025 | We have two surveys accepted by TMLR 2026: The Landscape of Agentic Reinforcement Learning for LLMs: A Survey, and A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence. Congratulations to all co-authors! This is my first time leading such a large collaboration involving researchers from around the world. |
| Dec 15, 2025 | We will organize the first workshop on lifelong agents at ICLR 2026, with amazing invited speakers and many paper awards. You are welcome to submit your work! More details can be found on the official website: Lifelong Agent. |
| Sep 20, 2025 | We have ToolRL accepted by NeurIPS 2025, and the OTC-PO and Physics Supernova papers accepted by the LAW Workshop@NeurIPS 2025. Physics Supernova was also selected as an oral at the LLM Evaluation Workshop@NeurIPS 2025. |
| Aug 20, 2025 | We have 4 papers accepted by EMNLP 2025: 1 Main and 3 Findings, including 2 co-first-author papers: DecisionFlow and SafeToolBench. This is the first time we have achieved 100% acceptance across all our ARR submissions. Congrats to all authors! |
selected preprints
- Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution. arXiv, 2025
- Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models. arXiv, 2025
selected publications
- UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models. In ACL, 2025
- Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst. In ACL Findings, 2025
- Rethinking Stateful Tool Use in Multi-Turn Dialogues: Benchmarks and Challenges. In ACL Findings, 2025
- Large Language Models as Source Planner for Personalized Knowledge-grounded Dialogues. In EMNLP Findings, 2023 (NILLI Best Paper @ IDF)