Hongru (Merlin) Wang
Research Associate @ EdinburghNLP and EdinburghAI
I am currently a research associate (postdoc) at the University of Edinburgh, where I work closely with Prof. Amos Storkey and Prof. Jeff Z. Pan on theory of agent (mainly planning, memory, and self-evolution). I received my PhD degree from The Chinese University of Hong Kong under the supervision of Prof. Kam-Fai Wong (ACL Fellow). I spent a wonderful time at BlenderLab at the University of Illinois Urbana-Champaign during my PhD studies. I also work closely with Prof. Heng Ji, Prof. Irwin King and Prof. Mengdi Wang. Besides that, I am a co-founder and organizer of Nexus for IntelligeCE (NICE), which provides a platform to share and discuss recent progress in AI & NLP with our more than 150,000 followers online.
My research revolves around Theory of Agent (ToA), which unifies the internal reasoning and external acting of an agent (its two major behaviors) as two epistemically equivalent tools for modeling the internal world stored in the parametric space and the external physical world. My long-term objective is to achieve the impossible triangle of safety (env), personalization (user) and autonomy (agent), so that agents can learn from interactions internally or externally. For further information, please see my CV (last update: 2026.05.06).
Learning of Agent — under-thinking & under-acting
Research Questions
- What does an agent need to learn that can’t be compressed into parameters?
- Will over-delegation erode internal reasoning capability over time?
- How do reasoning, acting, environments, and time scale jointly?
Representative Works
- ToolRL: Reward is All Tool Learning Needs (NeurIPS 2025)
- From Word to World: Can Large Language Models be Implicit Text-based World Models? (ACL 2026)
- A Survey of Self-Evolving Agents (TMLR 2026)
Behavior of Agent — over-thinking & over-acting
Research Questions
- Why do agents fail to recognize their own knowledge boundary?
- When should an agent stop reasoning, stop acting, or stop both?
- What makes a behavior miscalibrated rather than simply wrong?
Representative Works
Evaluation of Agent — safety, personalization, reward modeling
Research Questions
- What does a correct answer hide about the process that produced it?
- How should a reward model reason, not just score?
- Can an agent be safe, personalized, and autonomous at once?
Representative Works
- SafeToolBench: Pioneering a Prospective Benchmark to Evaluating Tool Utilization Safety in LLMs (EMNLP Findings 2025)
- AppBench: Planning of Multiple APIs from Various APPs for Complex User Instruction (EMNLP 2024)
- ToolSpectrum: Towards Personalized Tool Utilization for Large Language Models (ACL Findings 2025)
- RM-R1: Reward Modeling as Reasoning (ICLR 2026)
Agent Applications
I will be on the job market starting in Aug 2026 and am open to both academic faculty positions and industrial research roles. If you believe I might be a good fit for your institution or organization, I’d love to connect!
news
| May 01, 2026 | We have 3 papers accepted by ICML 2026, including Theory of Agent, Search-R2, and HistBench. Congrats to all authors! As agents enter the second half, I believe this is not only an engineering challenge, but also a scientific journey toward understanding intelligence itself. Feeling grateful, happy, and energized for the road ahead. |
|---|---|
| Apr 08, 2026 | We have 8 papers accepted by ACL 2026: 7 Main and 1 Findings, including 3 corresponding-author and 1 first-author papers. More details can be found here. Congrats to all co-authors! |
| Mar 20, 2026 | Our initial work on efficient reasoning, AdaCtrl, has been accepted by TMLR 2026. Congrats to all co-authors! |
| Jan 25, 2026 | We have RM-R1 and PAPO accepted by ICLR 2026. Congratulations to all co-authors! It has been a truly memorable time at UIUC. |
| Dec 30, 2025 | We have two surveys, The Landscape of Agentic Reinforcement Learning for LLMs: A Survey and A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence, accepted by TMLR 2026. Congratulations to all co-authors! This is my first time leading such a large collaboration involving researchers from around the world. |
selected preprints
selected publications
- Transactions on Machine Learning Research, 2026
- In ACL Findings, 2025
- In SIGIR, Washington DC, USA, 2024
- In EMNLP, 2024
- In NeurIPS, 2024
- In EMNLP Findings, 2023
- In EMNLP Findings, 2023 (Best Paper)