| Jan 25, 2026 | We have RM-R1 and PAPO accepted by ICLR 2026. Congratulations to all co-authors! It has been a truly memorable time at UIUC. |
| Dec 30, 2025 | We have two surveys accepted by TMLR 2026: The Landscape of Agentic Reinforcement Learning for LLMs: A Survey, and A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence. Congratulations to all co-authors! This is my first time leading such a large collaboration involving researchers from around the world. |
| Dec 15, 2025 | We will organize the first workshop on lifelong agents at ICLR 2026, with amazing invited speakers and lots of paper awards. Welcome to submit your work! More details can be found on the official website: Lifelong Agent. |
| Sep 20, 2025 | We have ToolRL accepted by NeurIPS 2025, and the OTC-PO and Physics Supernova papers accepted by LAW Workshop@NeurIPS 2025. Physics Supernova was also selected as an oral at LLM Evaluation Workshop@NeurIPS 2025. |
| Aug 20, 2025 | We have 4 papers accepted by EMNLP 2025: 1 Main and 3 Findings, including 2 co-first-author papers: DecisionFlow and SafeToolBench. This is the first time that we have achieved 100% acceptance across all ARR submissions. Congrats to all authors! |
| May 25, 2025 | We have 9 papers accepted by ACL 2025: 3 Main and 6 Findings, including 1 corresponding-author paper, 2 first-author papers, and 3 (co-)first-author papers. |
| Apr 22, 2025 | We are so excited to introduce OTC-PO and ToolRL. We believe OTC-PO will be a foundation for agentic RL, much as ReAct is for agents. |
| Jan 20, 2025 | We have 3 papers accepted by NAACL 2025, including one first-author work: Self-DC, which empowers language agents to decide when to rely on internal knowledge and when to call external tools. |
| Dec 30, 2024 | I started my visit to the BLENDER Lab at the University of Illinois Urbana-Champaign, hosted by Prof. Heng Ji. It is also the final period of my Ph.D. study. |
| Sep 30, 2024 | We have 1 paper accepted by NeurIPS 2024 and 1 paper accepted by MINT Workshop@NeurIPS 2024, on knowledge conflicts and process reward models. Congratulations to all co-authors. This is my first paper at a top ML conference. |