Zongzhang Zhang @ NJU-AI

Zongzhang Zhang

Ph.D., Associate Professor
LAMDA Group
School of Artificial Intelligence
National Key Laboratory for Novel Software Technology
Nanjing University, P. R. China

Office: Room A503, Yi Fu Building, Xianlin Campus
Email: zzzhang@nju.edu.cn, zhangzongzhang@gmail.com

Short Bio

I am currently an Associate Professor at the School of Artificial Intelligence, Nanjing University, and a member of the LAMDA group led by Prof. Zhi-Hua Zhou. From July 2014 to June 2019, I served as an Associate Professor at the School of Computer Science and Technology, Soochow University. I received my bachelor's degree in mathematics from Central South University in 2007 and my Ph.D. degree in computer science from the University of Science and Technology of China in 2012, under the supervision of Prof. Xiaoping Chen.

My research experience includes appointments as a Visiting Scholar at the Stanford Intelligent Systems Laboratory (SISL) with Prof. Mykel J. Kochenderfer (Sept. 2018 – Mar. 2019), and as a Research Fellow at the School of Computing, National University of Singapore (Nov. 2012 – Jun. 2014), working with Prof. David Hsu and Prof. Wee Sun Lee. Earlier, I was a Visiting Student at the Rutgers Laboratory for Real-Life Reinforcement Learning (RL³) directed by Prof. Michael L. Littman (Oct. 2010 – Oct. 2011). I also briefly worked as a Research Engineer at Huawei's Noah's Ark Lab in 2012.

[Curriculum Vitae] [CV in Chinese]

Research Interests

My research interests lie mainly in artificial intelligence and machine learning. I am currently working on:

Reinforcement Learning (RL), including deep RL, transfer RL, data-driven RL, visual RL, safe RL, and RL for large models
Multi-agent systems, e.g., multi-agent RL, multi-agent communication, and multi-agent coordination
Probabilistic planning, particularly in partially observable Markov decision processes
Imitation Learning (IL), including IL via generative models, adversarial IL, non-adversarial IL, and multi-agent IL

Selected Publications

(* indicates corresponding author)

Generalizable Multi-modal Adversarial Imitation Learning for Non-stationary Dynamics [Paper]
Yi-Chen Li, Ningjing Chao, Zongzhang Zhang*, Fuxiang Zhang, Lei Yuan, and Yang Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, 47(7): 5600-5612.
Learning to Coordinate with Different Teammates via Team Probing [Paper]
Hao Ding, Chengxing Jia, Zongzhang Zhang*, Cong Guan, Feng Chen, Lei Yuan, and Yang Yu
IEEE Transactions on Neural Networks and Learning Systems, 2025, 36(9): 15807-15821.
Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models [Online]
Fuxiang Zhang, Junyou Li, Yi-Chen Li, Zongzhang Zhang*, Yang Yu, and Deheng Ye*
IEEE Transactions on Neural Networks and Learning Systems, 2025, 08: 1-12.
Efficient Multi-Agent Cooperation Learning through Teammate Lookahead [Paper]
Feng Chen, Xinwei Chen, Rong-Jun Qin, Cong Guan, Lei Yuan, Zongzhang Zhang*, and Yang Yu
Transactions on Machine Learning Research, 2025, 03: 1-27.
Behavior-Regularized Diffusion Policy Optimization for Offline Reinforcement Learning [Paper] [Code] [Project Page]
Chen-Xiao Gao, Chenyang Wu, Mingjun Cao, Chenjun Xiao, Yang Yu, and Zongzhang Zhang*
In: Proceedings of the 42nd International Conference on Machine Learning (ICML-2025), Vancouver, Canada, 2025.
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation [Paper] [Code]
Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu, Lei Yuan, Chengxing Jia, Zongzhang Zhang*, Yang Yu, and Bo An
In: Proceedings of the 13th International Conference on Learning Representations (ICLR-2025), Singapore, 2025.
Multi-Agent Domain Calibration with a Handful of Offline Data [Paper] [Code]
Tao Jiang, Lei Yuan, Lihe Li, Cong Guan, Zongzhang Zhang*, and Yang Yu
In: Advances in Neural Information Processing Systems 37 (NeurIPS-2024), pages 69607-69636, Vancouver, Canada, 2024.
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics [Paper] [Code]
Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang*, and Yang Yu
In: Proceedings of the 41st International Conference on Machine Learning (ICML-2024), pages 59741-59758, Vienna, Austria, 2024.
Efficient and Stable Offline-to-online Reinforcement Learning via Continual Policy Revitalization [Paper] [Appendix] [Code]
Rui Kong, Chenyang Wu, Chen-Xiao Gao, Zongzhang Zhang*, and Ming Li
In: Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI-2024), pages 4317-4325, Jeju Island, South Korea, 2024.
Focus-Then-Decide: Segmentation-Assisted Reinforcement Learning [Paper] [Appendix] [Code] [Project Page]
Chao Chen, Jiacheng Xu, Weijian Liao, Hao Ding, Zongzhang Zhang*, Yang Yu, and Rui Zhao
In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI-2024), pages 11240–11248, Vancouver, Canada, 2024.
ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning [Paper] [Appendix] [Code]
Chen-Xiao Gao, Chenyang Wu, Mingjun Cao, Rui Kong, Zongzhang Zhang*, and Yang Yu
In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI-2024), pages 12127–12135, Vancouver, Canada, 2024.
Generalizable Task Representation Learning for Offline Meta-Reinforcement Learning with Data Limitations [Paper] [Appendix] [Code]
Renzhe Zhou, Chen-Xiao Gao, Zongzhang Zhang*, and Yang Yu
In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI-2024), pages 17132-17140, Vancouver, Canada, 2024.
Deep Anomaly Detection via Active Anomaly Search [Paper] [Appendix] [Code]
Chao Chen, Dawei Wang, Feng Mao, Jiacheng Xu, Zongzhang Zhang*, and Yang Yu
In: Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS-2024), pages 308–316, Auckland, New Zealand, 2024.
Surfing Information: The Challenge of Intelligent Decision-Making [Paper]
Chenyang Wu and Zongzhang Zhang*
Intelligent Computing, 2023, 2: Article 0041.
Policy Regularization with Dataset Constraint for Offline Reinforcement Learning [Paper] [Code]
Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang*, and Yang Yu
In: Proceedings of the 40th International Conference on Machine Learning (ICML-2023), pages 28701-28717, Honolulu, Hawaii, USA, 2023.
Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data [Paper] [Code]
Fuxiang Zhang, Chengxing Jia, Yi-Chen Li, Lei Yuan, Yang Yu, and Zongzhang Zhang*
In: Proceedings of the 11th International Conference on Learning Representations (ICLR-2023), Kigali, Rwanda, 2023.
Internal Logical Induction for Pixel-Symbolic Reinforcement Learning [Paper] [Code]
Jiacheng Xu, Chao Chen, Fuxiang Zhang, Lei Yuan, Zongzhang Zhang*, and Yang Yu
In: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD-2023), pages 2825–2837, Long Beach, CA, USA, 2023.
Policy-Independent Behavioral Metric-Based Representation for Deep Reinforcement Learning [Paper] [Appendix]
Weijian Liao, Zongzhang Zhang*, and Yang Yu
In: Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI-2023), pages 8746-8754, Washington, DC, USA, 2023.
Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning [Paper] [Appendix]
Chenyang Wu, Tianci Li, Zongzhang Zhang*, and Yang Yu
In: Advances in Neural Information Processing Systems 35 (NeurIPS-2022), pages 14210-14223, New Orleans, USA, 2022.
Efficient Multi-Agent Communication via Shapley Message Value [Paper] [Code] [Demo]
Di Xue, Lei Yuan, Zongzhang Zhang*, and Yang Yu
In: Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI-2022), pages 578-584, Vienna, Austria, 2022.
Multi-Agent Incentive Communication via Decentralized Teammate Modeling [Paper] [Code] [Demo]
Lei Yuan, Jianhao Wang, Fuxiang Zhang, Chenghe Wang, Zongzhang Zhang*, Yang Yu, and Chongjie Zhang*
In: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI-2022), pages 9466-9474, Virtual Conference, 2022.
Adaptive Online Packing-guided Search for POMDPs [Paper] [Appendix] [Code]
Chenyang Wu, Guoyu Yang, Zongzhang Zhang*, Yang Yu, Dong Li, Wulong Liu, and Jianye Hao
In: Advances in Neural Information Processing Systems 34 (NeurIPS-2021), pages 28419-28430, Virtual Conference, 2021.
Cross-Modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning [Paper] [Appendix] [Code]
Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang*, and Yang Yu
In: Advances in Neural Information Processing Systems 34 (NeurIPS-2021), pages 12520-12532, Virtual Conference, 2021.

[Full List of Publications] [DBLP] [Google Scholar] [Code Repositories]

Selected Patents

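Zongzhang Zhang, Yang Yu, Zhi-Hua Zhou, Chenghe Wang, Lei Yuan, Fuxiang Zhang, and Rong-Jun Qin. Multi-Agent Cooperation Method and Apparatus Based on Task Representation and Teammate Awareness. Invention Patent, 2025, Patent No.: ZL 202210624473.3
Zongzhang Zhang, Yang Yu, Zhi-Hua Zhou, and Renzhe Zhou. Offline Training Method for Control Policies Based on Model Uncertainty and Behavior Priors. Invention Patent, 2025, Patent No.: ZL 202310064893.5
Zongzhang Zhang, Yang Yu, and Xianghan Kong. Robot Navigation Control Method and System Based on Partially Observable Reinforcement Learning. Chinese Invention Patent, 2025, Patent No.: ZL202210366719.1
Zongzhang Zhang, Haoran Chen, Yishen Wang, and Yongliang Shen. Recognition System for Security Check and Control Method Thereof. U.S. Invention Patent, 2023, Patent No.: US11574152 B2
Zongzhang Zhang, Zhiyuan Pan, and Hui Wang. Large Area Surveillance Method and Surveillance Robot Based on Weighted Double Deep Q-learning. U.S. Invention Patent, 2022, Patent No.: US11224970 B2
Zongzhang Zhang, Yang Yu, Zhi-Hua Zhou, Chenyang Wu, and Guoyu Yang. Partially Observable Driving Planning Method Based on Adaptive Particles and Belief Packing. Chinese Invention Patent, 2022, Patent No.: ZL202110410291.1
Zongzhang Zhang, Yang Yu, Zhi-Hua Zhou, Yafei Hu, and Feng Xu. Vehicle-Adaptive Autonomous Driving Decision-Making Method and System Based on Meta-Reinforcement Learning. Chinese Invention Patent, 2022, Patent No.: ZL202110356309.4
Zongzhang Zhang, Weijian Liao, Yang Yu, Ming Li, and Zhi-Hua Zhou. Autonomous Merging Method at Partially Observed Intersections Based on Particle-Attention Deep Q-Learning. Chinese Invention Patent, 2022, Patent No.: ZL202110337809.3
Zongzhang Zhang, Yang Yu, and Chong Jiang. Robotic Arm Action Learning Method and System Based on Third-Person Imitation Learning. Chinese Invention Patent, 2022, Patent No.: ZL202010040178.4
Zongzhang Zhang, Yang Yu, Zhi-Hua Zhou, Yishen Wang, and Junpeng Jiang. Autonomous Driving Decision-Making Method and System Based on Partially Observable Transfer Reinforcement Learning. Chinese Invention Patent, 2021, Patent No.: ZL201911373375.1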

[Full List of Patents]

Ongoing Projects

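National Natural Science Foundation of China (NSFC) General Program, "Research on Cooperative Multi-Agent Deep Reinforcement Learning Based on Knowledge Transfer", No. 62276126, 2023.1-2026.12
Alibaba (Beijing) Software Services Co., Ltd., "Research on Key Technologies of Generative Models and Foundation Models for Auto-Bidding", 2025.3-2026.3
Beijing Qingyang Zhiwei Technology Co., Ltd., "Compute-Efficient Reinforcement Learning Alignment Methods", 2024.9-2025.9
Tencent Technology (Shenzhen) Co., Ltd., "Research on Key Reinforcement Learning Technologies for Improving the Quality of Risk-Control Policy Generation", 2025.7-2026.6
Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project, "Machine Learning Theory and Methods for Streaming Data", No. 2022ZD0114800, 2022.12-2025.11
National Natural Science Foundation of China (NSFC) Original Exploration Program, "Research on Key Technologies of Learnware", No. 62250069, 2023.1-2025.12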

[Full List of Projects]

Professional Services

Editorial Board Member: Intelligent Computing (AAAS/Science Partner Journal, 2022-2025)
Young Associate Editor: Frontiers of Computer Science (2020-2025)
Area Chair: NeurIPS 2024-2025; IJCAI 2025; ICLR 2026
Senior Program Committee Member: AAMAS 2024; IJCAI 2020-2021; AAAI 2019; ICAPS 2021; ECAI 2020, 2024, 2025
Program Committee Member/Reviewer: AAAI; ICML; IJCAI; NeurIPS; ICLR [Full List]
Journal Reviewer: Transactions on Pattern Analysis and Machine Intelligence; Artificial Intelligence; Journal of Artificial Intelligence Research [Full List]
Workshop Co-chair: Asian Workshop on Reinforcement Learning (AWRL) 2016-2018, PRICAI 2018's Workshop on Methods and Applications of Reinforcement Learning
Local Organizing Committee Chair: DAI 2020; MLA 2020, 2022
Professional Organization Membership: CCF Distinguished Member; AAAI Member; IEEE Member
Reviewer Award: ICLR 2021's Outstanding Reviewer; NeurIPS 2019, 2022's Top Reviewer
Consultant/Visiting Scholar: Polixir Technologies; Alibaba Group (2021-2022); Netease (2017-2020)

Teaching

Multi-Agent Systems (for undergraduate students, Spring 2021-2025) [textbook]
Big Data, Large Model, and Decision Intelligence (for undergraduate students, Spring 2024-2025)
Introduction to Artificial Intelligence (for undergraduate students, Fall 2021-2025, with Prof. Yang Yu) [textbook][course home]
Control Theory and Methods (for undergraduate and graduate students, Fall 2020-2024) [textbook]
Reinforcement Learning (for undergraduate and graduate students, Fall 2020-2023, with Prof. Yang Yu) [textbook][course home]
Intelligent Systems: Design and Application (for undergraduate and graduate students, Spring 2020-2021) [textbook]
Intelligent Application Modeling (for undergraduate students, July 2019) [a summer course co-constructed with Tencent]

Students

Ph.D. Students:

2022 - : Weijian Liao 廖沩健 (co-supervised with Prof. Ming Li)
2023 - : Chenyang Wu 吴晨阳, Di Xue 薛迪
2024 - : Aoran Wang 王傲然 (co-supervised with Prof. Yang Yu)
2025 - : Tao Jiang 江涛 (co-supervised with Prof. Yang Yu)

Master Students:

[More Information on Current Students and Alumni]

To prospective students:

I am part of LAMDA's reinforcement learning team (LAMDA RL Lab), together with Prof. Yang Yu.

I am looking for self-driven, diligent, adaptable, and resourceful students to work on exciting research in machine learning, including reinforcement learning, multi-agent systems, probabilistic planning, and imitation learning. If you are passionate about research, you are welcome to contact me.

Mail:
National Key Laboratory for Novel Software Technology, Nanjing University, Xianlin Campus Mailbox 603, 163 Xianlin Avenue, Qixia District, Nanjing 210023, China
(In Chinese:) 南京市栖霞区仙林大道163号，南京大学仙林校区603信箱，计算机软件新技术全国重点实验室，210023。

Created on September 11, 2019