I am currently a principal researcher (资深算法研究员) at SenseTime Smart City Group (SCG). Before that, I was a principal researcher (主任研究员) at Huawei Noah's Ark Lab. I focus on bringing intelligent decision-making techniques to the real world. I am specifically interested in Reinforcement Learning (RL), Multi-Agent Reinforcement Learning (MARL), and Large Language Models (LLMs). I serve as a (senior) program committee member for top AI conferences such as NeurIPS, ICML, AAAI, and IJCAI. I also serve as an executive committee member of CCF-AI-MAS (the Multi-Agent Systems Group of the China Computer Federation's Technical Committee on Artificial Intelligence).
Education Experience
Ph.D. @ PKU
B.S. @ NUAA
Selected Work Experience
Principal Researcher (资深算法研究员)
Senior, then promoted to Principal Researcher (高级,主任研究员)
Research Intern
School-Enterprise Cooperation
Selected Award
Selected Publication
- Google Scholar: https://scholar.google.com/citations?user=EtVHsgcAAAAJ
* means equal contribution.
- Mingzhe Xing, Hangyu Mao, Shenglin Yin, Lichen Pan, Zhengchao Zhang, Zhen Xiao, and Jieyi Long. A Dual-Agent Scheduler for Distributed Deep Learning Jobs on Public Cloud via Reinforcement Learning. KDD 2023 (CCF-A).
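- Jianye Hao, Kun Shao, Kai Li, Dong Li, Hangyu Mao, Shuyue Hu, and Zhen Wang. Research and Applications of Game Intelligence (in Chinese). SCIENTIA SINICA Informationis 2023.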
- Jianye Hao, Xiaotian Hao, Hangyu Mao, Weixun Wang, Yaodong Yang, Dong Li, Yan Zheng, and Zhen Wang. Boosting Multiagent Reinforcement Learning via Permutation Invariant and Permutation Equivariant Networks. ICLR 2023.
- Ming Yan, Junjie Chen*, Hangyu Mao*, Jiajun Jiang, Jianye Hao, Xingjian Li, Zhao Tian, Zhichao Chen, Dong Li, Zhangkong Xian, Yanwei Guo, Wulong Liu, Bin Wang, Yuefeng Sun, and Yongshun Cui. Achieving Last-Mile Functional Coverage in Testing Chip Design Software Implementations. ICSE 2023 (CCF-A).
- Xianjie Zhang, Yu Liu, Hangyu Mao, and Chao Yu. Common Belief Multi-Agent Reinforcement Learning Based on Variational Recurrent Models. Neurocomputing 2022 (CCF-C).
- Wenhan Huang, Kai Li, Kun Shao, Tianze Zhou, Matthew E. Taylor, Jun Luo, Dongge Wang, Hangyu Mao, Jianye Hao, Jun Wang, and Xiaotie Deng. Multiagent Q-learning with Sub-Team Coordination. NeurIPS 2022 (CCF-A).
- Lichen Pan, Jun Qian, Wei Xia, Hangyu Mao, Jun Yao, PengZe Li, and Zhen Xiao. Optimizing Communication in Deep Reinforcement Learning with XingTian. Middleware 2022 (CCF-B).
- Jinpeng Li, Guangyong Chen, Hangyu Mao, Danruo Deng, Dong Li, Jianye Hao, Qi Dou, and Pheng-Ann Heng. Flat-aware Cross-stage Distilled Framework for Imbalanced Medical Image Classification. MICCAI 2022 (CCF-B). Provisional Accept Recommendation (Top 13%).
- Mingzhe Xing, Hangyu Mao, and Zhen Xiao. Fast and Fine-grained Autoscaler for Streaming Jobs with Reinforcement Learning. IJCAI 2022 (CCF-A).
- Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, and Li Wang. What About Inputing Policy in Value Function: Policy Representation and Policy-Extended Value Function Approximator. AAAI 2022 (CCF-A).
- Wenhan Huang, Kai Li, Kun Shao, Tianze Zhou, Jun Luo, Dongge Wang, Hangyu Mao, Jianye Hao, Jun Wang, and Xiaotie Deng. Multiagent Q-learning with Sub-Team Coordination. Extended Abstract at AAMAS 2022 (CCF-B).
- Hangyu Mao*, Chao Wang*, Xiaotian Hao*, Yihuan Mao*, Yiming Lu*, Chengjie Wu*, Jianye Hao, Dong Li, and Pingzhong Tang. SEIHAI: A Sample-Efficient Hierarchical AI for the MineRL Competition. DAI 2021. Champion Solution for NeurIPS20 MineRL Competition (Top 1 among 90+ teams).
- Tianpei Yang*, Weixun Wang*, Hongyao Tang*, Jianye Hao, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Chengwei Zhang, Yujing Hu, Yingfeng Chen, and Changjie Fan. An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning. NeurIPS 2021 (CCF-A).
- Xianjie Zhang, Yu Liu, Xiujuan Xu, Qiong Huang, Hangyu Mao, and Anil Carie. Structural Relational Inference Actor-Critic for Multi-Agent Reinforcement Learning. Neurocomputing 2021 (CCF-C).
- Changmin Yu, Dong Li, Hangyu Mao, Jianye Hao, and Neil Burgess. Learning State Representations via Temporal Cycle-Consistency Constraint in Model-Based Reinforcement Learning. SSL-RL Workshop at ICLR 2021.
- Guss William Hebgen, ..., Hangyu Mao, ..., et al. Towards Robust and Domain Agnostic Reinforcement Learning Competitions: MineRL 2020. NeurIPS 2020 Competition and Demonstration Track (CCF-A).
- Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong, and Yan Ni. Learning Multi-Agent Communication with Double Attentional Deep Reinforcement Learning. JAAMAS 2020 (CCF-B).
- Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, and Zhen Xiao. Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning. AAAI 2020 (CCF-A). Long Oral Presentation (Top 5%).
- Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong, and Yan Ni. Learning Agent Communication under Limited Bandwidth by Message Pruning. AAAI 2020 (CCF-A).
- Hangyu Mao, Zhengchao Zhang, Zhen Xiao, and Zhibo Gong. Modelling the Dynamic Joint Policy of Teammates with Attention Multi-Agent DDPG. AAMAS 2019 (CCF-B).
- Hangyu Mao, Yang Xiao, Yuan Wang, Jiakang Wang, and Zhen Xiao. Topic-Specific Retweet Count Ranking for Weibo. PAKDD 2018 (CCF-C).
- Yuan Wang, Hangyu Mao, and Zhen Xiao. Identifying Influential Users’ Professions via the Microblogs They Forward. SocInf Workshop at IJCAI 2017 (CCF-A).
- Yang Xiao, Yuan Wang, Hangyu Mao, and Zhen Xiao. Predicting Restaurant Consumption Level through Social Media Footprints. COLING 2016 (CCF-B).
- Jingqing Ruan*, Yihong Chen*, Bin Zhang*, Zhiwei Xu*, Tianpeng Bao*, Guoqing Du*, Shiwei Shi*, Hangyu Mao*, Xingyu Zeng, and Rui Zhao. TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents. arXiv 2023.
- Jingqing Ruan, Xiaotian Hao, Dong Li, and Hangyu Mao. Learning to Collaborate by Grouping: a Consensus-oriented Strategy for Multi-agent Reinforcement Learning. arXiv 2023.
- Bin Zhang, Hangyu Mao, Lijuan Li, Zhiwei Xu, Dapeng Li, Rui Zhao, and Guoliang Fan. Stackelberg Decision Transformer for Asynchronous Action Coordination in Multi-Agent Systems. arXiv 2023.
- Hangyu Mao, Rui Zhao, Hao Chen, Jianye Hao, Yiqun Chen, Dong Li, Junge Zhang, and Zhen Xiao. Transformer in Transformer as Backbone for Deep Reinforcement Learning. arXiv 2022.
- Yiqun Chen, Hangyu Mao, Tianle Zhang, Shiguang Wu, Bin Zhang, Jianye Hao, Dong Li, Bin Wang, and Hongxing Chang. PTDE: Personalized Training with Distillated Execution for Multi-Agent Reinforcement Learning. arXiv 2022.
- Tianze Zhou, Fubiao Zhang, Kun Shao, Kai Li, Wenhan Huang, Jun Luo, Weixun Wang, Yaodong Yang, Hangyu Mao, Bin Wang, Dong Li, Wulong Liu, and Jianye Hao. Cooperative Multi-Agent Transfer Learning with Level-Adaptive Credit Assignment. arXiv 2021.
- Hangyu Mao, Zhibo Gong, and Zhen Xiao. Reward Design in Cooperative Multi-Agent Reinforcement Learning for Packet Routing. arXiv 2020.
- Hangyu Mao, Zhibo Gong, Zhengchao Zhang, Zhen Xiao, and Yan Ni. Learning Multi-Agent Communication under Limited-Bandwidth Restriction for Internet Packet Routing. arXiv 2019.
- Hangyu Mao, Zhibo Gong, Yan Ni, and Zhen Xiao. Actor-Coordinator-Critic Net for "Learning-to-Communicate" with Deep Multi-agent Reinforcement Learning. arXiv 2017.
- Hangyu Mao, Wulong Liu, and Jianye Hao. Agent Training Method, Apparatus, and Computer-readable Storage Medium. U.S. Application No. 17877063, issued on Nov. 17, 2022. US Patent.
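- Hangyu Mao, Yanwei Guo, and Zhangkong Xian. Method and Apparatus for Determining Inputs for Chip Testing. Patent No. 202210626411.6. Integrated into a real product.
- Junjie Chen, Hangyu Mao, Jianye Hao, Yuefeng Sun, and Jiajun Jiang. Method, Apparatus, and Storage Medium for Generating Chip Test Cases. Patent No. 202111663515.6. Integrated into a real product. Rated a "potential high-value" patent by Huawei in April 2022.
- Zhengchao Zhang, Zhen Xiao, Hangyu Mao, and Lichen Pan. Method and System for Cluster Resource Management and Task Scheduling Based on Deep Reinforcement Learning. Patent No. 202010581407.3.
- Lichen Pan, Hangyu Mao, Zhen Xiao, and Zhengchao Zhang. Method and System for Cluster Resource Scheduling Based on Multi-Agent Deep Reinforcement Learning. Patent No. 202010322543.0.
- Hangyu Mao, Wulong Liu, and Jianye Hao. Method and Apparatus for Training an Agent. Patent No. 202010077714.8. Integrated into a real product. An international patent application has been filed.
- Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Yan Ni, and Zhibo Gong. Traffic Scheduling Method and Apparatus. Patent No. 201811505121.6.
- Benchao Li, Hangyu Mao, Yang Xiao, and Zhen Xiao. Network Traffic Monitoring Method and Network Device. Patent No. 201710681276.4.
- Yang Xiao, Kai Chen, Benchao Li, Hangyu Mao, and Zhen Xiao. Method and Apparatus for Multipath Traffic Transmission. Patent No. 201610915269.1.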