I am currently a principal researcher (资深算法研究员) at SenseTime. Before that, I was a principal researcher (主任研究员) at Huawei Noah's Ark Lab. I focus on bringing the intellegent decision-making techniques to the real-world. I am specifically interested in Reinforcement Learning (RL) and Multi-Agent Reinforcement Learning (MARL). I serve as the (senior) program-committee-member of top AI conferences like NeurIPS/ICML and AAAI/IJCAI.
Education Experience
![]()
Ph.D. @ PKU
|
|
![]()
B.S. @ NUAA
|
|
Selected Work Experience
![]()
Principal Researcher (资深算法研究员)
|
|
![]() ![]()
Senior, then promoted to Principal Researcher (高级,主任研究员)
|
|
![]() ![]()
Research Intern
|
|
![]()
School-Enterprise Cooperation
|
|
Selected Award
Selected Publication
- Google Scholar: https://scholar.google.com/citations?user=EtVHsgcAAAAJ
- Jianye Hao, Xiaotian Hao, Hangyu Mao, Weixun Wang, Yaodong Yang, Dong Li, Yan Zheng, and Zhen Wang. Breaking the Curse of Dimensionality in Multiagent State Space: A Unified Agent Permutation Framework. Submit to ICLR 2023 (深度学习领域国际顶会, Rating: 6/6/6/8).
- Ming Yan, Junjie Chen*, Hangyu Mao*, Jiajun Jiang, Jianye Hao, Xingjian Li, Zhao Tian, Zhichao Chen, Dong Li, Zhangkong Xian, Yanwei Guo, Wulong Liu, Bin Wang, Yuefeng Sun, and Yongshun Cui. Achieving Last-Mile Functional Coverage in Testing Chip Design Software Implementations. ICSE 2023 (CCF-A).
- Xianjie Zhang, Yu Liu, Hangyu Mao, and Chao Yu. Common Belief Multi-Agent Reinforcement Learning Based on Variational Recurrent Models. Neurocomputing 2022 (CCF-C).
- Wenhan Huang, Kai Li, Kun Shao, Tianze Zhou, Matthew E. Taylor, Jun Luo, Dongge Wang, Hangyu Mao, Jianye Hao, Jun Wang, and Xiaotie Deng. Multiagent Q-learning with Sub-Team Coordination. NeurIPS 2022 (CCF-A).
- Lichen Pan, Jun Qian, Wei Xia, Hangyu Mao, Jun Yao, PengZe Li, and Zhen Xiao. Optimizing Communication in Deep Reinforcement Learning with XingTian. Middleware 2022 (CCF-B).
- Jinpeng Li, Guangyong Chen, Hangyu Mao, Danruo Deng, Dong Li, Jianye Hao, Qi Dou, and Pheng-Ann Heng. Flat-aware Cross-stage Distilled Framework for Imbalanced Medical Image Classification. MICCAI 2022 (CCF-B). Provisional Accept Recommendation (Top 13%).
- Mingzhe Xing, Hangyu Mao, and Zhen Xiao. Fast and Fine-grained Autoscaler for Streaming Jobs with Reinforcement Learning. IJCAI 2022 (CCF-A).
- Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, and Li Wang. What About Inputing Policy in Value Function: Policy Representation and Policy-Extended Value Function Approximator. AAAI 2022 (CCF-A).
- Wenhan Huang, Kai Li, Kun Shao, Tianze Zhou, Jun Luo, Dongge Wang, Hangyu Mao, Jianye Hao, Jun Wang, and Xiaotie Deng. Multiagent Q-learning with Sub-Team Coordination. Extended Abstract at AAMAS 2022 (CCF-B).
- Hangyu Mao*, Chao Wang*, Xiaotian Hao*, Yihuan Mao*, Yiming Lu*, Chengjie Wu*, Jianye Hao, Dong Li, and Pingzhong Tang. SEIHAI: A Sample-Efficient Hierarchical AI for the MineRL Competition. DAI 2021 (多智能体领域中国顶会). Champion Solution for NeurIPS20 MineRL Competition (Top 1 among 90+ teams).
- Tianpei Yang*, Weixun Wang*, Hongyao Tang*, Jianye Hao, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Chengwei Zhang, Yujing Hu, Yingfeng Chen, and Changjie Fan. An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning. NeurIPS 2021 (CCF-A).
- Xianjie Zhang, Yu Liu, Xiujuan Xu, Qiong Huang, Hangyu Mao, and Anil Carie. Structural Relational Inference Actor-Critic for Multi-Agent Reinforcement Learning. Neurocomputing 2021 (CCF-C).
- Changmin Yu, Dong Li, Hangyu Mao, Jianye Hao, and Neil Burgess. Learning State Representations via Temporal Cycle-Consistency Constraint in Model-Based Reinforcement Learning. SSL-RL Workshop at ICLR 2021 (深度学习领域国际顶会).
- Guss William Hebgen, ..., Hangyu Mao, ..., et al. Towards Robust and Domain Agnostic Reinforcement Learning Competitions: MineRL 2020. NeurIPS 2020 Competition and Demonstration Track (CCF-A).
- Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong, and Yan Ni. Learning Multi-Agent Communication with Double Attentional Deep Reinforcement Learning. JAAMAS 2020 (CCF-B).
- Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, and Zhen Xiao. Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning. AAAI 2020 (CCF-A). Long Oral Presentation (Top 5%).
- Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong, and Yan Ni. Learning Agent Communication under Limited Bandwidth by Message Pruning. AAAI 2020 (CCF-A).
- Hangyu Mao, Zhengchao Zhang, Zhen Xiao, and Zhibo Gong. Modelling the Dynamic Joint Policy of Teammates with Attention Multi-Agent DDPG. AAMAS 2019 (CCF-B).
- Hangyu Mao, Yang Xiao, Yuan Wang, Jiakang Wang, and Zhen Xiao. Topic-Specific Retweet Count Ranking for Weibo. PAKDD 2018 (CCF-C).
- Yuan Wang, Hangyu Mao, and Zhen Xiao. Identifying Influential Users’ Professions via the Microblogs They Forward. SocInf Workshop at IJCAI 2017 (CCF-A).
- Yang Xiao, Yuan Wang, Hangyu Mao, and Zhen Xiao. Predicting Restaurant Consumption Level through Social Media Footprints. COLING 2016 (CCF-B).
- Yiqun Chen, Hangyu Mao, Tianle Zhang, Shiguang Wu, Bin Zhang, Jianye Hao, Dong Li, Bin Wang, and Hongxing Chang. PTDE: Personalized Training with Distillated Execution for Multi-Agent Reinforcement Learning. Arxiv 2022.
- Xiaotian Hao, Hangyu Mao, Weixun Wang, Yaodong Yang, Dong Li, Yan Zheng, Zhen Wang, and Jianye Hao. Breaking the Curse of Dimensionality in Multiagent State Space: A Unified Agent Permutation Framework. Arxiv 2022.
- Hangyu Mao, Zhibo Gong, and Zhen Xiao. Reward Design in Cooperative Multi-Agent Reinforcement Learning for Packet Routing. Arxiv 2020.
- Hangyu Mao, Zhibo Gong, Zhengchao Zhang, Zhen Xiao, and Yan Ni. Learning Multi-Agent Communication under Limited-Bandwidth Restriction for Internet Packet Routing. Arxiv 2019.
- Hangyu Mao, Zhibo Gong, Yan Ni, and Zhen Xiao. Actor-Coordinator-Critic Net for "Learning-to-Communicate" with Deep Multi-agent Reinforcement Learning. Arxiv 2017.
- Hangyu Mao, Wulong Liu, and Jianye Hao. Agent Training Method, Apparatus, and Computer-readable Storage Medium. U.S. Application No. 17877063, issued on Nov. 17, 2022. US Patent.
- 毛航宇、郭艳伟、冼章孔。确定芯片测试的输入的方法和装置。专利号:202210626411.6。已集成到实际产品。
- 陈俊洁、毛航宇、郝建业、孙月凤、姜佳君。一种芯片的测试用例生成方法、装置及存储介质。专利号:202111663515.6。已集成到实际产品。2022年4月被评为华为“潜在高价值”专利。
- 张正超、肖臻、毛航宇、潘丽晨。一种基于深度强化学习的集群资源管理和任务调度方法及系统。专利号:202010581407.3。
- 潘丽晨、毛航宇、肖臻、张正超。基于多智能体深度强化学习的集群资源调度方法及系统。专利号:202010322543.0。
- 毛航宇、刘武龙、郝建业。训练智能体的方法和装置。专利号:202010077714.8。已集成到实际产品。已申请国际专利。
- 毛航宇、张正超、肖臻、倪炎、龚志波。流量调度方法及装置。专利号:201811505121.6。
- 李本超、毛航宇、肖阳、肖臻。一种网络流量监测方法及网络设备。专利号:201710681276.4。
- 肖阳、陈凯、李本超、毛航宇、肖臻。一种多路径流量发送的方法及装置。专利号:201610915269.1。