Papers
2024
Understanding, rehearsing, and introspecting: Learn a policy from textual tutorial books in football games
Xiong-Hui Chen, Ziyan Wang, Yali Du, Shengyi Jiang, Meng Fang, Yang Yu, Jun Wang.
In: Advances in Neural Information Processing Systems 37 (NeurIPS'24), Vancouver, Canada, 2024. (Oral)
Knowledgeable agents by offline reinforcement learning from large language model rollouts
Jing-Cheng Pang, Si-Hang Yang, Kaiyuan Li, Jiaji Zhang, Xiong-Hui Chen, Nan Tang, Yang Yu.
In: Advances in Neural Information Processing Systems 37 (NeurIPS'24), Vancouver, Canada, 2024.
Efficient recurrent off-policy RL requires a context-encoder-specific learning rate
Fan-Ming Luo, Zuolin Tu, Zefang Huang, Yang Yu.
In: Advances in Neural Information Processing Systems 37 (NeurIPS'24), Vancouver, Canada, 2024.
Provably and practically efficient adversarial imitation learning with general function approximation
Tian Xu, Zhilong Zhang, Ruishuo Chen, Yihao Sun, Yang Yu.
In: Advances in Neural Information Processing Systems 37 (NeurIPS'24), Vancouver, Canada, 2024.
Dynamics Adaptive Safe Reinforcement Learning with a Misspecified Simulator
Ruiqi Xue, Ziqian Zhang, Lihe Li, Feng Chen, Yi-Chen Li, Yang Yu, Lei Yuan.
In: Proceedings of the 35th European Conference on Machine Learning (ECML'24), Vilnius, Lithuania, 2024.
Beimingwu: A Learnware dock system
Zhi-Hao Tan, Jian-Dong Liu, Xiao-Dong Bi, Peng Tan, Qin-Cheng Zheng, Hai-Tian Liu, Yi Xie, Xiaochuan Zou, Yang Yu, Zhi-Hua Zhou.
In: Proceedings of the 30th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'24) Applied Data Science Track, Barcelona, Spain, 2024.
Deep Demonstration Tracing: Learning Generalizable Imitator for Runtime One-Shot Imitation
Xiong-Hui Chen, Junyin Ye, Hang Zhao, Yi-Chen Li, Xu-Hui Liu, Haoran Shi, Yu-Yan Xu, Zhihao Ye, Si-Hang Yang, Yang Yu, Kai Xu, Zongzhang Zhang, Anqi Huang.
In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.
Policy-conditioned Environment Models are More Generalizable
Ruifeng Chen, Xiong-Hui Chen, Yihao Sun, Siyuan Xiao, Minhui Li, Yang Yu.
In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.
Limited Preference Aided Imitation Learning from Imperfect Demonstrations
Xingchen Cao, Fan-Ming Luo, Junyin Ye, Tian Xu, Zhilong Zhang, Yang Yu.
In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo.
In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.
Offline Transition Modeling via Contrastive Energy Learning
Ruifeng Chen, Chengxing Jia, Zefang Huang, Tian-Shuo Liu, Xu-Hui Liu, Yang Yu.
In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.
Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning
Xu-Hui Liu, Tian-Shuo Liu, Shengyi Jiang, Ruifeng Chen, Zhilong Zhang, Xinwei Chen, Yang Yu.
In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.
Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics
Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.
Understanding or manipulation: Rethinking online performance gains of modern recommender systems
Zhengbang Zhu, Rongjun Qin, Junjie Huang, Xinyi Dai, Yang Yu, Yong Yu, Weinan Zhang.
ACM Transactions on Information Systems, 42(4):1-32, 2024, CoRR abs/2210.05662.
Learning in games: A systematic review
Rong-Jun Qin, Yang Yu.
SCIENCE CHINA Information Sciences, 2024.
Ziniu Li, Tian Xu, Yang Yu.
In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24) Tiny Papers, Vienna, Austria, 2024.
Distributional reinforcement learning with sample-set Bellman update
Weijian Zhang, Jianshu Wang, Yang Yu.
In: Proceedings of 2024 IEEE International Conference on Robotics and Automation (ICRA'24), Yokohama, Japan, 2024.
Reward-consistent dynamics models are strongly generalizable for offline reinforcement learning
Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu.
In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24), Vienna, Austria, 2024, (Spotlight) https://arxiv.org/abs/2310.05422.
Policy rehearsing: Training generalizable policies for reinforcement learning
Chengxing Jia, Chenxiao Gao, Hao Yin, Fuxiang Zhang, Xiong-Hui Chen, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu, Zhi-Hua Zhou.
In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24), Vienna, Austria, 2024.
Language model self-improvement by reinforcement learning contemplation
Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24), Vienna, Austria, 2024, http://arxiv.org/abs/2305.14483.
Flow to better: Offline preference-based reinforcement learning via preferred trajectory generation
Zhilong Zhang, Yihao Sun, Junyin Ye, Tian-Shuo Liu, Jiaji Zhang, Yang Yu.
In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24), Vienna, Austria, 2024.
Cost-aware offline safe meta reinforcement learning with robust in-distribution online task adaptation
Cong Guan, Ruiqi Xue, Ziqian Zhang, Lihe Li, Yichen Li, Lei Yuan, Yang Yu.
In: Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2024), 2024.
Foresight distribution adjustment for off-policy reinforcement learning
Ruifeng Chen, Xu-Hui Liu, Tian-Shuo Liu, Shengyi Jiang, Feng Xu, Yang Yu.
In: Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2024), 2024.
Disentangling policy from offline task representation learning via adversarial data augmentation
Chengxing Jia, Fuxiang Zhang, Yi-Chen Li, Chenxiao Gao, Xu-Hui Liu, Lei Yuan, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2024), 2024.
Deep anomaly detection via active anomaly search
Chao Chen, Dawei Wang, Feng Mao, Jiacheng Xu, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2024), 2024.
Episodic return decomposition by difference of implicitly assigned sub-trajectory reward
Haoxin Lin, Hongqiu Wu, Jiaji Zhang, Yihao Sun, Junyin Ye, Yang Yu.
In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI'24), 2024.
Focus-Then-Decide: Segmentation-assisted reinforcement learning
Chao Chen, Jiacheng Xu, Weijian Liao, Hao Ding, Zongzhang Zhang, Yang Yu, Rui Zhao.
In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI'24), 2024.
ACT: Empowering decision transformer with dynamic programming via advantage conditioning
Chenxiao Gao, Chenyang Wu, Mingjun Cao, Rui Kong, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI'24), 2024.
Generalizable task representation learning for offline meta-reinforcement learning with data limitations
Renzhe Zhou, Chenxiao Gao, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI'24), 2024.
Model gradient: unified model and policy learning in model-based reinforcement learning
Chengxing Jia, Fuxiang Zhang, Tian Xu, Jing-Cheng Pang, Zongzhang Zhang, Yang Yu.
Frontiers of Computer Science, 18:184339, 2024.
MixLight: Mixed-Agent Cooperative Reinforcement Learning for Traffic Light Control
Ming Yang, Yiming Wang, Yang Yu, Mingliang Zhou, Leong Hou U.
IEEE Transactions on Industrial Informatics, 20(2): 2653-2661, 2024.
2023
Offline model-based adaptable policy learning for decision-making in out-of-support regions
Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Tony Qin, Shang Wenjie, Jieping Ye.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(12): 15260-15274, 2023.
Learning physically realizable skills for online packing of general 3D shapes
Hang Zhao, Zherong Pan, Yang Yu, Kai Xu.
ACM Transactions on Graphics, 42(5): 165:1-165:21, 2023, https://arxiv.org/abs/2212.02094.
Fully decentralized multiagent communication via causal inference
Han Wang, Yang Yu, Yuan Jiang.
IEEE Transactions on Neural Networks and Learning Systems, 34(12): 10193-10202, 2023.
Memory-efficient transformer-based network model for traveling salesman problem
Hua Yang, Minghao Zhao, Lei Yuan, Yang Yu, Zhenhua Li, Ming Gu.
Neural Networks, 161:589-597, 2023.
Learning to coordinate with anyone
Lei Yuan, Lihe Li, Ziqian Zhang, Feng Chen, Tianyi Zhang, Cong Guan, Yang Yu, Zhi-Hua Zhou.
In: Proceedings of the International Conference on Distributed Artificial Intelligence (DAI'23), 2023, (Best paper award) CoRR abs/2309.12633.
Adversarial counterfactual environment model learning
Xiong-Hui Chen, Yang Yu, Zheng-Mao Zhu, Zhihua Yu, Zhenjun Chen, Chenghe Wang, Yinan Wu, Hongqiu Wu, Rong-Jun Qin, Ruijin Ding, Fangsheng Huang.
In: Advances in Neural Information Processing Systems 36 (NeurIPS'23), New Orleans, LA, 2023, (Spotlight), CoRR abs/2206.04890.
Learning World models with identifiable factorization
Yu-Ren Liu, Biwei Huang, Zheng-Mao Zhu, Honglong Tian, Mingming Gong, Yang Yu, Kun Zhang.
In: Advances in Neural Information Processing Systems 36 (NeurIPS'23), New Orleans, LA, 2023, https://arxiv.org/abs/2306.06561.
Jing-Cheng Pang, Xin-Yu Yang, Si-Hang Yang, Yang Yu.
In: Advances in Neural Information Processing Systems 36 (NeurIPS'23), New Orleans, LA, 2023, CoRR abs/2302.09368.
Imitation learning from imperfection: Theoretical justifications and algorithms
Ziniu Li, Tian Xu, Zeyu Qin, Yang Yu, Zhi-Quan Luo.
In: Advances in Neural Information Processing Systems 36 (NeurIPS'23), New Orleans, LA, 2023, (Spotlight) CoRR abs/2301.11687.
Model-based reinforcement learning with multi-step plan value estimation
Haoxin Lin, Yihao Sun, Jiaji Zhang, Yang Yu.
In: Proceedings of the 26th European Conference on Artificial Intelligence (ECAI'23), Kraków, Poland, 2023, CoRR abs/2209.05530.
Degradation-resistant offline optimization via accumulative risk control
Huakang Lu, Hong Qian, Yupeng Wu, Ziqi Liu, Ya-Lin Zhang, Aimin Zhou, Yang Yu.
In: Proceedings of the 26th European Conference on Artificial Intelligence (ECAI'23), Kraków, Poland, 2023.
Object-oriented option framework for robotics manipulation in clutter
Jing-Cheng Pang, Si-Hang Yang, Xiong-Hui Chen, Xinyu Yang, Yang Yu, Mas Ma, Ziqi Guo, Howard Yang, Bill Huang.
In: Proceedings of 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'23), 2023.
Internal logical induction for pixel-symbolic reinforcement learning
Jiacheng Xu, Chao Chen, Fuxiang Zhang, Lei Yuan, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'23), Long Beach, CA, 2023.
Provably efficient adversarial imitation learning with unknown transitions
Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo.
In: Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence (UAI'23), Pittsburgh, PA, 2023.
Fast teammate adaptation in the presence of sudden policy change
Ziqian Zhang, Lei Yuan, Lihe Li, Ke Xue, Chengxing Jia, Cong Guan, Chao Qian, Yang Yu.
In: Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence (UAI'23), Pittsburgh, PA, 2023.
Model-Bellman inconsistency for model-based offline reinforcement learning
Yihao Sun, Jiaji Zhang, Chengxing Jia, Haoxin Lin, Junyin Ye, Yang Yu.
In: Proceedings of the 40th International Conference on Machine Learning (ICML'23), Honolulu, HI, 2023.
Policy regularization with dataset constraint for offline reinforcement learning
Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 40th International Conference on Machine Learning (ICML'23), Honolulu, HI, 2023.
AliExpress learning-to-rank: Maximizing online model performance without going online
Guangda Huzhang, Zhen-Jia Pang, Yongqing Gao, Wen-Ji Zhou, Qing Da, Anxiang Zeng, Yang Yu, and Zhi-Hua Zhou.
IEEE Transactions on Knowledge and Data Engineering, 35(2): 1214-1226, 2023. CoRR abs/2003.11941.
Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei (Tony) Qin, Wenjie Shang, Jieping Ye, Chen Ma.
In: Proceedings of the 39th IEEE International Conference on Data Engineering (ICDE'23), 2023.
Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data
Fuxiang Zhang, Chengxing Jia, Yi-Chen Li, Lei Yuan, Yang Yu, Zongzhang Zhang.
In: Proceedings of the 11th International Conference on Learning Representations (ICLR'23), Kigali, Rwanda, 2023.
How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement
Xuhui Liu, Feng Xu, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Ruifeng Chen, Zongzhang Zhang and Yang Yu.
In: Proceedings of the 22nd International Conference on Autonomous Agents and MultiAgent Systems (AAMAS'23), 2023, https://dl.acm.org/doi/abs/10.5555/3545946.3598773.
Self-Motivated Multi-Agent Exploration
Shaowei Zhang, Jiahan Cao, Lei Yuan, Yang Yu and De-Chuan Zhan.
In: Proceedings of the 22nd International Conference on Autonomous Agents and MultiAgent Systems (AAMAS'23), 2023.
Robust multi-agent coordination via evolutionary generation of auxiliary adversarial attackers
Lei Yuan, Zi-Qian Zhang, Ke Xue, Hao Yin, Feng Chen, Cong Guan, Li-He Li, Chao Qian, Yang Yu.
In: Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI'23), 2023.
Policy-independent behavioral metric-based representation for deep reinforcement learning
Wei-Jian Liao, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI'23), 2023.
2022
Multi-agent policy transfer via task relationship modeling
Rongjun Qin, Feng Chen, Tonghan Wang, Lei Yuan, Xiaoran Wu, Yipeng Kang, Zongzhang Zhang, Chongjie Zhang, Yang Yu.
In: NeurIPS'22 Workshop on Deep RL, 2022.
On efficient reinforcement learning for full-length game of StarCraft II
Ruo-Ze Liu, Zhen-Jia Pang, Zhou-Yu Meng, Wenhai Wang, Yang Yu, Tong Lu.
Journal of Artificial Intelligence Research, 75:213-260, 2022.
Cascaded algorithm selection with extreme-region UCB bandit
Yi-Qi Hu, Xu-Hui Liu, Shu-Qiao Li, Yang Yu.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):6782-6794, 2022.
Error bounds of imitating policies and environments for reinforcement learning
Tian Xu, Ziniu Li, Yang Yu.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):6968-6980, 2022.
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
Rong-Jun Qin, Songyi Gao, Xingyuan Zhang, Zhen Xu, Shengkai Huang, Zewen Li, Weinan Zhang, Yang Yu.
In: Advances in Neural Information Processing Systems 35 (NeurIPS'22, Datasets and Benchmarks), New Orleans, LA, 2022. CoRR abs/2102.00714.
Efficient Multi-agent Communication via Self-supervised Information Aggregation
Cong Guan, Feng Chen, Lei Yuan, Chenghe Wang, Hao Yin, Zongzhang Zhang, Yang Yu.
In: Advances in Neural Information Processing Systems 35 (NeurIPS'22), New Orleans, LA, 2022.
Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning
Chenyang Wu, Tianci Li, Zongzhang Zhang, Yang Yu.
In: Advances in Neural Information Processing Systems 35 (NeurIPS'22), New Orleans, LA, 2022.
Multi-agent Dynamic Algorithm Configuration
Ke Xue, Jiacheng Xu, Lei Yuan, Miqing Li, Chao Qian, Zongzhang Zhang, Yang Yu.
In: Advances in Neural Information Processing Systems 35 (NeurIPS'22), New Orleans, LA, 2022.
Efficient reinforcement learning for StarCraft by abstract forward models and transfer learning
Ruo-Ze Liu, Haifeng Guo, Xiaozhong Ji, Yang Yu, Zhen-Jia Pang, Zitai Xiao, Yuzhou Wu, Tong Lu.
IEEE Transactions on Games, 14(2): 294-307, 2022.
The teaching dimension of regularized kernel learners
Hong Qian, Xu-Hui Liu, Chen-Xi Su, Aimin Zhou, Yang Yu.
In: Proceedings of the 39th International Conference on Machine Learning (ICML'22), 2022.
Efficient multi-agent communication via shapley message value
Di Xue, Lei Yuan, Zongzhang Zhang, Yang Yu.
In: Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI'22), Virtual Conference, 2022.
Multi-agent concentrative coordination with decentralized task representation
Lei Yuan, Chenghe Wang, Jianhao Wang, Fuxiang Zhang, Feng Chen, Cong Guan, Zongzhang Zhang, Chongjie Zhang, Yang Yu.
In: Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI'22), Virtual Conference, 2022.
Rethinking ValueDice: Does It Really Improve Performance?
Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo.
In: Blog Track at 10th International Conference on Learning Representations (ICLR'22 Blog Track), 2022, CoRR abs/2202.02468.
Context-aware sparse deep coordination graphs
Tonghan Wang, Liang Zeng, Weijun Dong, Qianlan Yang, Yang Yu, Chongjie Zhang.
In: Proceedings of the 10th International Conference on Learning Representations (ICLR'22), Virtual Conference, 2022.
Active hierarchical exploration with stable subgoal representation learning
Siyuan Li, Jin Zhang, Jianhao Wang, Yang Yu, Chongjie Zhang.
In: Proceedings of the 10th International Conference on Learning Representations (ICLR'22), Virtual Conference, 2022.
Learning efficient online 3D bin packing on packing configuration trees
Hang Zhao, Yang Yu, Kai Xu.
In: Proceedings of the 10th International Conference on Learning Representations (ICLR'22), Virtual Conference, 2022.
Improve generated adversarial imitation learning with reward variance regularization
Yi-Feng Zhang, Fan-Ming Luo, Yang Yu.
Machine Learning, 2022.
Adapt to environment sudden changes by learning context sensitive policy
Fan-Ming Luo, Shengyi Jiang, Yang Yu, Zongzhang Zhang, Yi-Feng Zhang.
In: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI'22), Virtual Conference, 2022.
Invariant action effect model for reinforcement learning
Zheng-Mao Zhu, Shengyi Jiang, Yu-Ren Liu, Yang Yu, Kun Zhang.
In: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI'22), Virtual Conference, 2022.
Multi-agent incentive communication via decentralized teammate modeling
Lei Yuan, Jianhao Wang, Fuxiang Zhang, Chenghe Wang, Zongzhang Zhang, Yang Yu, Chongjie Zhang.
In: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI'22), Virtual Conference, 2022.
ZOOpt: Toolbox for derivative-free optimization
Yu-Ren Liu, Yi-Qi Hu, Hong Qian, Yang Yu, and Chao Qian.
SCIENCE CHINA Information Sciences, 65: 207101, 2022. CoRR abs/1801.00329.
2021
More efficient adversarial imitation learning algorithms with known and unknown transitions
Tian Xu, Ziniu Li, Yang Yu.
In: Ecological Theory of RL Workshop in NeurIPS 2021.
Offline model-based adaptable policy learning
Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Tony Qin, Shang Wenjie, Jieping Ye.
In: Advances in Neural Information Processing Systems 34 (NeurIPS'21), Virtual Conference, 2021.
Regret minimization experience replay in off-policy reinforcement learning
Xu-Hui Liu, Zhenghai Xue, Jing-Cheng Pang, Shengyi Jiang, Feng Xu, Yang Yu.
In: Advances in Neural Information Processing Systems 34 (NeurIPS'21), Virtual Conference, 2021.
Cross-modal domain adaptation for cost-efficient visual reinforcement learning
Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, Yang Yu.
In: Advances in Neural Information Processing Systems 34 (NeurIPS'21), Virtual Conference, 2021.
Adaptive online packing-guided search for POMDPs
Chenyang Wu, Guoyu Yang, Zongzhang Zhang, Yang Yu, Dong Li, Wulong Liu, Jianye Hao.
In: Advances in Neural Information Processing Systems 34 (NeurIPS'21), Virtual Conference, 2021.
Fast Pareto optimization for subset selection with dynamic cost constraints
Chao Bian, Chao Qian, Frank Neumann, and Yang Yu.
In: Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI'21), Virtual Conference, 2021.
QPLEX: Duplex dueling multi-agent Q-Learning
Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, and Chongjie Zhang.
In: Proceedings of the 9th International Conference on Learning Representations (ICLR'21), Virtual Conference, 2021.
Sequential and dynamic constraint contrastive learning for reinforcement learning
Weijie Shen, Lei Yuan, Junfu Huang, Songyi Gao, Yuyang Huang, Yang Yu.
In: Proceedings of the IEEE 2021 International Joint Conference on Neural Networks (IJCNN'21), Shenzhen, China, 2021.
Wenjie Shang, Qingyang Li, Zhiwei Qin, Yang Yu, Yiping Meng, Jieping Ye.
Machine Learning, 110(9): 2603-2640, 2021.
Improving Search Engine Efficiency through Contextual Factor Selection
Anxiang Zeng, Han Yu, Qing Da, Yusen Zhan, Yang Yu, Jingren Zhou, Chunyan Miao.
AI Magazine, 42(2): 50-58, 2021.
Derivative-free reinforcement learning: A review
Hong Qian, Yang Yu.
Frontiers of Computer Science, 15(6): 156336, 2021. CoRR abs/2102.05710.
Machine learning steered symbolic execution framework for complex software code
Lei Bu, Yongjuan Liang, Zhunyi Xie, Hong Qian, Yi-Qi Hu, Yang Yu, Xin Chen, Xuandong Li.
Formal Aspects of Computing, 33(3): 301-323, 2021.
Analysis of Noisy Evolutionary Optimization When Sampling Fails
Chao Qian, Chao Bian, Yang Yu, Ke Tang, and Xin Yao.
Algorithmica, 83(4): 940-975, 2021.
On the robustness of median sampling in noisy evolutionary optimization
Chao Bian, Chao Qian, Yang Yu, Ke Tang.
SCIENCE CHINA Information Sciences, 64(5), 2021.
2020
Error bounds of imitating policies and environments
Tian Xu, Ziniu Li, Yang Yu.
In: Advances in Neural Information Processing Systems 33 (NeurIPS'20), Virtual Conference, 2020.
Offline imitation learning with a misspecified simulator
Shengyi Jiang, Jing-Cheng Pang, Yang Yu.
In: Advances in Neural Information Processing Systems 33 (NeurIPS'20), Virtual Conference, 2020.
Running time analysis of the (1+1)-EA for robust linear optimization
Chao Bian, Chao Qian, Ke Tang, Yang Yu.
Theoretical Computer Science, 843: 57-72, 2020.
A technical view on neural architecture search
Yi-Qi Hu and Yang Yu.
International Journal of Machine Learning and Cybernetics, 11(4): 795-811, 2020.
Reinforcement learning with action-specific focuses in video games
Meng Wang, Yingfeng Chen, Tangjie Lv, Yan Song, Kai Guan, Changjie Fan, Yang Yu.
In: Proceedings of IEEE 2020 Conference on Games (CoG'20), Osaka, Japan, 2020, pp.9-16.
Derivative-free optimization with adaptive experience for efficient hyper-parameter tuning
Yi-Qi Hu, Zelin Liu, Hua Yang, Yang Yu, and Yunfeng Liu.
In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI'20), Santiago de Compostela, Spain, 2020.
An efficient evolutionary algorithm for subset selection with general cost constraints
Chao Bian, Chao Feng, Chao Qian, and Yang Yu.
In: Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI'20), New York, NY, 2020.
Enhancing neural mathematical reasoning by abductive combination with symbolic library
Yangyang Hu, Yang Yu.
In: ICML 2020 Workshop on Bridge Between Perception and Reasoning: Graph Neural Networks & Beyond, Vienna, Austria, 2020.