Papers

2024

Understanding, rehearsing, and introspecting: Learn a policy from textual tutorial books in football games

Xiong-Hui Chen, Ziyan Wang, Yali Du, Shengyi Jiang, Meng Fang, Yang Yu, Jun Wang.

In: Advances in Neural Information Processing Systems 37 (NeurIPS'24), Vancouver, Canada, 2024. (Oral)

Knowledgeable agents by offline reinforcement learning from large language model rollouts

Jing-Cheng Pang, Si-Hang Yang, Kaiyuan Li, Jiaji Zhang, Xiong-Hui Chen, Nan Tang, Yang Yu.

In: Advances in Neural Information Processing Systems 37 (NeurIPS'24), Vancouver, Canada, 2024.

Efficient recurrent off-policy RL requires a context-encoder-specific learning rate

Fan-Ming Luo, Zuolin Tu, Zefang Huang, Yang Yu.

In: Advances in Neural Information Processing Systems 37 (NeurIPS'24), Vancouver, Canada, 2024.

Provably and practically efficient adversarial imitation learning with general function approximation

Tian Xu, Zhilong Zhang, Ruishuo Chen, Yihao Sun, Yang Yu.

In: Advances in Neural Information Processing Systems 37 (NeurIPS'24), Vancouver, Canada, 2024.

Dynamics Adaptive Safe Reinforcement Learning with a Misspecified Simulator

Ruiqi Xue, Ziqian Zhang, Lihe Li, Feng Chen, Yi-Chen Li, Yang Yu, Lei Yuan.

In: Proceedings of the 35th European Conference on Machine Learning (ECML'24), Vilnius, Lithuania, 2024.

Beimingwu: A Learnware dock system

Zhi-Hao Tan, Jian-Dong Liu, Xiao-Dong Bi, Peng Tan, Qin-Cheng Zheng, Hai-Tian Liu, Yi Xie, Xiaochuan Zou, Yang Yu, Zhi-Hua Zhou.

In: Proceedings of the 30th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'24) Applied Data Science Track, Barcelona, Spain, 2024.

Deep Demonstration Tracing: Learning Generalizable Imitator for Runtime One-Shot Imitation

Xiong-Hui Chen, Junyin Ye, Hang Zhao, Yi-Chen Li, Xu-Hui Liu, Haoran Shi, Yu-Yan Xu, Zhihao Ye, Si-Hang Yang, Yang Yu, Kai Xu, Zongzhang Zhang, Anqi Huang.

In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.

Policy-conditioned Environment Models are More Generalizable

Ruifeng Chen, Xiong-Hui Chen, Yihao Sun, Siyuan Xiao, Minhui Li, Yang Yu.

In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.

Limited Preference Aided Imitation Learning from Imperfect Demonstrations

Xingchen Cao, Fan-Ming Luo, Junyin Ye, Tian Xu, Zhilong Zhang, Yang Yu.

In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.

ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models

Ziniu Li, Tian Xu, Yushun Zhang, Zhihang Lin, Yang Yu, Ruoyu Sun, Zhi-Quan Luo.

In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.

Offline Transition Modeling via Contrastive Energy Learning

Ruifeng Chen, Chengxing Jia, Zefang Huang, Tian-Shuo Liu, Xu-Hui Liu, Yang Yu.

In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.

Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning

Xu-Hui Liu, Tian-Shuo Liu, Shengyi Jiang, Ruifeng Chen, Zhilong Zhang, Xinwei Chen, Yang Yu.

In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.

Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics

Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 41st International Conference on Machine Learning (ICML'24), Vienna, Austria, 2024.

Understanding or manipulation: Rethinking online performance gains of modern recommender systems

Zhengbang Zhu, Rongjun Qin, Junjie Huang, Xinyi Dai, Yang Yu, Yong Yu, Weinan Zhang.

ACM Transactions on Information Systems, 42(4):1-32, 2024, CoRR abs/2210.05662.

Learning in games: A systematic review

Rong-Jun Qin, Yang Yu.

SCIENCE CHINA Information Sciences, 2024.

When is RL better than DPO in RLHF? A representation and optimization perspective / Policy optimization in RLHF: The impact of out-of-preference data

Ziniu Li, Tian Xu, Yang Yu.

In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24) Tiny Papers, Vienna, Austria, 2024.

Distributional reinforcement learning with sample-set Bellman update

Weijian Zhang, Jianshu Wang, Yang Yu.

In: Proceedings of 2024 IEEE International Conference on Robotics and Automation (ICRA'24), Yokohama, Japan, 2024.

Reward-consistent dynamics models are strongly generalizable for offline reinforcement learning

Fan-Ming Luo, Tian Xu, Xingchen Cao, Yang Yu.

In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24), Vienna, Austria, 2024, (Spotlight) https://arxiv.org/abs/2310.05422.

Policy rehearsing: Training generalizable policies for reinforcement learning

Chengxing Jia, Chenxiao Gao, Hao Yin, Fuxiang Zhang, Xiong-Hui Chen, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu, Zhi-Hua Zhou.

In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24), Vienna, Austria, 2024.

Language model self-improvement by reinforcement learning contemplation

Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24), Vienna, Austria, 2024, http://arxiv.org/abs/2305.14483.

Flow to better: Offline preference-based reinforcement learning via preferred trajectory generation

Zhilong Zhang, Yihao Sun, Junyin Ye, Tian-Shuo Liu, Jiaji Zhang, Yang Yu.

In: Proceedings of the 12th International Conference on Learning Representations (ICLR'24), Vienna, Austria, 2024.

Cost-aware offline safe meta reinforcement learning with robust in-distribution online task adaptation

Cong Guan, Ruiqi Xue, Ziqian Zhang, Lihe Li, Yichen Li, Lei Yuan, Yang Yu.

In: Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2024), 2024.

Foresight distribution adjustment for off-policy reinforcement learning

Ruifeng Chen, Xu-Hui Liu, Tian-Shuo Liu, Shengyi Jiang, Feng Xu, Yang Yu.

In: Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2024), 2024.

Disentangling policy from offline task representation learning via adversarial data augmentation

Chengxing Jia, Fuxiang Zhang, Yi-Chen Li, Chenxiao Gao, Xu-Hui Liu, Lei Yuan, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2024), 2024.

Deep anomaly detection via active anomaly search

Chao Chen, Dawei Wang, Feng Mao, Jiacheng Xu, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS 2024), 2024.

Episodic return decomposition by difference of implicitly assigned sub-trajectory reward

Haoxin Lin, Hongqiu Wu, Jiaji Zhang, Yihao Sun, Junyin Ye, Yang Yu.

In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI'24), 2024.

Focus-Then-Decide: Segmentation-assisted reinforcement learning

Chao Chen, Jiacheng Xu, Weijian Liao, Hao Ding, Zongzhang Zhang, Yang Yu, Rui Zhao.

In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI'24), 2024.

ACT: Empowering decision transformer with dynamic programming via advantage conditioning

Chenxiao Gao, Chenyang Wu, Mingjun Cao, Rui Kong, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI'24), 2024.

Generalizable task representation learning for offline meta-reinforcement learning with data limitations

Renzhe Zhou, Chenxiao Gao, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI'24), 2024.

Model gradient: unified model and policy learning in model-based reinforcement learning

Chengxing Jia, Fuxiang Zhang, Tian Xu, Jing-Cheng Pang, Zongzhang Zhang, Yang Yu.

Frontiers of Computer Science, 18:184339, 2024.

MixLight: Mixed-Agent Cooperative Reinforcement Learning for Traffic Light Control

Ming Yang, Yiming Wang, Yang Yu, Mingliang Zhou, Leong Hou U.

IEEE Transactions on Industrial Informatics, 20(2): 2653-2661, 2024.

2023

Offline model-based adaptable policy learning for decision-making in out-of-support regions

Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Tony Qin, Shang Wenjie, Jieping Ye.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(12): 15260-15274, 2023.

Learning physically realizable skills for online packing of general 3D shapes

Hang Zhao, Zherong Pan, Yang Yu, Kai Xu.

ACM Transactions on Graphics, 42(5): 165:1-165:21, 2023, https://arxiv.org/abs/2212.02094.

Fully decentralized multiagent communication via causal inference

Han Wang, Yang Yu, Yuan Jiang.

IEEE Transactions on Neural Networks and Learning Systems, 34(12): 10193-10202, 2023.

Memory-efficient transformer-based network model for traveling salesman problem

Hua Yang, Minghao Zhao, Lei Yuan, Yang Yu, Zhenhua Li, Ming Gu.

Neural Networks, 161:589-597, 2023.

Learning to coordinate with anyone

Lei Yuan, Lihe Li, Ziqian Zhang, Feng Chen, Tianyi Zhang, Cong Guan, Yang Yu, Zhi-Hua Zhou.

In: Proceedings of the International Conference on Distributed Artificial Intelligence (DAI'23), 2023, (Best paper award) CoRR abs/2309.12633.

Adversarial counterfactual environment model learning

Xiong-Hui Chen, Yang Yu, Zheng-Mao Zhu, Zhihua Yu, Zhenjun Chen, Chenghe Wang, Yinan Wu, Hongqiu Wu, Rong-Jun Qin, Ruijin Ding, Fangsheng Huang.

In: Advances in Neural Information Processing Systems 36 (NeurIPS'23), New Orleans, LA, 2023, (Spotlight), CoRR abs/2206.04890.

Learning world models with identifiable factorization

Yu-Ren Liu, Biwei Huang, Zheng-Mao Zhu, Honglong Tian, Mingming Gong, Yang Yu, Kun Zhang.

In: Advances in Neural Information Processing Systems 36 (NeurIPS'23), New Orleans, LA, 2023, https://arxiv.org/abs/2306.06561.

Natural language-conditioned reinforcement learning with inside-out task language development and translation

Jing-Cheng Pang, Xin-Yu Yang, Si-Hang Yang, Yang Yu.

In: Advances in Neural Information Processing Systems 36 (NeurIPS'23), New Orleans, LA, 2023, CoRR abs/2302.09368.

Imitation learning from imperfection: Theoretical justifications and algorithms

Ziniu Li, Tian Xu, Zeyu Qin, Yang Yu, Zhi-Quan Luo.

In: Advances in Neural Information Processing Systems 36 (NeurIPS'23), New Orleans, LA, 2023, (Spotlight) CoRR abs/2301.11687.

Model-based reinforcement learning with multi-step plan value estimation

Haoxin Lin, Yihao Sun, Jiaji Zhang, Yang Yu.

In: Proceedings of the 26th European Conference on Artificial Intelligence (ECAI'23), Kraków, Poland, 2023, CoRR abs/2209.05530.

Degradation-resistant offline optimization via accumulative risk control

Huakang Lu, Hong Qian, Yupeng Wu, Ziqi Liu, Ya-Lin Zhang, Aimin Zhou, Yang Yu.

In: Proceedings of the 26th European Conference on Artificial Intelligence (ECAI'23), Kraków, Poland, 2023.

Object-oriented option framework for robotics manipulation in clutter

Jing-Cheng Pang, Si-Hang Yang, Xiong-Hui Chen, Xinyu Yang, Yang Yu, Mas Ma, Ziqi Guo, Howard Yang, Bill Huang.

In: Proceedings of 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'23), 2023.

Internal logical induction for pixel-symbolic reinforcement learning

Jiacheng Xu, Chao Chen, Fuxiang Zhang, Lei Yuan, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'23), Long Beach, CA, 2023.

Provably efficient adversarial imitation learning with unknown transitions

Tian Xu, Ziniu Li, Yang Yu, Zhi-Quan Luo.

In: Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence (UAI'23), Pittsburgh, PA, 2023.

Fast teammate adaptation in the presence of sudden policy change

Ziqian Zhang, Lei Yuan, Lihe Li, Ke Xue, Chengxing Jia, Cong Guan, Chao Qian, Yang Yu.

In: Proceedings of the 39th Conference on Uncertainty in Artificial Intelligence (UAI'23), Pittsburgh, PA, 2023.

Model-Bellman inconsistency for model-based offline reinforcement learning

Yihao Sun, Jiaji Zhang, Chengxing Jia, Haoxin Lin, Junyin Ye, Yang Yu.

In: Proceedings of the 40th International Conference on Machine Learning (ICML'23), Honolulu, HI, 2023.

Policy regularization with dataset constraint for offline reinforcement learning

Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 40th International Conference on Machine Learning (ICML'23), Honolulu, HI, 2023.

AliExpress learning-to-rank: Maximizing online model performance without going online

Guangda Huzhang, Zhen-Jia Pang, Yongqing Gao, Wen-Ji Zhou, Qing Da, Anxiang Zeng, Yang Yu, and Zhi-Hua Zhou.

IEEE Transactions on Knowledge and Data Engineering, 35(2): 1214-1226, 2023. CoRR abs/2003.11941.

Sim2Rec: A simulator-based decision-making approach to optimize real-world long-term user engagement in sequential recommender systems

Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei (Tony) Qin, Wenjie Shang, Jieping Ye, Chen Ma.

In: Proceedings of the 39th IEEE International Conference on Data Engineering (ICDE'23), 2023.

Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data

Fuxiang Zhang, Chengxing Jia, Yi-Chen Li, Lei Yuan, Yang Yu, Zongzhang Zhang.

In: Proceedings of the 11th International Conference on Learning Representations (ICLR'23), Kigali, Rwanda, 2023.

How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement

Xuhui Liu, Feng Xu, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Ruifeng Chen, Zongzhang Zhang and Yang Yu.

In: Proceedings of the 22nd International Conference on Autonomous Agents and MultiAgent Systems (AAMAS'23), 2023, https://dl.acm.org/doi/abs/10.5555/3545946.3598773.

Self-Motivated Multi-Agent Exploration

Shaowei Zhang, Jiahan Cao, Lei Yuan, Yang Yu and De-Chuan Zhan.

In: Proceedings of the 22nd International Conference on Autonomous Agents and MultiAgent Systems (AAMAS'23), 2023.

Robust multi-agent coordination via evolutionary generation of auxiliary adversarial attackers

Lei Yuan, Zi-Qian Zhang, Ke Xue, Hao Yin, Feng Chen, Cong Guan, Li-He Li, Chao Qian, Yang Yu.

In: Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI'23), 2023.

Policy-independent behavioral metric-based representation for deep reinforcement learning

Wei-Jian Liao, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI'23), 2023.

2022

Multi-agent policy transfer via task relationship modeling

Rongjun Qin, Feng Chen, Tonghan Wang, Lei Yuan, Xiaoran Wu, Yipeng Kang, Zongzhang Zhang, Chongjie Zhang, Yang Yu.

In: NeurIPS'22 Workshop on Deep RL, 2022.

On efficient reinforcement learning for full-length game of StarCraft II

Ruo-Ze Liu, Zhen-Jia Pang, Zhou-Yu Meng, Wenhai Wang, Yang Yu, Tong Lu.

Journal of Artificial Intelligence Research, 75:213-260, 2022.

Cascaded algorithm selection with extreme-region UCB bandit

Yi-Qi Hu, Xu-Hui Liu, Shu-Qiao Li, Yang Yu.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):6782-6794, 2022.

Error bounds of imitating policies and environments for reinforcement learning

Tian Xu, Ziniu Li, Yang Yu.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):6968-6980, 2022.

NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning

Rong-Jun Qin, Songyi Gao, Xingyuan Zhang, Zhen Xu, Shengkai Huang, Zewen Li, Weinan Zhang, Yang Yu.

In: Advances in Neural Information Processing Systems 35 (NeurIPS'22, Datasets and Benchmarks), New Orleans, LA, 2022. CoRR abs/2102.00714.

Efficient Multi-agent Communication via Self-supervised Information Aggregation

Cong Guan, Feng Chen, Lei Yuan, Chenghe Wang, Hao Yin, Zongzhang Zhang, Yang Yu.

In: Advances in Neural Information Processing Systems 35 (NeurIPS'22), New Orleans, LA, 2022.

Bayesian Optimistic Optimization: Optimistic Exploration for Model-based Reinforcement Learning

Chenyang Wu, Tianci Li, Zongzhang Zhang, Yang Yu.

In: Advances in Neural Information Processing Systems 35 (NeurIPS'22), New Orleans, LA, 2022.

Multi-agent Dynamic Algorithm Configuration

Ke Xue, Jiacheng Xu, Lei Yuan, Miqing Li, Chao Qian, Zongzhang Zhang, Yang Yu.

In: Advances in Neural Information Processing Systems 35 (NeurIPS'22), New Orleans, LA, 2022.

Efficient reinforcement learning for StarCraft by abstract forward models and transfer learning

Ruo-Ze Liu, Haifeng Guo, Xiaozhong Ji, Yang Yu, Zhen-Jia Pang, Zitai Xiao, Yuzhou Wu, Tong Lu.

IEEE Transactions on Games, 14(2): 294-307, 2022.

The teaching dimension of regularized kernel learners

Hong Qian, Xu-Hui Liu, Chen-Xi Su, Aimin Zhou, Yang Yu.

In: Proceedings of the 39th International Conference on Machine Learning (ICML'22), 2022.

Efficient multi-agent communication via shapley message value

Di Xue, Lei Yuan, Zongzhang Zhang, Yang Yu.

In: Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI'22), Virtual Conference, 2022.

Multi-agent concentrative coordination with decentralized task representation

Lei Yuan, Chenghe Wang, Jianhao Wang, Fuxiang Zhang, Feng Chen, Cong Guan, Zongzhang Zhang, Chongjie Zhang, Yang Yu.

In: Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI'22), Virtual Conference, 2022.

Rethinking ValueDice: Does It Really Improve Performance?

Ziniu Li, Tian Xu, Yang Yu, Zhi-Quan Luo.

In: Blog Track at 10th International Conference on Learning Representations (ICLR'22 Blog Track), 2022, CoRR abs/2202.02468.

Context-aware sparse deep coordination graphs

Tonghan Wang, Liang Zeng, Weijun Dong, Qianlan Yang, Yang Yu, Chongjie Zhang.

In: Proceedings of the 10th International Conference on Learning Representations (ICLR'22), Virtual Conference, 2022.

Active hierarchical exploration with stable subgoal representation learning

Siyuan Li, Jin Zhang, Jianhao Wang, Yang Yu, Chongjie Zhang.

In: Proceedings of the 10th International Conference on Learning Representations (ICLR'22), Virtual Conference, 2022.

Learning efficient online 3D bin packing on packing configuration trees

Hang Zhao, Yang Yu, Kai Xu.

In: Proceedings of the 10th International Conference on Learning Representations (ICLR'22), Virtual Conference, 2022.

Improve generated adversarial imitation learning with reward variance regularization

Yi-Feng Zhang, Fan-Ming Luo, Yang Yu.

Machine Learning, 2022.

Adapt to environment sudden changes by learning context sensitive policy

Fan-Ming Luo, Shengyi Jiang, Yang Yu, Zongzhang Zhang, Yi-Feng Zhang.

In: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI'22), Virtual Conference, 2022.

Invariant action effect model for reinforcement learning

Zheng-Mao Zhu, Shengyi Jiang, Yu-Ren Liu, Yang Yu, Kun Zhang.

In: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI'22), Virtual Conference, 2022.

Multi-agent incentive communication via decentralized teammate modeling

Lei Yuan, Jianhao Wang, Fuxiang Zhang, Chenghe Wang, Zongzhang Zhang, Yang Yu, Chongjie Zhang.

In: Proceedings of the 36th AAAI Conference on Artificial Intelligence (AAAI'22), Virtual Conference, 2022.

ZOOpt: Toolbox for derivative-free optimization

Yu-Ren Liu, Yi-Qi Hu, Hong Qian, Yang Yu, and Chao Qian.

SCIENCE CHINA Information Sciences, 65: 207101, 2022. CoRR abs/1801.00329.

2021

More efficient adversarial imitation learning algorithms with known and unknown transitions

Tian Xu, Ziniu Li, Yang Yu.

In: Ecological Theory of RL Workshop in NeurIPS 2021.

Offline model-based adaptable policy learning

Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Tony Qin, Shang Wenjie, Jieping Ye.

In: Advances in Neural Information Processing Systems 34 (NeurIPS'21), Virtual Conference, 2021.

Regret minimization experience replay in off-policy reinforcement learning

Xu-Hui Liu, Zhenghai Xue, Jing-Cheng Pang, Shengyi Jiang, Feng Xu, Yang Yu.

In: Advances in Neural Information Processing Systems 34 (NeurIPS'21), Virtual Conference, 2021.

Cross-modal domain adaptation for cost-efficient visual reinforcement learning

Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, Yang Yu.

In: Advances in Neural Information Processing Systems 34 (NeurIPS'21), Virtual Conference, 2021.

Adaptive online packing-guided search for POMDPs

Chenyang Wu, Guoyu Yang, Zongzhang Zhang, Yang Yu, Dong Li, Wulong Liu, Jianye Hao.

In: Advances in Neural Information Processing Systems 34 (NeurIPS'21), Virtual Conference, 2021.

Fast Pareto optimization for subset selection with dynamic cost constraints

Chao Bian, Chao Qian, Frank Neumann, and Yang Yu.

In: Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI'21), Virtual Conference, 2021.

QPLEX: Duplex dueling multi-agent Q-Learning

Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, and Chongjie Zhang.

In: Proceedings of the 9th International Conference on Learning Representations (ICLR'21), Virtual Conference, 2021.

Sequential and dynamic constraint contrastive learning for reinforcement learning

Weijie Shen, Lei Yuan, Junfu Huang, Songyi Gao, Yuyang Huang, Yang Yu.

In: Proceedings of the IEEE 2021 International Joint Conference on Neural Networks (IJCNN'21), Shenzhen, China, 2021.

Partially observable environment estimation with uplift inference for reinforcement learning based recommendation

Wenjie Shang, Qingyang Li, Zhiwei Qin, Yang Yu, Yiping Meng, Jieping Ye.

Machine Learning, 110(9): 2603-2640, 2021.

Improving Search Engine Efficiency through Contextual Factor Selection

Anxiang Zeng, Han Yu, Qing Da, Yusen Zhan, Yang Yu, Jingren Zhou, Chunyan Miao.

AI Magazine, 42(2): 50-58, 2021.

Derivative-free reinforcement learning: A review

Hong Qian, Yang Yu.

Frontiers of Computer Science, 15(6): 156336, 2021. CoRR abs/2102.05710.

Machine learning steered symbolic execution framework for complex software code

Lei Bu, Yongjuan Liang, Zhunyi Xie, Hong Qian, Yi-Qi Hu, Yang Yu, Xin Chen, Xuandong Li.

Formal Aspects of Computing, 33(3): 301-323, 2021.

Analysis of Noisy Evolutionary Optimization When Sampling Fails

Chao Qian, Chao Bian, Yang Yu, Ke Tang, and Xin Yao.

Algorithmica, 83(4): 940-975, 2021.

On the robustness of median sampling in noisy evolutionary optimization

Chao Bian, Chao Qian, Yang Yu, Ke Tang.

SCIENCE CHINA Information Sciences, 64(5), 2021.

2020

Error bounds of imitating policies and environments

Tian Xu, Ziniu Li, Yang Yu.

In: Advances in Neural Information Processing Systems 33 (NeurIPS'20), Virtual Conference, 2020.

Offline imitation learning with a misspecified simulator

Shengyi Jiang, Jing-Cheng Pang, Yang Yu.

In: Advances in Neural Information Processing Systems 33 (NeurIPS'20), Virtual Conference, 2020.

Running time analysis of the (1+1)-EA for robust linear optimization

Chao Bian, Chao Qian, Ke Tang, Yang Yu.

Theoretical Computer Science, 843: 57-72, 2020.

A technical view on neural architecture search

Yi-Qi Hu and Yang Yu.

International Journal of Machine Learning and Cybernetics, 11(4): 795-811, 2020.

Reinforcement learning with action-specific focuses in video games

Meng Wang, Yingfeng Chen, Tangjie Lv, Yan Song, Kai Guan, Changjie Fan, Yang Yu.

In: Proceedings of IEEE 2020 Conference on Games (CoG'20), Osaka, Japan, 2020, pp.9-16.

Derivative-free optimization with adaptive experience for efficient hyper-parameter tuning

Yi-Qi Hu, Zelin Liu, Hua Yang, Yang Yu, and Yunfeng Liu.

In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI'20), Santiago de Compostela, Spain, 2020.

An efficient evolutionary algorithm for subset selection with general cost constraints

Chao Bian, Chao Feng, Chao Qian, and Yang Yu.

In: Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI'20), New York, NY, 2020.

Enhancing neural mathematical reasoning by abductive combination with symbolic library

Yangyang Hu, Yang Yu.

In: ICML 2020 Workshop on Bridge Between Perception and Reasoning: Graph Neural Networks & Beyond, Vienna, Austria, 2020.

LAMDA RL LAB
School of Artificial Intelligence
National Key Laboratory for Novel Software Technology
Nanjing University, Nanjing 210023, China

Contact us

yuanl AT lamda DOT nju DOT edu DOT cn

Yi Fu Building, Xianlin Campus