Publications tagged "Reinforcement learning"


Conferences

Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks

Litian Liang, Yaosheng Xu, Stephen McAleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, and Roy Fox

39th International Conference on Machine Learning (ICML), 2022


Learning to Query Internet Text for Informing Reinforcement Learning Agents

Kolby Nottingham, Alekhya Pyla, Sameer Singh, and Roy Fox

Reinforcement Learning and Decision Making (RLDM), 2022


Independent Natural Policy Gradient Always Converges in Markov Potential Games

Roy Fox, Stephen McAleer, William Overman, and Ioannis Panageas

25th International Conference on Artificial Intelligence and Statistics (AISTATS), 2022


XDO: A Double Oracle Algorithm for Extensive-Form Games

Stephen McAleer, JB Lanier, Kevin Wang, Pierre Baldi, and Roy Fox

35th Conference on Neural Information Processing Systems (NeurIPS), 2021


Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Stephen McAleer*, JB Lanier*, Roy Fox, and Pierre Baldi

34th Conference on Neural Information Processing Systems (NeurIPS), 2020


Workshops

Anytime PSRO for Two-Player Zero-Sum Games

Stephen McAleer, Kevin Wang, JB Lanier, Marc Lanctot, Pierre Baldi, Tuomas Sandholm, and Roy Fox

Reinforcement Learning in Games workshop (RLG @ AAAI), 2022


Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates

Litian Liang, Yaosheng Xu, Stephen McAleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, and Roy Fox

Deep Reinforcement Learning workshop (DRL @ NeurIPS), 2021


Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning

Dailin Hu, Pieter Abbeel, and Roy Fox

Deep Reinforcement Learning workshop (DRL @ NeurIPS), 2021


Target Entropy Annealing for Discrete Soft Actor–Critic

Yaosheng Xu, Dailin Hu, Litian Liang, Stephen McAleer, Pieter Abbeel, and Roy Fox

Deep Reinforcement Learning workshop (DRL @ NeurIPS), 2021


Obtaining Approximately Admissible Heuristic Functions through Deep Reinforcement Learning and A* Search

Forest Agostinelli, Stephen McAleer, Alexander Shmakov, Roy Fox, Marco Valtorta, Biplav Srivastava, and Pierre Baldi

Bridging the Gap between AI Planning and Reinforcement Learning workshop (PRL @ ICAPS), 2021


CFR-DO: A Double Oracle Algorithm for Extensive-Form Games

Stephen McAleer, JB Lanier, Pierre Baldi, and Roy Fox

Reinforcement Learning in Games workshop (RLG @ AAAI), 2021


Toward Provably Unbiased Temporal-Difference Value Estimation

Roy Fox

Optimization Foundations for Reinforcement Learning workshop (OPTRL @ NeurIPS), 2019


Preprints

Improving Social Welfare while Preserving Autonomy via a Pareto Mediator

Stephen McAleer, JB Lanier, Michael Dennis, Pierre Baldi, and Roy Fox

arXiv:2106.03927, 2021


A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks

Forest Agostinelli, Alexander Shmakov, Stephen McAleer, Roy Fox, and Pierre Baldi

arXiv:2102.04518, 2021