All Publications


Conferences

XDO: A Double Oracle Algorithm for Extensive-Form Games

Stephen McAleer, JB Lanier, Kevin A. Wang, Pierre Baldi, and Roy Fox

35th Conference on Neural Information Processing Systems (NeurIPS), 2021


Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Stephen McAleer*, JB Lanier*, Roy Fox, and Pierre Baldi

34th Conference on Neural Information Processing Systems (NeurIPS), 2020


AutoPandas: Neural-Backed Generators for Program Synthesis

Rohan Bavishi, Caroline Lemieux, Roy Fox, Koushik Sen, and Ion Stoica

10th ACM SIGPLAN Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH OOPSLA), 2019


Multi-Task Hierarchical Imitation Learning for Home Automation

Roy Fox*, Ron Berenstein*, Ion Stoica, and Ken Goldberg

15th IEEE Conference on Automation Science and Engineering (CASE), 2019


Workshops

Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates

Litian Liang, Yaosheng Xu, Stephen McAleer, Dailin Hu, Alexander Ihler, Pieter Abbeel, and Roy Fox

Deep Reinforcement Learning workshop (DRL @ NeurIPS), 2021


Count-Based Temperature Scheduling for Maximum Entropy Reinforcement Learning

Dailin Hu, Pieter Abbeel, and Roy Fox

Deep Reinforcement Learning workshop (DRL @ NeurIPS), 2021


Target Entropy Annealing for Discrete Soft Actor–Critic

Yaosheng Xu, Dailin Hu, Litian Liang, Stephen McAleer, Pieter Abbeel, and Roy Fox

Deep Reinforcement Learning workshop (DRL @ NeurIPS), 2021


Obtaining Approximately Admissible Heuristic Functions through Deep Reinforcement Learning and A* Search

Forest Agostinelli, Stephen McAleer, Alexander Shmakov, Roy Fox, Marco Valtorta, Biplav Srivastava, and Pierre Baldi

Bridging the Gap between AI Planning and Reinforcement Learning workshop (PRL @ ICAPS), 2021


Modular Framework for Visuomotor Language Grounding

Kolby Nottingham, Litian Liang, Daeyun Shin, Charless C. Fowlkes, Roy Fox, and Sameer Singh

Embodied AI workshop (EmbodiedAI @ CVPR), 2021


CFR-DO: A Double Oracle Algorithm for Extensive-Form Games

Stephen McAleer, JB Lanier, Pierre Baldi, and Roy Fox

Reinforcement Learning in Games workshop (RLG @ AAAI), 2021


Toward Provably Unbiased Temporal-Difference Value Estimation

Roy Fox

Optimization Foundations for Reinforcement Learning workshop (OPTRL @ NeurIPS), 2019


Preprints

Improving Social Welfare while Preserving Autonomy via a Pareto Mediator

Stephen McAleer, JB Lanier, Michael Dennis, Pierre Baldi, and Roy Fox

arXiv:2106.03927, 2021


A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks

Forest Agostinelli, Alexander Shmakov, Stephen McAleer, Roy Fox, and Pierre Baldi

arXiv:2102.04518, 2021


Hierarchical Variational Imitation Learning of Control Programs

Roy Fox, Richard Shin, William Paul, Yitian Zou, Dawn Song, Ken Goldberg, Pieter Abbeel, and Ion Stoica

arXiv:1912.12612, 2019