2014 Motion Optimization
Broad topics
Stochastic optimal control, graphical model representations, linear solvability.
Partial observability and predictive state representations
Trajectory optimization
Tractable reading list subset
Marc Toussaint and Amos Storkey and Stefan Harmeling: Expectation-Maximization methods for solving (PO)MDPs. In Bayesian Time Series Models, 388-413, Cambridge University Press, 2011.
Konrad Rawlik and Marc Toussaint and Sethu Vijayakumar: Approximate Inference and Stochastic Optimal Control. e-Print arXiv:1009.3958, 201
Todorov E (2006). In Bayesian Brain: Probabilistic Approaches to Neural Coding, Doya K et al (eds), chap 12, pp 269-298, MIT
Linearly-solvable optimal control
Dvijotham K and Todorov E (2012). In Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Lewis (ed), chap. 6, Wiley and IEEE Press, in press
General duality between optimal control and estimation
Todorov E (2008). In proceedings of the 47th IEEE Conference on Decision and Control, pp 4286 - 4292
Value-function approximations for partially observable Markov decision processes.
M. Hauskrecht.
Journal of Artificial Intelligence Research, vol.13, pp. 33-94, 2000
Online Planning Algorithms for POMDPs.
S. Ross, J. Pineau, S. Paquet & B. Chaib-draa
In Journal of Artificial Intelligence Research (JAIR), vol. 32, pp. 663-704, 2008.
Non-Gaussian Belief Space Planning: Correctness and Complexity.
Platt, R., Kaelbling, L., Lozano-Perez, T., Tedrake, R.
IEEE Int'l Conf. on Robotics and Automation, 2012. (The final version of the paper posted here fixes some errors that were present in the proofs in the submitted version.)
Sigma Hulls for Gaussian Belief Space Planning for Imprecise Articulated Robots amid Obstacles.
Alex Lee, Yan (Rocky) Duan, Sachin Patil, John Schulman, Zoe McCarthy, Jur van den Berg, Ken Goldberg, Pieter Abbeel.
In the proceedings of the 26th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013.
Predictive State Representations: A New Theory for Modeling Dynamical Systems
Satinder Singh, Michael R. James, Matthew R. Rudary. UAI 2004
Reduced-Rank Hidden Markov Models.
S. M. Siddiqi, B. Boots & G. J. Gordon. AISTATS 2010
Closing the Learning-Planning Loop with Predictive State Representations.
B. Boots, S. M. Siddiqi & G. J. Gordon. RSS 2010 (also a 2011 IJRR paper)
Full collection of papers
(Stochastic) Optimal control
Todorov E (2006). In Bayesian Brain: Probabilistic Approaches to Neural Coding, Doya K et al (eds), chap 12, pp 269-298, MIT
Russ Tedrake. Underactuated Robotics: Learning, Planning, and Control for Efficient and Agile Machines: Course Notes for MIT 6.832. Working draft edition, 2012.
Konrad Rawlik and Marc Toussaint and Sethu Vijayakumar: An Approximate Inference Approach to Temporal Optimization in Optimal Control. In Proc. Advances in Neural Information Processing Systems (NIPS 2010), 2010.
Konrad Rawlik and Marc Toussaint and Sethu Vijayakumar: Approximate Inference and Stochastic Optimal Control. e-Print arXiv:1009.3958, 201
Marc Toussaint: Robot Trajectory Optimization using Approximate Inference. In Proc. of the Int. Conf. on Machine Learning (ICML 2009), 1049-1056, ACM, 2009.
Stochastic differential dynamic programming
Theodorou E, Tassa Y and Todorov E (2010). In American Control Conference
Li W and Todorov E (2007). International Journal of Control, 80: 1439-1453
Todorov E and Li W (2005). In proceedings of the American Control Conference, pp 300-306
Path integrals, KL-control, and Linearly solvable control
Linearly-solvable optimal control
Dvijotham K and Todorov E (2012). In Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Lewis (ed), chap. 6, Wiley and IEEE Press, in press
Parallels between sensory and motor information processing
Todorov E (2008). In The Cognitive Neurosciences, 4th ed, Gazzaniga (ed), MIT Press
General duality between optimal control and estimation
Todorov E (2008). In proceedings of the 47th IEEE Conference on Decision and Control, pp 4286 - 4292
Inverse optimal control with linearly-solvable MDPs
Dvijotham K and Todorov E (2010). In International Conference on Machine Learning
Eigenfunction approximation methods for linearly-solvable optimal control problems
Todorov E (2009). In proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp 161 - 168
Linearly-solvable Markov decision problems
Todorov E (2006). In Advances in Neural Information Processing Systems 19: 1369-1376, Scholkopf et al (eds), MIT Press
Floating-base trajectory optimization
An integrated system for real-time model-predictive control of humanoid robots
Erez T, Lowrey K, Tassa Y, Kumar V, Kolev S and Todorov E (2013). In IEEE/RAS International Conference on Humanoid Robots
Synthesis and stabilization of complex behaviors through online trajectory optimization
Tassa Y, Erez T and Todorov E (2012). In IEEE/RSJ International Conference on Intelligent Robots and Systems
Li W and Todorov E (2007). International Journal of Control, 80: 1439-1453
Todorov E and Li W (2005). In proceedings of the American Control Conference, pp 300-306
A direct method for trajectory optimization of rigid bodies through contact.
Michael Posa, Cecilia Cantu, and Russ Tedrake.
The International Journal of Robotics Research (IJRR), 33(1):69-81, January 2014.
M. Fallon, S. Kuindersma, S. Karumanchi, M. Antone, T. Schneider, H. Dai, C. Pérez D'Arpino, R. Deits, M. DiCicco, D. Fourie, T. Koolen, P. Marion, M. Posa, A. Valenzuela, K. Yu, J. Shah, K. Iagnemma, R. Tedrake, S. Teller. An Architecture for Online Affordance-based Perception and Whole-body Planning, May 2014. MIT CSAIL Technical Report 2014-003.
POMDPs
Value-function approximations for partially observable Markov decision processes.
M. Hauskrecht.
Journal of Artificial Intelligence Research, vol.13, pp. 33-94, 2000
Online Planning Algorithms for POMDPs.
S. Ross, J. Pineau, S. Paquet & B. Chaib-draa
In Journal of Artificial Intelligence Research (JAIR), vol. 32, pp. 663-704, 2008.
Monte-Carlo Planning in Large POMDPs
David Silver, Joel Veness
Neural Information Processing Systems (NIPS), 2010
Integrated Perception and Planning in the Continuous Space: A POMDP Approach
Haoyu Bai, David Hsu, Wee Sun Lee
A Survey of Monte Carlo Tree Search Methods
Cameron Browne, Edward Powley, Daniel Whitehouse, Simon Lucas, Peter I. Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis and Simon Colton
G. Shani, J. Pineau, R. Kaplow. "A survey of point-based POMDP solvers". Autonomous Agents and Multi-Agent Systems. 2012.
Graphical model representations of POMDPs
Marc Toussaint and Amos Storkey and Stefan Harmeling: Expectation-Maximization methods for solving (PO)MDPs. In Bayesian Time Series Models, 388-413, Cambridge University Press, 2011.
Pascal Poupart and Marc Toussaint and Tobias Lang: Analyzing and Escaping Local Optima in Planning as Inference for Partially Observable Domains. In European Conf. on Machine Learning (ECML 2011), 2011.
Marc Toussaint and Laurent Charlin and Pascal Poupart: Hierarchical POMDP Controller Optimization by Likelihood Maximization. In Uncertainty in Artificial Intelligence (UAI 2008), 562-570, AUAI Press, 2008.
Gaussian belief space dynamics
Sigma Hulls for Gaussian Belief Space Planning for Imprecise Articulated Robots amid Obstacles.
Alex Lee, Yan (Rocky) Duan, Sachin Patil, John Schulman, Zoe McCarthy, Jur van den Berg, Ken Goldberg, Pieter Abbeel.
In the proceedings of the 26th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013.
LQG-MP: Optimized Path Planning for Robots with Motion Uncertainty and Imperfect State Information.
Jur van den Berg, Pieter Abbeel, Ken Goldberg.
In the International Journal of Robotics Research (IJRR), first published on June 3, 2011 as doi:10.1177/0278364911406562.
Motion Planning with Sequential Convex Optimization and Convex Collision Checking.
John Schulman, Yan Duan, Jonathan Ho, Alex Lee, Ibrahim Awwal, Henry Bradlow, Jia Pan, Sachin Patil, Ken Goldberg, Pieter Abbeel.
In the International Journal of Robotics Research (IJRR), 2014
Gaussian Belief Space Planning with Discontinuities in Sensing Domains.
Sachin Patil, Yan Duan, John Schulman, Ken Goldberg, Pieter Abbeel.
In the proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2014.
Non-Gaussian belief space planning
Platt, R., Tedrake, R., Kaelbling, L., Lozano-Perez, T. Belief space planning assuming maximum likelihood observations, Proceedings of Robotics: Science and Systems 2010 (RSS), Zaragoza, Spain, June 27, 2010.
Platt, R., Kaelbling, L., Lozano-Perez, T., Tedrake, R. Efficient planning in non-Gaussian belief spaces and its application to robot grasping, Proceedings of the International Symposium on Robotics Research, 2011. (Extended version available in CSAIL Tech Report MIT-CSAIL-TR-2011-039)
Platt, R., Kaelbling, L., Lozano-Perez, T., Tedrake, R. Non-Gaussian Belief Space Planning: Correctness and Complexity, IEEE Int'l Conf. on Robotics and Automation, 2012. (The final version of the paper posted here fixes some errors that were present in the proofs in the submitted version.)
Platt, R. Convex receding horizon control in non-Gaussian belief space, Proceedings of the Workshop on the Algorithmic Foundations of Robotics, 2012.
A Scalable Method for Solving High-Dimensional Continuous POMDPs Using Local Approximation
T. Erez, W.D. Smart, Proceedings of the 26th Conference in Uncertainty in Artificial Intelligence (UAI), 2010.
Predictive state representations and spectral methods
Predictive Representations of State.
Michael L. Littman, Richard S. Sutton, Satinder Singh. NIPS 2002
Predictive State Representations: A New Theory for Modeling Dynamical Systems
Satinder Singh, Michael R. James, Matthew R. Rudary. UAI 2004.
Reduced-Rank Hidden Markov Models.
S. M. Siddiqi, B. Boots & G. J. Gordon. AISTATS 2010
Hilbert Space Embeddings of Hidden Markov Models.
L. Song, B. Boots, S. M. Siddiqi, G. J. Gordon & A. J. Smola ICML 2010
Closing the Learning-Planning Loop with Predictive State Representations.
B. Boots, S. M. Siddiqi & G. J. Gordon. RSS 2010 (also a 2011 IJRR paper)
Hilbert Space Embeddings of Predictive State Representations.
B. Boots, A. Gretton, & G. J. Gordon. UAI 2013
[Recent advances]