# 2014 Motion Optimization

**Broad topics**

Stochastic optimal control, graphical model representations, linear solvability.

Partial observability and predictive state representations

Trajectory optimization

**Tractable reading list subset**

Marc Toussaint and Amos Storkey and Stefan Harmeling: Expectation-Maximization methods for solving (PO)MDPs. In Bayesian Time Series Models, 388-413, Cambridge University Press, 2011.

Konrad Rawlik and Marc Toussaint and Sethu Vijayakumar: Approximate Inference and Stochastic Optimal Control. e-Print arXiv:1009.3958, 201

Todorov E (2006). In Bayesian Brain: Probabilistic Approaches to Neural Coding, Doya K et al (eds), chap 12, pp 269-298, MIT

Linearly-solvable optimal control

Dvijotham K and Todorov E (2012). In Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Lewis (ed), chap. 6, Wiley and IEEE Press, in press

General duality between optimal control and estimation

Todorov E (2008). In proceedings of the 47th IEEE Conference on Decision and Control, pp 4286 - 4292

Value-function approximations for partially observable Markov decision processes.

M. Hauskrecht.

Journal of Artificial Intelligence Research, vol.13, pp. 33-94, 2000

Online Planning Algorithms for POMDPs.

S. Ross, J. Pineau, S. Paquet & B. Chaib-draa

In Journal of Artificial Intelligence Research (JAIR), vol. 32, p. 663-704, 2008.

Non-Gaussian Belief Space Planning: Correctness and Complexity,

Platt, R., Kaelbling, L., Lozano-Perez, T., Tedrake, R.

IEEE Int'l Conf. on Robotics and Automation, 2012. (The final version of the paper posted here fixes some errors that were present in the proofs in the submitted version.)

Sigma Hulls for Gaussian Belief Space Planning for Imprecise Articulated Robots amid Obstacles.

Alex Lee, Yan (Rocky) Duan, Sachin Patil, John Schulman, Zoe McCarthy, Jur van den Berg, Ken Goldberg, Pieter Abbeel.

In the proceedings of the 26th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013.

Predictive State Representations: A New Theory for Modeling Dynamical Systems

Satinder Singh, Michael R. James, Matthew R. Rudary. UAI 2004

Reduced-Rank Hidden Markov Models.

S. M. Siddiqi, B. Boots & G. J. Gordon. AISTATS 2010

Closing the Learning-Planning Loop with Predictive State Representations.

B. Boots, S. M. Siddiqi & G. J. Gordon. RSS 2010 (also a 2011 IJRR paper)

**Full collection of papers**

**(Stochastic) Optimal control**

Todorov E (2006). In Bayesian Brain: Probabilistic Approaches to Neural Coding, Doya K et al (eds), chap 12, pp 269-298, MIT

Russ Tedrake. Underactuated Robotics: Learning, Planning, and Control for Efficient and Agile Machines: Course Notes for MIT 6.832. Working draft edition, 2012.

Konrad Rawlik and Marc Toussaint and Sethu Vijayakumar: An Approximate Inference Approach to Temporal Optimization in Optimal Control. In Proc. Advances in Neural Information Processing Systems (NIPS 2010), 2010.

Konrad Rawlik and Marc Toussaint and Sethu Vijayakumar: Approximate Inference and Stochastic Optimal Control. e-Print arXiv:1009.3958, 201

Marc Toussaint: Robot Trajectory Optimization using Approximate Inference. In Proc. of the Int. Conf. on Machine Learning (ICML 2009), 1049-1056, ACM, 2009.

Stochastic differential dynamic programming

Theodorou E, Tassa Y and Todorov E (2010). In American Control Conference

Li W and Todorov E (2007). International Journal of Control, 80: 1439-1453

Todorov E and Li W (2005). In proceedings of the American Control Conference, pp 300-306

**Path integrals, KL-control, and Linearly solvable control**

Linearly-solvable optimal control

Dvijotham K and Todorov E (2012). In Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Lewis (ed), chap. 6, Wiley and IEEE Press, in press

Parallels between sensory and motor information processing

Todorov E (2008). In The Cognitive Neurosciences, 4th ed, Gazzaniga (ed), MIT Press

General duality between optimal control and estimation

Todorov E (2008). In proceedings of the 47th IEEE Conference on Decision and Control, pp 4286 - 4292

Inverse optimal control with linearly-solvable MDPs

Dvijotham K and Todorov E (2010). In International Conference on Machine Learning

Eigenfunction approximation methods for linearly-solvable optimal control problems

Todorov E (2009). In proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp 161 - 168

Linearly-solvable Markov decision problems

Todorov E (2006). In Advances in Neural Information Processing Systems 19: 1369-1376, Scholkopf et al (eds), MIT Press

**Floating-base trajectory optimization**

An integrated system for real-time model-predictive control of humanoid robots

Erez T, Lowrey K, Tassa Y, Kumar V, Kolev S and Todorov E (2013). In IEEE/RAS International Conference on Humanoid Robots

Synthesis and stabilization of complex behaviors through online trajectory optimization

Tassa Y, Erez T and Todorov E (2012). In IEEE/RSJ International Conference on Intelligent Robots and Systems

Li W and Todorov E (2007). International Journal of Control, 80: 1439-1453

Todorov E and Li W (2005). In proceedings of the American Control Conference, pp 300-306

A direct method for trajectory optimization of rigid bodies through contact.

Michael Posa, Cecilia Cantu, and Russ Tedrake.

The International Journal of Robotics Research (IJRR), 33(1):69-81, January 2014.

M. Fallon, S. Kuindersma, S. Karumanchi, M. Antone, T. Schneider, H. Dai, C. Pérez D'Arpino, R. Deits, M. DiCicco, D. Fourie, T. Koolen, P. Marion, M. Posa, A. Valenzuela, K. Yu, J. Shah, K. Iagnemma, R. Tedrake, S. Teller. An Architecture for Online Affordance-based Perception and Whole-body Planning, May 2014. MIT CSAIL Technical Report 2014-003.

**POMDPs**

Value-function approximations for partially observable Markov decision processes.

M. Hauskrecht.

Journal of Artificial Intelligence Research, vol.13, pp. 33-94, 2000

Online Planning Algorithms for POMDPs.

S. Ross, J. Pineau, S. Paquet & B. Chaib-draa

In Journal of Artificial Intelligence Research (JAIR), vol. 32, p. 663-704, 2008.

Monte-Carlo Planning in Large POMDPs

David Silver, Joel Veness

Neural Information Processing Systems (NIPS), 2010

Integrated Perception and Planning in the Continuous Space: A POMDP Approach

Haoyu Bai, David Hsu, Wee Sun Lee

A Survey of Monte Carlo Tree Search Methods

Cameron Browne, Edward Powley, Daniel Whitehouse, Simon Lucas, Peter I. Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis and Simon Colton

G. Shani, J. Pineau, R. Kaplow. "A survey of point-based POMDP solvers". Autonomous Agents and Multi-Agent Systems. 2012.

**Graphical model representations of POMDPs**

Marc Toussaint and Amos Storkey and Stefan Harmeling: Expectation-Maximization methods for solving (PO)MDPs. In Bayesian Time Series Models, 388-413, Cambridge University Press, 2011.

Pascal Poupart and Marc Toussaint and Tobias Lang: Analyzing and Escaping Local Optima in Planning as Inference for Partially Observable Domains. In European Conf. on Machine Learning (ECML 2011), 2011.

Marc Toussaint and Laurent Charlin and Pascal Poupart: Hierarchical POMDP Controller Optimization by Likelihood Maximization. In Uncertainty in Artificial Intelligence (UAI 2008), 562-570, AUAI Press, 2008.

**Gaussian belief space dynamics**

Sigma Hulls for Gaussian Belief Space Planning for Imprecise Articulated Robots amid Obstacles,

Alex Lee, Yan (Rocky) Duan, Sachin Patil, John Schulman, Zoe McCarthy, Jur van den Berg, Ken Goldberg, Pieter Abbeel.

In the proceedings of the 26th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013.

LQG-MP: Optimized Path Planning for Robots with Motion Uncertainty and Imperfect State Information,

Jur van den Berg, Pieter Abbeel, Ken Goldberg.

In the International Journal of Robotics Research (IJRR), first published on June 3, 2011 as doi:10.1177/0278364911406562.

Motion Planning with Sequential Convex Optimization and Convex Collision Checking,

John Schulman, Yan Duan, Jonathan Ho, Alex Lee, Ibrahim Awwal, Henry Bradlow, Jia Pan, Sachin Patil, Ken Goldberg, Pieter Abbeel.

In the International Journal of Robotics Research (IJRR), 2014

Gaussian Belief Space Planning with Discontinuities in Sensing Domains,

Sachin Patil, Yan Duan, John Schulman, Ken Goldberg, Pieter Abbeel.

In the proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2014.

**Non-Gaussian belief space planning**

Platt, R., Tedrake, R., Kaelbling, L., Lozano-Perez, T. Belief space planning assuming maximum likelihood observations, Proceedings of Robotics: Science and Systems 2010 (RSS), Zaragoza, Spain, June 27, 2010.

Platt, R., Kaelbling, L., Lozano-Perez, T., Tedrake, R. Efficient planning in non-Gaussian belief spaces and its application to robot grasping, Proceedings of the International Symposium on Robotics Research, 2011. (Extended version available in CSAIL Tech Report MIT-CSAIL-TR-2011-039)

Platt, R., Kaelbling, L., Lozano-Perez, T., Tedrake, R. Non-Gaussian Belief Space Planning: Correctness and Complexity, IEEE Int'l Conf. on Robotics and Automation, 2012. (The final version of the paper posted here fixes some errors that were present in the proofs in the submitted version.)

Platt, R. Convex receding horizon control in non-Gaussian belief space, Proceedings of the Workshop on the Algorithmic Foundations of Robotics, 2012.

A Scalable Method for Solving High-Dimensional Continuous POMDPs Using Local Approximation

T. Erez, W.D. Smart, Proceedings of the 26th Conference in Uncertainty in Artificial Intelligence (UAI), 2010.

**Predictive state representations and spectral methods**

Predictive Representations of State.

Michael L. Littman, Richard S. Sutton, Satinder Singh. NIPS 2002

Predictive State Representations: A New Theory for Modeling Dynamical Systems

Satinder Singh, Michael R. James, Matthew R. Rudary. UAI 2004.

Reduced-Rank Hidden Markov Models.

S. M. Siddiqi, B. Boots & G. J. Gordon. AISTATS 2010

Hilbert Space Embeddings of Hidden Markov Models.

L. Song, B. Boots, S. M. Siddiqi, G. J. Gordon & A. J. Smola. ICML 2010

Closing the Learning-Planning Loop with Predictive State Representations.

B. Boots, S. M. Siddiqi & G. J. Gordon. RSS 2010 (also a 2011 IJRR paper)

Hilbert Space Embeddings of Predictive State Representations.

B. Boots, A. Gretton, & G. J. Gordon. UAI 2013
