
2014 Motion Optimization

Broad topics
  • Stochastic optimal control, graphical model representations, linear solvability.
  • Partial observability and predictive state representations
  • Trajectory optimization



Tractable reading list subset



Marc Toussaint and Amos Storkey and Stefan Harmeling: Expectation-Maximization methods for solving (PO)MDPs. In Bayesian Time Series Models, 388-413, Cambridge University Press, 2011.

Konrad Rawlik and Marc Toussaint and Sethu Vijayakumar: Approximate Inference and Stochastic Optimal Control. e-Print arXiv:1009.3958, 2010.

Optimal control theory
Todorov E (2006). In Bayesian Brain: Probabilistic Approaches to Neural Coding, Doya K et al (eds), chap 12, pp 269-298, MIT

Linearly-solvable optimal control
Dvijotham K and Todorov E (2012). In Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Lewis (ed), chap. 6, Wiley and IEEE Press, in press 

General duality between optimal control and estimation
Todorov E (2008). In proceedings of the 47th IEEE Conference on Decision and Control, pp 4286 - 4292 

Value-function approximations for partially observable Markov decision processes.
M. Hauskrecht.
Journal of Artificial Intelligence Research, vol.13, pp. 33-94, 2000

Online Planning Algorithms for POMDPs. [pdf]
S. Ross, J. Pineau, S. Paquet & B. Chaib-draa
In Journal of Artificial Intelligence Research (JAIR), vol. 32, pp. 663-704, 2008.

Non-Gaussian Belief Space Planning: Correctness and Complexity.
Platt, R., Kaelbling, L., Lozano-Perez, T., Tedrake, R.
IEEE Int'l Conf. on Robotics and Automation, 2012. (The final version of the paper posted here fixes some errors that were present in the proofs in the submitted version.)

Sigma Hulls for Gaussian Belief Space Planning for Imprecise Articulated Robots amid Obstacles. 
Alex Lee, Yan (Rocky) Duan, Sachin Patil, John Schulman, Zoe McCarthy, Jur van den Berg, Ken Goldberg, Pieter Abbeel.
In the proceedings of the 26th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013.

Predictive State Representations: A New Theory for Modeling Dynamical Systems.
Satinder Singh, Michael R. James, Matthew R. Rudary. UAI 2004

Reduced-Rank Hidden Markov Models. 
S. M. Siddiqi, B. Boots & G. J. Gordon. AISTATS 2010

Closing the Learning-Planning Loop with Predictive State Representations. 
B. Boots, S. M. Siddiqi & G. J. Gordon. RSS 2010 (also a 2011 IJRR paper)




Full collection of papers




(Stochastic) Optimal control


Optimal control theory
Todorov E (2006). In Bayesian Brain: Probabilistic Approaches to Neural Coding, Doya K et al (eds), chap 12, pp 269-298, MIT

Russ Tedrake. Underactuated Robotics: Learning, Planning, and Control for Efficient and Agile Machines: Course Notes for MIT 6.832. Working draft edition, 2012. 

Konrad Rawlik and Marc Toussaint and Sethu Vijayakumar: An Approximate Inference Approach to Temporal Optimization in Optimal Control. In Proc. Advances in Neural Information Processing Systems (NIPS 2010), 2010. 

Konrad Rawlik and Marc Toussaint and Sethu Vijayakumar: Approximate Inference and Stochastic Optimal Control. e-Print arXiv:1009.3958, 2010.

Marc Toussaint: Robot Trajectory Optimization using Approximate Inference. In Proc. of the Int. Conf. on Machine Learning (ICML 2009), 1049-1056, ACM, 2009.

Stochastic differential dynamic programming
Theodorou E, Tassa Y and Todorov E (2010). In American Control Conference

Iterative linearization methods for approximately optimal control and estimation of non-linear stochastic systems
Li W and Todorov E (2007). International Journal of Control, 80: 1439-1453 

A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems
Todorov E and Li W (2005). In proceedings of the American Control Conference, pp 300-306



Path integrals, KL-control, and Linearly solvable control


Linearly-solvable optimal control
Dvijotham K and Todorov E (2012). In Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Lewis (ed), chap. 6, Wiley and IEEE Press, in press 

Parallels between sensory and motor information processing
Todorov E (2008). In The Cognitive Neurosciences, 4th ed, Gazzaniga (ed), MIT Press 

General duality between optimal control and estimation
Todorov E (2008). In proceedings of the 47th IEEE Conference on Decision and Control, pp 4286 - 4292 

Inverse optimal control with linearly-solvable MDPs
Dvijotham K and Todorov E (2010). In International Conference on Machine Learning

Eigenfunction approximation methods for linearly-solvable optimal control problems
Todorov E (2009). In proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp 161 - 168 

Linearly-solvable Markov decision problems
Todorov E (2006). In Advances in Neural Information Processing Systems 19: 1369-1376, Scholkopf et al (eds), MIT Press



Floating-base trajectory optimization


An integrated system for real-time model-predictive control of humanoid robots
Erez T, Lowrey K, Tassa Y, Kumar V, Kolev S and Todorov E (2013). In IEEE/RAS International Conference on Humanoid Robots [Movie]

Synthesis and stabilization of complex behaviors through online trajectory optimization
Tassa Y, Erez T and Todorov E (2012). In IEEE/RSJ International Conference on Intelligent Robots and Systems [Movie]

Iterative linearization methods for approximately optimal control and estimation of non-linear stochastic systems
Li W and Todorov E (2007). International Journal of Control, 80: 1439-1453

A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems
Todorov E and Li W (2005). In proceedings of the American Control Conference, pp 300-306

A direct method for trajectory optimization of rigid bodies through contact.
Michael Posa, Cecilia Cantu, and Russ Tedrake.
The International Journal of Robotics Research (IJRR), 33(1):69-81, January 2014. [.avi]

M. Fallon, S. Kuindersma, S. Karumanchi, M. Antone, T. Schneider, H. Dai, C. Pérez D'Arpino, R. Deits, M. DiCicco, D. Fourie, T. Koolen, P. Marion, M. Posa, A. Valenzuela, K. Yu, J. Shah, K. Iagnemma, R. Tedrake, S. Teller. An Architecture for Online Affordance-based Perception and Whole-body Planning, May 2014. MIT CSAIL Technical Report 2014-003.


POMDPs


Value-function approximations for partially observable Markov decision processes.
M. Hauskrecht.
Journal of Artificial Intelligence Research, vol.13, pp. 33-94, 2000

Online Planning Algorithms for POMDPs. [pdf]
S. Ross, J. Pineau, S. Paquet & B. Chaib-draa
In Journal of Artificial Intelligence Research (JAIR), vol. 32, pp. 663-704, 2008.

Monte-Carlo Planning in Large POMDPs
David Silver, Joel Veness
Neural Information Processing Systems (NIPS), 2010. [pdf] [video and code]

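A Survey of Monte Carlo Tree Search Methods.
Cameron Browne, Edward Powley, Daniel Whitehouse, Simon Lucas, Peter I. Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez, Spyridon Samothrakis and Simon Colton.
IEEE Transactions on Computational Intelligence and AI in Games, 2012.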

G. Shani, J. Pineau, R. Kaplow. "A survey of point-based POMDP solvers". Autonomous Agents and Multi-Agent Systems. 2012. [.pdf]


Graphical model representations of POMDPs


Marc Toussaint and Amos Storkey and Stefan Harmeling: Expectation-Maximization methods for solving (PO)MDPs. In Bayesian Time Series Models, 388-413, Cambridge University Press, 2011.

Pascal Poupart and Marc Toussaint and Tobias Lang: Analyzing and Escaping Local Optima in Planning as Inference for Partially Observable Domains. In European Conf. on Machine Learning (ECML 2011), 2011.

Marc Toussaint and Laurent Charlin and Pascal Poupart: Hierarchical POMDP Controller Optimization by Likelihood Maximization. In Uncertainty in Artificial Intelligence (UAI 2008), 562-570, AUAI Press, 2008.


Gaussian belief space dynamics


Sigma Hulls for Gaussian Belief Space Planning for Imprecise Articulated Robots amid Obstacles,
Alex Lee, Yan (Rocky) Duan, Sachin Patil, John Schulman, Zoe McCarthy, Jur van den Berg, Ken Goldberg, Pieter Abbeel.
In the proceedings of the 26th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013.

LQG-MP: Optimized Path Planning for Robots with Motion Uncertainty and Imperfect State Information,
Jur van den Berg, Pieter Abbeel, Ken Goldberg.
In the International Journal of Robotics Research (IJRR), first published on June 3, 2011 as doi:10.1177/0278364911406562.

Motion Planning with Sequential Convex Optimization and Convex Collision Checking,
John Schulman, Yan Duan, Jonathan Ho, Alex Lee, Ibrahim Awwal, Henry Bradlow, Jia Pan, Sachin Patil, Ken Goldberg, Pieter Abbeel.
In the International Journal of Robotics Research (IJRR), 2014

Gaussian Belief Space Planning with Discontinuities in Sensing Domains,
Sachin Patil, Yan Duan, John Schulman, Ken Goldberg, Pieter Abbeel.
In the proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2014.



Non-Gaussian belief space planning


Platt, R., Tedrake, R., Kaelbling, L., Lozano-Perez, T. Belief space planning assuming maximum likelihood observations, Proceedings of Robotics: Science and Systems 2010 (RSS), Zaragoza, Spain, June 27, 2010. (Slides from presentation.)

Platt, R., Kaelbling, L., Lozano-Perez, T., Tedrake, R. Efficient planning in non-Gaussian belief spaces and its application to robot grasping, Proceedings of the International Symposium on Robotics Research, 2011. (Extended version available in CSAIL Tech Report MIT-CSAIL-TR-2011-039.)

Platt, R., Kaelbling, L., Lozano-Perez, T., Tedrake, R. Non-Gaussian Belief Space Planning: Correctness and Complexity, IEEE Int'l Conf. on Robotics and Automation, 2012. (The final version of the paper posted here fixes some errors that were present in the proofs in the submitted version.)

Platt, R. Convex receding horizon control in non-Gaussian belief space, Proceedings of the Workshop on the Algorithmic Foundations of Robotics, 2012.

A Scalable Method for Solving High-Dimensional Continuous POMDPs Using Local Approximation
T. Erez, W.D. Smart, Proceedings of the 26th Conference in Uncertainty in Artificial Intelligence (UAI), 2010.


Predictive state representations and spectral methods


Predictive Representations of State.
Michael L. Littman, Richard S. Sutton, Satinder Singh. NIPS 2002

Predictive State Representations: A New Theory for Modeling Dynamical Systems.
Satinder Singh, Michael R. James, Matthew R. Rudary. UAI 2004.

Reduced-Rank Hidden Markov Models. 
S. M. Siddiqi, B. Boots & G. J. Gordon. AISTATS 2010

Hilbert Space Embeddings of Hidden Markov Models. 
L. Song, B. Boots, S. M. Siddiqi, G. J. Gordon & A. J. Smola. ICML 2010

Closing the Learning-Planning Loop with Predictive State Representations. 
B. Boots, S. M. Siddiqi & G. J. Gordon. RSS 2010 (also a 2011 IJRR paper)

Hilbert Space Embeddings of Predictive State Representations. 
B. Boots, A. Gretton, & G. J. Gordon. UAI 2013

[Recent advances]
