# 2010 Control reading group

Copied from Adroit Robotics page

**Purpose:** To study and develop the state-of-the-art in nonlinear control, particularly as it applies to motion planning. Below we list relevant literature including subtopics of control theory, functional gradients, optimization on Riemannian manifolds, and spectral algorithms for learning graphical models.

**Time/date:** Wednesdays 5-6pm Intel; the current paper is tagged as **[active]** below.

**Control Theory**

Emanuel Todorov. Optimal Control Theory. Bayesian Brain. MIT Press, 2006.

[*discussion*] Marc Toussaint, Christian Goerick (2010): A Bayesian view on motor control and planning. In Olivier Sigaud, Jan Peters (Eds.): *From motor to interaction learning in robots*, Springer, print expected in 2010.

Marc Toussaint (2009): Robot Trajectory Optimization using Approximate Inference. *26th International Conference on Machine Learning* (ICML 2009).

[*discussion*] J. van den Berg, P. Abbeel and K. Goldberg. LQG-MP: Optimized Path Planning for Robots with Motion Uncertainty and Imperfect State Information. RSS 2010

B. van den Broek, W. Wiegerinck, and H. J. Kappen. Graphical model inference in optimal control of stochastic multi-agent systems. Journal of Artificial Intelligence Research, 32 (1):95–122, 2008.

H. J. Kappen. Linear theory for control of nonlinear stochastic systems. Phys. Rev. Lett., 95:200201, Nov 2005a.

H. J. Kappen. Path integrals and symmetry breaking for optimal control theory. Journal of Statistical Mechanics: Theory and Experiment, (11):P11011, 2005b.

H. J. Kappen. An introduction to stochastic control theory, path integrals and reinforcement learning. In J. Marro, P. L. Garrido, and J. J. Torres, editors, Cooperative Behavior in Neural Systems, volume 887 of American Institute of Physics Conference Series, pages 149–181, February 2007.

H. J. Kappen, W. Wiegerinck, and B. van den Broek. A path integral approach to agent planning. In AAMAS, 2007.

H. J. Kappen, V. Gómez, and M. Opper. Optimal control as a graphical model inference problem. Journal of Machine Learning Research (JMLR), arXiv:0901.0633v, 2009.

J. Peters and S. Schaal. Learning to control in operational space. International Journal of Robotics Research, 27:197–212, 2008c.

Evangelos Theodorou, Jonas Buchli and Stefan Schaal. A Generalized path integral approach to reinforcement learning. *Journal of Machine Learning Research* (JMLR), 2010.

Theodorou E, Tassa Y and Todorov E. Stochastic differential dynamic programming. In *American Control Conference (2010).*

Todorov E (2010). *Implicit nonlinear complementarity: A new approach to contact dynamics*. In International Conference on Robotics and Automation

Todorov E (2009). *Compositionality of optimal control laws*. In Advances in Neural Information Processing Systems 22, pp 1856-1864, Bengio et al (eds), MIT Press

Todorov E (2009). *Eigenfunction approximation methods for linearly-solvable optimal control problems*. In proceedings of the 2nd IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, pp 161-168

**[active]** Todorov E and Li W (2005). *A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems*. In proceedings of the American Control Conference, pp 300-306

Todorov E (2008). *General duality between optimal control and estimation*. In proceedings of the 47th IEEE Conference on Decision and Control, pp 4286-4292

**Functional Gradients**

Nathan Ratliff. Learning to Search: Structured Prediction Techniques for Imitation Learning. ch 4,6. doctoral dissertation, tech. report CMU-RI-TR-09-19, Robotics Institute, Carnegie Mellon University, May, 2009

D. Munoz, J. A. Bagnell, N. Vandapel, M. Hebert. Contextual Classification with Functional Max-Margin Markov Networks. CVPR 2009

Friedman, J. H. “Greedy Function Approximation: A Gradient Boosting Machine.” (Feb. 1999a)

L. Mason, J. Baxter, P. Bartlett, M. Frean. Boosting Algorithms as Gradient Descent. 2000.

**Optimization on Riemannian Manifolds**

Y. Yang. Globally Convergent Optimization Algorithms on Riemannian Manifolds: Uniform Framework for Unconstrained and Constrained Optimization. Journal of Optimization Theory and Applications: Vol. 132, No. 2, pp. 245-265, February 2007

F. Alvarez, J. Bolte, O. Brahic. Hessian Riemannian Gradient Flows in Convex Programming. SIAM J. Control Optim. Vol. 43, No. 2, pp. 477-501. 2004

F. Alvarez, J. Lopez. Convergence to the optimal value for barrier methods combined with Hessian Riemannian gradient flows and generalized proximal algorithms. 2010

F. Alvarez, J. Bolte, J. Munier. A Unifying Local Convergence Result for Newton’s Method in Riemannian Manifolds. 2004

C. Samir, P. Absil, A. Srivastava, E. Klassen. A Gradient-Descent Method for Curve Fitting on Riemannian Manifolds. Tech. Report UCL-INMA-2009.

C. Samir, P. Absil, A. Srivastava, E. Klassen. Fitting Curves on Riemannian Manifolds Using Energy Minimization. MVA2009 IAPR Conference on Machine Vision Applications, Japan, 2009

Christopher Baker’s webpage has numerous links: http://www.math.fsu.edu/~cbaker/GenRTR/?page=links

General optimization resource: http://www.optimization-online.org/

**Spectral methods for HMMs and PSRs**

D. Hsu, S. Kakade, T. Zhang. A spectral algorithm for learning hidden Markov models. *COLT* 2009

S. Siddiqi, B. Boots, G. Gordon. Reduced-Rank Hidden Markov Models. AISTATS 2010

B. Boots, G. Gordon. Predictive State Temporal Difference Learning. NIPS 2010. http://arxiv.org/abs/1011.0041

B. Boots, S. Siddiqi, G. Gordon. Closing the Learning-Planning Loop with Predictive State Representations. RSS 2010

A. Smola, A. Gretton, L. Song, B. Schölkopf. A Hilbert Space Embedding for Distributions. Algorithmic Learning Theory, Springer, 2007

L. Song, J. Huang, A. Smola, K. Fukumizu. Hilbert Space Embeddings of Conditional Distributions with Applications to Dynamical Systems. ICML 2009.

L. Song, B. Boots, S. Siddiqi, G. Gordon, A. Smola. Hilbert space embeddings of hidden Markov models. ICML 2010