Pedagogy

Robotics intersects with many different fields (Mechanical Engineering, Control Engineering, Applied Mathematics, Physics, Machine Learning, Artificial Intelligence, and more); it brings together a diverse community to focus on the very specific problem of autonomy. The sheer breadth of these subjects means that we require our robotics students to absorb a very broad range of background knowledge in order to make a dent in the field.

These documents are a collection of references I've written on various topics directed specifically toward this audience. Since many references already exist on these topics, I try in these documents to normalize the information as much as possible to make it more broadly accessible. Roboticists come from diverse backgrounds, but most references are littered with jargon and assume a very specific technical audience. I try not to be loose on the details—correctness and completeness are paramount—but, in favor of intuition, visualization, connection, and synthesis, the level of rigor tends to fall more toward the side of the intuitive physicist than that of the meticulous mathematician. Please email me with any comments or corrections you may have.

Mathematics for Intelligent Systems. These documents cover a number of fundamental mathematical ideas and tools required for in-depth exploration of robotics, machine learning, optimization, and other intelligent systems. They start with a discussion of linear algebra from a geometric and coordinate-free viewpoint, and its relationship to advanced calculus and, in particular, the geometry of smooth mappings between spaces. They then move into probability and statistical analysis, covering graphical models, some common statistical bounds, and decision-theoretic frameworks.

Advanced Robotics: Analytical Dynamics, Optimal Control, and Inverse Optimal Control. These documents develop legged and floating robotics from the bottom up, starting with a study of the fundamental building blocks of control design, analytical dynamics, and continuing through to floating-based instantaneous control, optimal control, and imitation learning through inverse optimal control. I originally wrote them for the first half of a course I taught on Advanced Robotics at the University of Stuttgart, summer semester 2014.

Probabilistic inference, Gaussians, quadratics, and the Kalman filter. Gaussian distributions are perhaps the most important distributions you’ll ever encounter, not because they represent everything well (by themselves, they’re usually poor approximations to complex systems), but because they’re one of the only high-dimensional distributions we can handle analytically. Because of that, many approaches to more complicated problems revolve around reducing those problems to Gaussian approximations, or sequences of Gaussian approximations, to leverage the tractable algebra of Gaussian inference, which itself largely reduces to linear algebra. That’s not to say Gaussian manipulations are easy; in many ways the algebra can be tedious. But a thorough understanding of their properties, their relation to quadratic functions and linear systems, and their manipulation in the context of probabilistic inference (for such queries as the transition and observation updates of a Kalman filter) is crucial for a strong foundation for further study within the uncertainty-laden domains of state estimation, localization, mapping, and sensor processing in mobile robotics.
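To make the "Gaussian inference reduces to linear algebra" point concrete, here is a minimal sketch of the two Kalman filter queries mentioned above. The function and variable names (A, C, Q, R, and so on) are generic textbook placeholders of my choosing, not notation taken from the documents themselves:

```python
import numpy as np

def transition_update(mu, P, A, Q):
    """Push the belief N(mu, P) through the dynamics x' = A x + w, w ~ N(0, Q).

    The result is again Gaussian; only matrix products are needed.
    """
    return A @ mu, A @ P @ A.T + Q

def observation_update(mu, P, C, R, y):
    """Condition the belief N(mu, P) on an observation y = C x + v, v ~ N(0, R)."""
    S = C @ P @ C.T + R                  # innovation covariance
    K = P @ C.T @ np.linalg.inv(S)       # Kalman gain
    mu_new = mu + K @ (y - C @ mu)       # shift mean toward the observation
    P_new = (np.eye(len(mu)) - K @ C) @ P  # observing shrinks uncertainty
    return mu_new, P_new
```

Both updates are closed-form precisely because products and conditionals of Gaussians stay Gaussian; every step is a matrix multiply or inverse.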

Information Geometry and Natural Gradients. This document reviews some of the basic concepts behind natural gradients. We start by introducing basic information theoretic concepts such as optimal codes, entropy, and the KL-divergence. We then demonstrate how the choice of metric on perturbations can significantly affect the performance of gradient descent algorithms. Within that discussion, we review the Method of Lagrange Multipliers for equality constrained optimization as well as intuitive interpretations of the action of positive definite matrices and their inverses and how that relates to their role in generalized gradient descent updates. Finally, we review how the second-order Taylor expansion of the KL-divergence between a distribution and a slightly perturbed version of that distribution leads to the notion of the Fisher Information as a natural metric on the manifold of probability distributions.
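The final point, that the second-order expansion of the KL-divergence yields the Fisher Information, can be checked numerically in a toy one-dimensional Gaussian setting. This is my own illustrative sketch, not code from the document: for a small perturbation d of the mean, KL(N(mu, s^2) || N(mu + d, s^2)) should match (1/2) d^2 F, where F = 1/s^2 is the Fisher information of the mean parameter:

```python
import numpy as np

def kl_gaussian(mu0, s0, mu1, s1):
    """Closed-form KL( N(mu0, s0^2) || N(mu1, s1^2) )."""
    return np.log(s1 / s0) + (s0**2 + (mu0 - mu1)**2) / (2 * s1**2) - 0.5

def fisher_mean(s):
    """Fisher information of the mean parameter of N(mu, s^2)."""
    return 1.0 / s**2

# For a small mean perturbation d, the KL behaves like a quadratic
# form in d with the Fisher information as its metric:
d, s = 1e-3, 2.0
kl = kl_gaussian(0.0, s, d, s)
quadratic = 0.5 * d**2 * fisher_mean(s)
```

Here `kl` and `quadratic` agree to numerical precision, which is exactly the statement that the Fisher Information acts as a natural metric on small perturbations of a distribution.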