Learning from Human Teleoperation
Predicting which foothold a human operator would choose next.
Nathan Ratliff, Joel Chestnutt, J. Andrew Bagnell

Early experiments showed that our footstep cost functions erroneously assumed that the terrain would have a high friction coefficient. When this assumption did not hold, the robot's performance degraded substantially because of slippage during execution. The relative ease of robot teleoperation allowed us to demonstrate robust solutions with footholds qualitatively different from those found by the automated footstep planner. We used LEARCH optimization under the MMP framework to generalize this demonstrated behavior in two ways. First, we modeled the cost of a step using not only features of the terrain but also features of the action. This combination of features enabled us both to interpret the surrounding terrain and to encode constraints on the kinematics of the robot. Following this approach, we trained a next-foothold prediction policy that was stable enough to greedily generate a sequence of footsteps across rugged terrain given a fixed foot order sequence (Ratliff et al., 2007d). Second, we demonstrated that although the predictor was trained using examples of rugged terrain, it learned a robust action model that could successfully traverse flat terrain at an even cadence. The videos below demonstrate the learning process and the performance of the learned footstep prediction policy across rugged terrain.
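To make the greedy prediction step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear step cost over concatenated terrain and action features, and it rolls that cost out greedily under a fixed foot order. Every name in it (`terrain_features`, `action_features`, `CANDIDATE_OFFSETS`, the nominal stride, and the hand-picked weights) is hypothetical and chosen for illustration only. In the work described above, the cost would be learned from teleoperated demonstrations with LEARCH rather than set by hand.

```python
import math

# Hypothetical candidate displacements (metres) considered for each step.
CANDIDATE_OFFSETS = [(dx, dy) for dx in (0.1, 0.2, 0.3) for dy in (-0.05, 0.0, 0.05)]


def terrain_features(terrain, foothold, eps=0.01):
    """Assumed terrain features: local height and slope magnitude."""
    x, y = foothold
    dhdx = (terrain(x + eps, y) - terrain(x - eps, y)) / (2 * eps)
    dhdy = (terrain(x, y + eps) - terrain(x, y - eps)) / (2 * eps)
    return [terrain(x, y), math.hypot(dhdx, dhdy)]


def action_features(offset, nominal_stride=0.2):
    """Assumed action features: deviation from a nominal forward stride."""
    dx, dy = offset
    return [(dx - nominal_stride) ** 2, dy ** 2]


def step_cost(weights, terrain, current, offset):
    """Linear cost over the concatenated terrain and action features."""
    foothold = (current[0] + offset[0], current[1] + offset[1])
    feats = terrain_features(terrain, foothold) + action_features(offset)
    return sum(w * f for w, f in zip(weights, feats))


def greedy_footsteps(weights, terrain, start, foot_order, n_steps):
    """Pick the lowest-cost next foothold for each foot in a fixed order."""
    feet = dict(start)
    plan = []
    for i in range(n_steps):
        foot = foot_order[i % len(foot_order)]
        current = feet[foot]
        best = min(CANDIDATE_OFFSETS,
                   key=lambda o: step_cost(weights, terrain, current, o))
        feet[foot] = (current[0] + best[0], current[1] + best[1])
        plan.append((foot, feet[foot]))
    return plan


if __name__ == "__main__":
    flat = lambda x, y: 0.0                     # flat ground
    weights = [1.0, 1.0, 1.0, 1.0]              # stand-in for learned weights
    start = {"left": (0.0, 0.1), "right": (0.0, -0.1)}
    for foot, pos in greedy_footsteps(weights, flat, start, ["left", "right"], 4):
        print(foot, pos)
```

Because the action features penalize deviation from a nominal stride, this toy cost falls back to evenly spaced steps whenever the terrain features are uninformative, which loosely mirrors the flat-terrain behavior described above.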