Heuristic learning

Planners Training Planners

Extracting the navigational path of a high-dimensional footstep plan

Nathan Ratliff, Joel Chestnutt, J. Andrew Bagnell

Successfully reacting to terrain uncertainty and execution error requires fast re-planning capabilities. MMP with LEARCH optimization gave us a 120x speedup in high-dimensional bipedal footstep planning by allowing us to train a navigational A* planner to act as an efficient, well-informed heuristic. This heuristic predicts the high-level navigational path that would be traced out by an optimal sequence of footsteps, such as the sequence recovered by a higher-dimensional admissible A* footstep planner executed between the start and goal configurations. The image below depicts the data collection process for quadrupedal locomotion. The red line overlying the sequence of footsteps found by the footstep planner, shown here as multi-colored squares, demarcates the high-level navigational trajectory used for training. In essence, the high-dimensional footstep planner trains a low-dimensional navigational planner to act as its heuristic.
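As a rough illustration of how a navigational training path might be read off a footstep plan, the sketch below collapses each stance to a single body position and snaps it to a grid cell. The text does not say how the body trajectory is actually computed, so the centroid-of-feet rule, the `navigational_path` name, and the `cell_size` parameter are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]   # (x, y) position of one foot
Cell = Tuple[int, int]        # grid cell index in the navigational map


def navigational_path(
    footstep_plan: Sequence[Sequence[Point]],
    cell_size: float = 1.0,
) -> List[Cell]:
    """Project a sequence of stances onto a 2D grid path.

    ASSUMPTION: the body position of a stance is approximated by the
    centroid of its foot positions. This is a stand-in, not necessarily
    the rule used to draw the training trajectory.
    """
    path: List[Cell] = []
    for stance in footstep_plan:
        cx = sum(p[0] for p in stance) / len(stance)
        cy = sum(p[1] for p in stance) / len(stance)
        cell = (int(cx // cell_size), int(cy // cell_size))
        # Several consecutive stances can fall in one cell; keep it once.
        if not path or path[-1] != cell:
            path.append(cell)
    return path


# Hypothetical two-foot plan moving along the x axis.
plan = [
    [(0.0, 0.0), (0.4, 0.0)],
    [(0.4, 0.0), (0.8, 0.0)],
    [(0.8, 0.0), (1.6, 0.0)],
]
example_path = navigational_path(plan, cell_size=1.0)
```

Consecutive cells in the result are not guaranteed to be grid neighbors when steps are long relative to `cell_size`, so a real pipeline would likely need to interpolate between them before using the path as a training example.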

The video above shows the LittleDog robot executing a plan generated under a quadrupedal variant of the learned heuristic. Initial results published in (Ratliff et al., 2007a) were produced by first training an A* planner to imitate the high-level motion of the robot’s body, and then scaling the resulting plan costs to give a heuristic value. In later work (Ratliff et al., 2007c), we refined our results using a novel structured regression approach. Rather than first training the navigational planner to reproduce the desired trajectory, our new approach let us explicitly train the heuristic to predict cost-to-go values directly from a particular four-foot configuration while still utilizing a two-dimensional navigational A* planner in the inner loop.
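The first approach described above, scaling the cost of a two-dimensional plan to obtain a heuristic value, can be sketched roughly as follows. This is a minimal illustration under stated assumptions rather than the published implementation: the grid `cost_map` stands in for a learned cost function, the 4-connected grid, the `scale` factor, and the function names are hypothetical, and Dijkstra's algorithm run backward from the goal replaces repeated A* queries so that every cell's cost-to-go is computed once.

```python
import heapq
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def cost_to_go(cost_map: List[List[float]], goal: Cell) -> Dict[Cell, float]:
    """Cheapest 2D path cost from every reachable cell to `goal`.

    Convention (an assumption): stepping into a cell costs that cell's
    entry in `cost_map`; moves are 4-connected.
    """
    rows, cols = len(cost_map), len(cost_map[0])
    dist: Dict[Cell, float] = {goal: 0.0}
    frontier = [(0.0, goal)]
    while frontier:
        d, (r, c) = heapq.heappop(frontier)
        if d > dist.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Searching backward: a forward move from (nr, nc)
                # into (r, c) pays the cost of entering (r, c).
                nd = d + cost_map[r][c]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(frontier, (nd, (nr, nc)))
    return dist


def heuristic(body_cell: Cell, table: Dict[Cell, float], scale: float) -> float:
    """Scaled navigational cost-to-go, used as a footstep-planner heuristic."""
    return scale * table.get(body_cell, float("inf"))


# Hypothetical 3x3 cost map with an expensive center cell.
cost_map = [
    [1.0, 1.0, 1.0],
    [1.0, 5.0, 1.0],
    [1.0, 1.0, 1.0],
]
table = cost_to_go(cost_map, goal=(2, 2))
h_start = heuristic((0, 0), table, scale=0.5)
```

Precomputing the table means each heuristic lookup during the high-dimensional search is a dictionary access. The value of `scale` controls how strongly the footstep planner trusts the navigational estimate, and in this sketch it is a free parameter.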