Concurrent Multi-space Exploration
Toward high-dimensional imitation learning:
Developing intelligent heuristic for randomized motion planners

Rosen Diankov, Nathan Ratliff, David Ferguson, Siddhartha Srinivasa, James Kuffner

High-dimensional motion planning is plagued by the curse-of-dimensionality. Optimal planning becomes intractable in many commonly occurring high-dimensional configuration spaces, rendering our MMP imitation learning framework impractical. Such high-dimensional configuration spaces frequently occur in problems such as manipulation planning, quadupedal locomotion, and nonholonomic planning with dynamics. In these cases, planning paradigms must shift.
Typically, high-dimensional planners relax the optimality requirement and search stochastically for simply a collision-free feasible path through the configuration space. Recently, Rapidly-exploring Randomized Trees (RRTs) have become a workhorse in high-dimensional planning as a result of their wide applicability, implementational simplicity, and fast convergence across a range of practical problems (LaValle et al., 2001).
Unfortunately, characterizing the behavior of an RRT planner in terms of its parameters is difficult. Learning techniques for optimizing performance often resort to policy search reinforcement learning based strategies for blindly tuning planning parameters (Zucker et al., 2008 link). The BiSpace algorithm, depiected in the figure and demonstrated in the video above, makes steps toward addressing this problem while simultaneously easing the application of the RRT algorithm for a wide range of planning queries.


RRT planners are designed to planning between two well-defined points in the configuration space. However, many problems are most naturally described in terms of a general goal condition rather than a specific goal configuration. In the case of manipulation planning, for instance, the goal is a particular grasp configuration or a set of grasp configurations. Any arm and mobile-base configuration attaining one such grasp is considered a goal configuration. For these problems, there often exists a manifold of valid solutions to the planning query. While RRT algorithm typically undergoes a computationally intensive and over-constraining process of choosing one of these configurations arbitrarily as a unique planning goal, our BiSpace algorithm utilizes the full flexibility of the solution class by using a lower-dimensional RRT grown backward from the goal manifold as a heuristic. During the extension phase of RRT growth, the lower-dimensional heuristic RRT periodically attempts to guide the full forward searching RRT toward the goal using workspace control techniques such as Jacobian transpose or Jacobian pseudoinverse control. The above images depict the core components of the algorithm: it plans simultaneously forward through the configuration space and backward through the goal space while periodically extending toward the goal guided by the low-dimensional backward heuristic tree.

While RRTs themselves are difficult to train from expert demonstration, recent work in heuristic learning has shown that using planning-based heuristics can be an effective technique for transferring expert information to a planner (Ratliff et al, 2007a). The BiSpace planning algorithm strives toward imitation learning for RRTs by introducing intelligent heuristics to guide the fast discovery of collision-free feasible solutions.

This research was performed as part of the Personal Robotics project at Intel Research, Pittsburgh.