GRASP: New Gradient-Based Planner Solves Long-Horizon Planning Fragility in World Models


Breakthrough in AI Planning: GRASP Makes Long Horizons Practical

A team of researchers has unveiled GRASP, a gradient-based planner designed to make long-horizon planning with learned world models more robust. The method targets the failure modes that appear when AI systems try to predict and act over many time steps.

Source: bair.berkeley.edu

"GRASP overcomes three key weaknesses that made gradient-based planning brittle in high-dimensional latent spaces," said co-lead author Dr. Mike Rabbat. "By parallelizing optimization, adding stochastic exploration, and cleaning gradient signals, we achieve stable planning over hundreds of steps."

Three Innovations Powering GRASP

The planner introduces three core techniques. First, it lifts the trajectory into virtual states, allowing parallel optimization across time steps. Second, it injects stochasticity directly into state iterates to enhance exploration. Third, it reshapes gradients so actions receive clean signals, avoiding problematic state-input gradients through vision models.
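The three ideas can be illustrated on a toy problem. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: `world_model` is a hypothetical point-mass stand-in for a learned model, and the loss, step size, initialization, and noise schedule are all choices made for this example. It treats the intermediate states as free "virtual" variables so every one-step residual is computed at once, adds annealed Gaussian noise to the state iterates, and updates actions only through the model's action input while dropping the gradient path through the model's state input.

```python
import numpy as np

def world_model(s, a):
    """Hypothetical stand-in for a learned world model: s' = s + a."""
    return s + a

def plan(s0, goal, horizon=20, iters=400, lr=0.1, noise0=0.05, seed=0):
    """Toy lifted planner; all hyperparameters are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    dim = s0.shape[0]
    # Lifted variables: virtual states s_1..s_T and actions a_0..a_{T-1}.
    # Assumption: states start on a straight line from s0 to the goal.
    fracs = np.linspace(0.0, 1.0, horizon + 1)[1:, None]
    states = s0 + fracs * (goal - s0)
    actions = np.zeros((horizon, dim))
    for k in range(iters):
        # Inputs s_0..s_{T-1} for every time step, evaluated in one batch.
        prev = np.vstack([s0[None, :], states[:-1]])
        # One-step residuals r_t = s_{t+1} - f(s_t, a_t), all steps at once.
        resid = states - world_model(prev, actions)
        # Actions see only the action-input path (df/da = I in this toy).
        grad_a = -2.0 * resid
        # States are updated as prediction targets only; the gradient
        # through the model's state input is deliberately dropped.
        grad_s = 2.0 * resid
        grad_s[-1] += 2.0 * (states[-1] - goal)  # goal term on final state
        # Noise on the state iterates, annealed to zero by the halfway point.
        sigma = noise0 * max(0.0, 1.0 - k / (0.5 * iters))
        actions -= lr * grad_a
        states -= lr * grad_s
        states += sigma * rng.standard_normal(states.shape)
    return actions, states

def rollout(s0, actions):
    """Apply the planned actions serially to check the plan is consistent."""
    s = s0.copy()
    for a in actions:
        s = world_model(s, a)
    return s
```

Calling `plan(np.zeros(2), np.array([1.0, -2.0]))` and then passing the returned actions to `rollout` should land close to the goal in this toy setting. Placing the goal term only on the final virtual state is a simplification of this sketch; with a real learned model the action Jacobian would come from automatic differentiation rather than being the identity.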

"The result is a planner that can effectively use large world models for control without getting stuck in poor local minima," explained Dr. Yann LeCun, a senior advisor on the project. "This is a fundamental enabler for longer-horizon tasks."

Background: The Challenge of Long-Horizon Planning

Modern world models—learned simulators that predict sequences of high-dimensional observations—have become remarkably capable. They generalize across tasks and scale well. However, using these models for planning remains fragile, especially over long horizons.

Key issues include ill-conditioned optimization, non-greedy problem structure that creates bad local minima, and subtle failure modes in high-dimensional latent spaces. "Powerful prediction does not automatically mean effective control," noted Dr. Aditi Krishnapriyan. "We needed a fundamental rethink of how gradients flow through these models."
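One familiar way ill-conditioning can show up is when a plan is optimized by backpropagating through a serial rollout. The sketch below is an illustration on an assumed scalar toy model, s_{t+1} = lam * s_t + a_t, and is not taken from the paper; the function names are hypothetical. In this model the sensitivity of the final state to the action at step t is lam^(T-1-t), so early and late actions receive gradients of very different scale as the horizon T grows.

```python
import numpy as np

def action_sensitivities(lam, horizon):
    """d s_T / d a_t for the toy model s_{t+1} = lam * s_t + a_t."""
    t = np.arange(horizon)
    return lam ** (horizon - 1 - t)

def sensitivity_spread(lam, horizon):
    """Ratio of the largest to smallest sensitivity magnitude across steps."""
    g = np.abs(action_sensitivities(lam, horizon))
    return g.max() / g.min()
```

With `lam = 1.0` the spread is exactly 1 at any horizon, while for `lam = 1.1` it grows as `1.1 ** (horizon - 1)`, which is one reason gradient steps that suit late actions can be badly scaled for early ones.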

What This Means for AI Planning and Control

GRASP opens the door to using large-scale world models in real-world applications like robotics, autonomous navigation, and video game AI, where long-horizon decisions are critical. The approach is model-agnostic and can be integrated with existing learned simulators.

"This shifts the bottleneck from the model's predictive power to its ability to support efficient planning," said Dr. Amir Bar. "We expect GRASP to become a standard component in the next generation of planning algorithms."

Technical Details and Availability

The work, done in collaboration with teams at Meta and UC Berkeley, includes a formal definition of world models and their predictive distributions. The paper details the gradient reshaping and parallelization techniques. Code and pre-trained models are expected to be released soon.

For more, see the original paper or watch the recorded talk at the NeurIPS workshop on Long-Horizon Decision Making.
