Designing Good Heuristics

Unit 3: Search Techniques for Problem Solving — Section 3.5

A* with a perfect heuristic would expand only nodes on the optimal path — no wasted work. A* with a zero heuristic degrades to uniform-cost search. The difference between these extremes is the quality of h(n). This section teaches you what makes a heuristic "good" and a systematic method for designing one.

From Section 3.4: A heuristic h(n) is admissible if h(n) ≤ h*(n) for all nodes n, where h*(n) is the true optimal cost from n to the goal. Admissibility is the minimum requirement for A* to guarantee an optimal solution.

Four Properties of a Good Heuristic

Property Definition Why It Matters

Admissible

h(n) ≤ h*(n) always

Required for A* optimality. Without this, A* may find a suboptimal solution.

Consistent

h(n) ≤ c(n, a, n') + h(n') for all edges

Ensures f(n) is non-decreasing along every path. A* never needs to reopen an expanded node.

Dominant

h₂(n) ≥ h₁(n) for all n (both admissible)

A* with h₂ expands fewer nodes — fewer states explored, faster search.

Fast to compute

h(n) evaluates in O(1) or at most O(polynomial)

A slow heuristic can cost more than the search time it saves.

Admissibility in Depth

Admissibility means the heuristic is optimistic — it never thinks the goal is farther than it really is.

Testing admissibility for the 8-Puzzle:

Goal configuration: tiles 1–8 in order, blank bottom-right.

h₁ = number of misplaced tiles. Each misplaced tile needs at least one move to reach its target. We cannot reach the goal in fewer moves than the number of misplaced tiles. Therefore h₁ ≤ true cost. Admissible.

h₂ = sum of Manhattan distances of all tiles from their goal positions. Manhattan distance for each tile is the shortest number of moves if no other tile blocked its path. Actual cost ≥ sum of individual Manhattan distances. Therefore h₂ ≤ true cost. Admissible.

h₃ = 2 × Manhattan distance. This doubles the estimate. For a puzzle close to the goal, h₃ may exceed the true cost. Not admissible — A* with h₃ may skip the optimal path.

Admissible Heuristic

h(n) ≤ h*(n) for all n. The heuristic is never an overestimate. Admissible heuristics are optimistic about how close the goal is.

Consistency (Monotonicity)

Consistency is a stronger condition than admissibility.

Consistent (Monotone) Heuristic

A heuristic is consistent if for every node n and every action a leading to successor n':

h(n) ≤ c(n, a, n') + h(n')

where c(n, a, n') is the step cost. This is the triangle inequality applied to heuristic values.

Intuitively: the estimated remaining cost from n cannot exceed the cost of the step to n' plus the estimated remaining cost from n'. If it did, that would mean the heuristic is "jumping" from a lower estimate to a higher one while moving toward the goal — an inconsistency.

Key result: Consistency implies admissibility. Every consistent heuristic is also admissible, but not vice versa.

Practical benefit of consistency: When h is consistent, f(n) = g(n) + h(n) is non-decreasing along every path. This means A* with graph search never needs to update a node already in the explored set, making the algorithm more efficient. Most well-designed heuristics used in practice are consistent.

Dominance

If you have two admissible heuristics, which should you use?

Dominance

h₂ dominates h₁ if h₂(n) ≥ h₁(n) for all nodes n, and both heuristics are admissible. A* with the dominant heuristic (h₂) will expand no more nodes than A* with h₁, and typically many fewer.

Dominance in the 8-Puzzle:

For a typical mid-game state:

  • h₁ (misplaced tiles) = 5

  • h₂ (Manhattan distance) = 12

Since h₂ ≥ h₁ always, h₂ dominates h₁.

Result: A* with h₁ typically expands ~10,000 nodes; A* with h₂ expands ~1,000 nodes — a 10× speedup from the better heuristic alone.

Rule: Given two admissible heuristics, always use the one with larger values. Larger admissible values mean better-informed guidance.

Combining heuristics: If you have multiple admissible heuristics h₁, h₂, …​, hₖ and none dominates the others uniformly, you can define: h(n) = max(h₁(n), h₂(n), …​, hₖ(n))

This combined heuristic is still admissible and dominates each individual one.

The Relaxed-Problem Technique

How do you design an admissible heuristic in the first place? The most reliable method is to solve a relaxed version of the problem.

Relaxed Problem

A problem derived from the original by removing one or more constraints. Because the relaxed problem has fewer restrictions, its optimal solution cost is ≤ the original optimal cost. The optimal cost of the relaxed problem is therefore an admissible heuristic for the original.

Designing a heuristic via relaxation:

  1. Write out the original problem’s constraints explicitly

  2. Remove one or more constraints to make the problem easier

  3. Find the optimal solution to the relaxed problem (usually a simple formula)

  4. Use that cost as h(n)

  5. Verify: does the relaxed optimal ≤ true cost? (It must, by construction.)

Deriving 8-Puzzle heuristics via relaxation:

Original constraint: A tile can slide into an adjacent cell only if that cell is blank.

Relaxation 1: Remove the "only if blank" condition. Now tiles can move to any adjacent cell. Optimal cost for each tile = its Manhattan distance. Sum over all tiles → h₂ = Manhattan distance. Admissible.

Relaxation 2: Remove both the "adjacent" and "only if blank" conditions. Now tiles can teleport directly to their goal positions. Optimal cost for each tile = 1 move (or 0 if already correct). Sum = number of misplaced tiles → h₁. Admissible.

Both heuristics arise naturally from relaxing the original rules.

Route-finding heuristic via relaxation:

Original constraint: Travel must follow actual roads.

Relaxation: Remove roads — travel in a straight line.

Relaxed optimal cost: Euclidean (straight-line) distance.

Since actual road distance ≥ straight-line distance, the straight-line heuristic is admissible. This is exactly the heuristic used by GPS systems!

Pattern Databases (Advanced)

For complex combinatorial puzzles, precomputed lookup tables can provide extremely informed heuristics.

A pattern database stores the exact optimal cost for every possible configuration of a subset of the puzzle elements. At runtime, A* looks up the cost from the database in O(1).

Pattern database for the 15-Puzzle (4×4, 15 tiles):

Precompute optimal costs for all configurations of tiles 1–7 (ignoring tiles 8–15). Store in a table indexed by the positions of tiles 1–7.

At runtime: h(n) = database lookup for the positions of tiles 1–7.

Result: A* with pattern database solves the 15-Puzzle in milliseconds — problems that were previously considered intractable.

Heuristic Quality Summary

Heuristic Admissible Consistent Notes

h(n) = 0 (zero heuristic)

Yes

Yes

Reduces A* to UCS; legal but uninformative

h₁: misplaced tiles (8-puzzle)

Yes

Yes

Simple; moderate quality

h₂: Manhattan distance (8-puzzle)

Yes

Yes

Dominates h₁; strong in practice

h = straight-line distance (routing)

Yes

Yes

Standard GPS heuristic

h = 2 × Manhattan distance

No

No

Overestimates; breaks optimality

Heuristic design is an engineering art. The goal is to find the tightest lower bound on the true cost that can be computed quickly. The relaxed-problem method gives you a principled, systematic way to create admissible heuristics for any problem where you can write down the constraints.

Consider a robot navigating a grid where it can move horizontally, vertically, and diagonally.

  • Is Manhattan distance still admissible as a heuristic?

  • What would the correct relaxed-problem heuristic be for this movement model?

(Hint: think about the minimum number of moves to reach a target cell when diagonal moves are free.)

Test your understanding of heuristic admissibility, consistency, and dominance.


Based on the UC Berkeley CS 188 Online Textbook by Nikhil Sharma, Josh Hug, Jacky Liang, and Henry Zhu, licensed under CC BY-SA 4.0.

This work is licensed under CC BY-SA 4.0.