Agent Architectures

Unit 2: Intelligent Agents — Section 2.4

You know what an agent is, how to specify it with PEAS, and how to classify its environment. Now the central engineering question: how do you build the agent program?

The agent program is the internal mechanism that generates actions from percepts. Different problems call for different internal structures — different agent architectures. We examine four architectures in order of increasing sophistication.

From Section 2.3: environment properties determine which architecture is appropriate. A fully observable, deterministic environment can use a simple architecture. A partially observable, stochastic environment requires something more sophisticated. Keep the environment taxonomy in mind as you read each architecture below.

Architecture 1: Simple Reflex Agents

Simple Reflex Agent: An agent that selects actions based solely on the current percept, ignoring all prior history. It implements a set of condition-action rules: "IF this situation is observed, THEN take this action."

A simple reflex agent works like a collection of if-then rules. When a percept arrives, the agent scans its rules, finds the first rule whose condition matches, and executes the corresponding action.

Simple reflex agent architecture showing sensors to condition-action rules to actuators

Figure 2.4: The simple reflex agent — reacts directly to the current percept using condition-action rules without maintaining internal state.

Automatic Sliding Door

Rules:

IF person approaches   THEN open door
IF no person detected  THEN close door

The door does not remember whether it was open a minute ago — it reacts only to what its sensor tells it right now. This is exactly a simple reflex agent.

Advantages:

Very simple to implement and understand
Fast — no complex reasoning required
Works perfectly in fully observable, deterministic environments

Limitations:

Fails in partially observable environments (cannot account for what it cannot see)
No planning or anticipation of consequences
Can get stuck in infinite loops
Cannot handle novel situations not covered by its rules

Best suited for: Fully observable environments with simple, reactive tasks where speed is critical.

Architecture 2: Model-Based Reflex Agents

Model-Based Reflex Agent: An agent that maintains an internal state — a representation of parts of the world that are not currently visible — to handle partial observability. The internal state is updated using a transition model (how the world changes) and a sensor model (how world states map to percepts).

When a car passes behind a truck and disappears from a camera’s view, a simple reflex agent treats it as if the car never existed. A model-based agent maintains a belief: "There is probably still a car there, moving at roughly the speed I last observed."

Model-based reflex agent with internal state tracking world evolution

Figure 2.5: The model-based reflex agent — maintains an internal state to handle partially observable environments by tracking how the world evolves.

The agent updates this internal state on every time step using two models:

Transition model: How does the world change on its own (independent of my actions)?
Sensor model: Given the world’s true state, what percepts would I expect to receive?

Self-Driving Car Tracking Hidden Vehicles

Situation: A car passes behind a large delivery truck. The car is no longer visible to the cameras.

A simple reflex agent: ignores the car (it is not in the current percept).

A model-based agent: maintains internal belief that a car exists behind the truck, estimates its likely position and speed based on its last known trajectory, and adjusts driving behavior accordingly — leaving extra following distance, signaling cautiously.

The model-based agent’s internal state includes: {location: (behind truck), speed: ~55 mph, confidence: 0.85}.

Advantages:

Handles partial observability
Can track objects and states that are temporarily hidden
More robust than simple reflex agents

Limitations:

Still reactive at its core — uses rules, not planning
Requires an accurate model of how the world works
Uncertainty accumulates over time as the internal model diverges from reality

Best suited for: Partially observable environments where the agent can model world dynamics but does not need to plan sequences of actions.

Architecture 3: Goal-Based Agents

Goal-Based Agent: An agent that explicitly represents goals — descriptions of desirable states — and uses search or planning to find action sequences that achieve those goals. Rather than reactive rules, the agent reasons about what would happen if it took various sequences of actions.

The key difference from reflex agents: a goal-based agent reasons about the future. It asks "if I do X, then Y, then Z — will I reach my goal?" and chooses the action sequence with the best answer.

Goal-based agent architecture with search and planning components

Figure 2.6: The goal-based agent — reasons about the future by simulating sequences of actions to determine which ones achieve its goals.

GPS Navigation

Goal: Reach "123 Main Street"

Process: . Current state: at the intersection of 1st Ave and Oak St . Search through possible routes: "Turn right and go 2 blocks" vs. "Turn left and take the highway" vs. … . Evaluate: which sequence of turns leads to the goal state? . Select and execute: "Turn right, go 2 blocks, turn left…"

The GPS plans a sequence of actions rather than reacting to each step one at a time. If a road is closed, it replans — it does not blindly follow pre-programmed rules.

Advantages:

Flexible — change the goal and the agent automatically computes a new plan
Can handle novel situations not explicitly anticipated at design time
Reasons about the long-term consequences of actions
One architecture can pursue many different goals

Limitations:

Computationally expensive compared to reflex agents
Requires an accurate model of how actions change the world
Goal achievement is binary: goal met or not met (no "degrees of success")

Best suited for: Sequential environments where the agent must plan ahead, especially when goals change frequently or the action space is too complex for hand-coded rules.

Architecture 4: Utility-Based Agents

Utility-Based Agent: An agent that uses a utility function — a function that maps states (or sequences of states) to a numeric measure of desirability — and chooses actions that maximize expected utility. Utility provides a continuous, nuanced measure of success rather than binary goal achievement.

Goal-based agents treat success as binary: the goal is either achieved or it is not. But real problems often require trade-offs among multiple competing criteria. A utility function lets the agent balance these criteria numerically.

Utility-based agent architecture with utility function for probabilistic decision making

Figure 2.7: The utility-based agent — uses a mathematical utility function to evaluate and compare different possible outcomes under uncertainty.

Self-Driving Taxi Route Choice

A goal-based agent with goal "reach destination" might take the fastest route regardless of other factors.

A utility-based agent evaluates routes on multiple criteria simultaneously:

Route	Utility Score	Reasoning
Highway (fast, risky construction zone)	72	Speed: +40, Safety risk: -25, Fuel: +15, Comfort: +12, Traffic: +10 (net: weighted sum → 72)
Surface streets (slower, smooth roads)	85	Speed: +25, Safety: +30, Fuel: +12, Comfort: +18, Traffic: +0 (net: 85)
Expressway (medium speed, tolls)	78	Speed: +35, Safety: +25, Fuel: +8, Comfort: +15, Traffic: +5, Toll: -10 (net: 78)

Route

Utility Score

Reasoning

Highway (fast, risky construction zone)

Speed: +40, Safety risk: -25, Fuel: +15, Comfort: +12, Traffic: +10 (net: weighted sum → 72)

Surface streets (slower, smooth roads)

Speed: +25, Safety: +30, Fuel: +12, Comfort: +18, Traffic: +0 (net: 85)

Expressway (medium speed, tolls)

Speed: +35, Safety: +25, Fuel: +8, Comfort: +15, Traffic: +5, Toll: -10 (net: 78)

The agent selects the surface streets route (utility = 85) even though it is not the fastest option. The utility function captures the relative importance of safety, comfort, and efficiency in a single principled measure.

Advantages:

Handles conflicting objectives through principled trade-offs
Reasons rationally under uncertainty (stochastic environments)
Produces a continuous scale of "how good" — not just pass/fail
Provides a foundation for decision theory

Limitations:

Most computationally expensive architecture
Utility functions are difficult to specify correctly
Requires probability estimates for outcomes in stochastic environments
Misspecified utility functions can produce unintended behavior

Best suited for: Environments with multiple conflicting objectives, stochastic outcomes, or trade-offs between competing performance criteria.

Comparing the Four Architectures

Architecture	Uses History?	Plans Ahead?	Handles Uncertainty?	Best Environments
Simple Reflex	No	No	No	Fully observable, simple, reactive
Model-Based Reflex	Yes (internal state)	No	Limited	Partially observable, reactive
Goal-Based	Yes	Yes	Limited	Sequential, planning required
Utility-Based	Yes	Yes	Yes	Trade-offs, stochastic outcomes

Architecture

Uses History?

Plans Ahead?

Handles Uncertainty?

Best Environments

Simple Reflex

Fully observable, simple, reactive

Model-Based Reflex

Yes (internal state)

Limited

Partially observable, reactive

Goal-Based

Yes

Limited

Sequential, planning required

Utility-Based

Yes

Trade-offs, stochastic outcomes

Each architecture builds directly on the previous one:

Simple Reflex: Percept → Action (rules only)
Model-Based: Add internal state tracking to handle hidden information
Goal-Based: Add goals and planning to reason about the future
Utility-Based: Add a utility function to handle trade-offs and uncertainty

More sophisticated architecture = more powerful, but also more computationally expensive and harder to design. Choose the simplest architecture that handles your environment’s properties.

Matching Architecture to Environment

The environment taxonomy from Section 2.3 connects directly to architecture selection:

Fully observable + deterministic + static: A simple reflex agent may suffice.
Partially observable: You need at least a model-based agent with internal state.
Sequential: You need at least a goal-based agent that can plan ahead.
Stochastic with trade-offs: You need a utility-based agent.

For each scenario, identify the minimum architecture needed and explain your reasoning:

A robot that waters plants when a soil moisture sensor reads below a threshold
A navigation system for a hiking trail (static environment, user sets destination)
A stock trading algorithm that must balance risk, return, and liquidity
A rescue robot searching a damaged building for survivors

Think carefully about whether the environment is partially observable, whether sequences of decisions matter, and whether trade-offs are involved.

Test Your Understanding

Match agent architectures to the scenarios that require them.

Next: 2.5 Learning Agents →

Based on the UC Berkeley CS 188 Online Textbook by Nikhil Sharma, Josh Hug, Jacky Liang, and Henry Zhu, licensed under CC BY-SA 4.0.

This work is licensed under CC BY-SA 4.0.