Lab: Programming Intelligent Agents
Unit 2: Intelligent Agents — Lab
Lab Overview
In this lab you will implement two intelligent agents — a simple reflex agent and a model-based reflex agent — in a Python grid-world simulation. Both agents navigate a 2D grid, cleaning dirty squares and avoiding walls. Because they share the same environment, sensors, and actuators, any performance difference comes entirely from their architectures.
By the end of the lab you will be able to:
- Translate agent theory into working Python code
- Observe firsthand how internal state (model-based) outperforms pure reaction (simple reflex) in a partially observable environment
- Analyze agent behavior quantitatively using performance metrics
- Write reflections that connect your implementation experience to the theoretical framework from Sections 2.1–2.4
Platform: Google Colab (no local installation required)
Estimated Time: 2–3 hours
Before starting the lab, make sure you can answer these questions from prior sections:
- What is the difference between a simple reflex agent and a model-based agent? (Section 2.4)
- What does "partially observable" mean? (Section 2.3)
- How does internal state help an agent handle partial observability? (Section 2.4)
- What is PEAS? How would you write a PEAS description for this lab's environment? (Section 2.2)
The Grid World Environment
The lab uses a grid world — a classic AI benchmark environment that is simple enough to implement in one lab session but rich enough to demonstrate meaningful architectural differences.
Environment Description
The grid is a rectangular array of squares. Each square is either clean, dirty, or a wall. The agent occupies one square at a time and can move in four directions (up, down, left, right) or vacuum the current square.
| Symbol | Meaning |
|---|---|
| White square | Clean floor — no action needed |
| Brown square | Dirty floor — agent should vacuum |
| Black square | Wall — agent cannot enter |
| Blue square (agent icon) | Current agent location |
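As an illustration of how such a grid could be represented (this is *not* the notebook's `GridEnvironment` class, which Part 1 provides ready-made), each cell can be encoded as a single character:

```python
# Illustrative sketch only -- the lab's GridEnvironment (Part 1) defines
# the real representation. Here: '.' = clean, 'D' = dirty, '#' = wall.
grid = [
    list("..D.."),
    list(".#.#."),
    list("D...D"),
]

def count_dirty(grid):
    """Count dirty squares -- a handy completeness check for the lab."""
    return sum(row.count("D") for row in grid)

print(count_dirty(grid))  # 3
```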
PEAS Description
Before writing code, define the environment formally:
| Component | Specification |
|---|---|
| Performance | Number of dirty squares cleaned per step (efficiency); total steps to clean all reachable dirty squares (completeness) |
| Environment | 2D grid containing clean squares, dirty squares, and walls |
| Actuators | Movement (up, down, left, right); vacuum action; stay (do nothing) |
| Sensors | Current square status (clean or dirty); presence of walls in each cardinal direction |
Environment Classification
| Dimension | Classification | Reason |
|---|---|---|
| Observability | Partially observable | Agent only perceives its current square and adjacent walls, not the full grid |
| Determinism | Deterministic | Moving right always moves right if no wall is present |
| Episodic vs. sequential | Sequential | Current movement choices affect which squares are reachable in the future |
| Static vs. dynamic | Static | The grid does not change while the agent is deciding |
| Discrete vs. continuous | Discrete | Finite grid positions and a finite set of actions |
| Number of agents | Single-agent | One agent operating alone |
Getting Started
Setting Up Google Colab:
- Download `Week2_Programming_Assignment.ipynb` using the link below
- Go to Google Colab
- Click File → Upload notebook
- Select the `.ipynb` file you downloaded
- Run the first cell (all `import` statements) by clicking the play button or pressing Shift+Enter
- Verify you see: `✓ All imports successful!`
Week 2 Programming Assignment
Download the Jupyter notebook for this lab:
Requires: Python 3.8+, numpy, matplotlib (all pre-installed in Google Colab)
> If you prefer to work locally, you can install the dependencies with `pip install numpy matplotlib`.
Notebook Structure
The notebook is organized into six parts. Parts 1 and 4 are provided for you. You implement Parts 2, 3, and 5. Part 6 is an optional extension.
| Part | Topic | Your Task | Status |
|---|---|---|---|
| 1 | GridEnvironment class and visualization | Read and understand the code | Provided |
| 2 | Simple Reflex Agent | Implement `select_action` | You implement |
| 3 | Model-Based Reflex Agent | Implement `update_state` and `select_action` | You implement |
| 4 | Performance comparison | Run provided comparison code; observe results | Provided |
| 5 | Analysis questions | Write three short reflections | You write |
| 6 | Extensions (optional) | Goal-based agent; multi-agent system | Optional |
Part 2: Simple Reflex Agent
The simple reflex agent selects actions based only on the current percept — no memory of the past is used.
What you receive from percept:
- `percept.location` — current grid position as `(row, col)`
- `percept.is_dirty` — `True` if the current square is dirty
- `percept.walls` — dictionary `{'up': bool, 'down': bool, 'left': bool, 'right': bool}`
Implementation Logic:
- If the current square is dirty (`percept.is_dirty == True`): return `Action.VACUUM`
- If not dirty, check whether the preferred direction is blocked: if `percept.walls[preferred_direction]` is `False`, return that direction
- If the preferred direction is blocked, try alternative directions in order
- If all directions are blocked, return `Action.STAY`
The `percept.walls` dictionary uses string keys (`'up'`, `'down'`, `'left'`, `'right'`), but `Action` values are enum members (`Action.UP`, `Action.DOWN`, etc.).
You will need a mapping:
```python
direction_map = {
    Action.UP: 'up',
    Action.DOWN: 'down',
    Action.LEFT: 'left',
    Action.RIGHT: 'right',
}
```

Use `percept.walls[direction_map[Action.UP]]` to check whether moving up is blocked.
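Putting the rules above together, a minimal sketch of the simple reflex policy might look like the following. This is a self-contained illustration, not the notebook's skeleton: the `Action` enum here is assumed to mirror the notebook's, and the fixed preference order (up, right, down, left) is just one reasonable choice.

```python
from enum import Enum

class Action(Enum):
    # Assumed to mirror the notebook's Action enum.
    UP, DOWN, LEFT, RIGHT, VACUUM, STAY = range(6)

# Map movement actions to the string keys used by percept.walls.
direction_map = {
    Action.UP: 'up',
    Action.DOWN: 'down',
    Action.LEFT: 'left',
    Action.RIGHT: 'right',
}

def select_action(percept):
    """Simple reflex policy: react to the current percept only."""
    if percept.is_dirty:
        return Action.VACUUM
    # Try directions in a fixed preference order; take the first unblocked one.
    for action in (Action.UP, Action.RIGHT, Action.DOWN, Action.LEFT):
        if not percept.walls[direction_map[action]]:
            return action
    return Action.STAY  # boxed in on all four sides
```

Note that this function has no memory: called twice with the same percept, it always returns the same action, which is exactly what makes it a simple reflex agent.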
Part 3: Model-Based Reflex Agent
The model-based agent maintains internal state — a record of what it has observed — to make smarter decisions.
Internal state to implement:
- `self.visited` — a set of grid positions the agent has visited
- `self.known_clean` — positions confirmed clean
- `self.known_dirty` — positions confirmed dirty
Two methods to implement:
`update_state(percept)` — called first on each step; updates the internal state based on the new percept.
`select_action(percept)` — selects the best action given the current percept and the accumulated internal state.
Model-Based Logic:
- Call `update_state(percept)` to record the current location and its cleanliness status
- If the current square is dirty: return `Action.VACUUM`
- If not dirty: look for an adjacent unvisited square and move there (exploration strategy)
- If all adjacent squares are visited: use the known map to navigate toward a known dirty square
- If no dirty squares are known and all accessible squares are visited: return `Action.STAY`
The key difference: when the simple reflex agent reaches a dead end (walls on all sides), it stays forever. The model-based agent can consult its internal map of visited squares and reason that it has already cleaned everything reachable — avoiding wasted steps.
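The state update and a simplified exploration step can be sketched as follows. This is an illustrative, hedged version, not the notebook's exact skeleton: the `Action` enum is assumed to mirror the notebook's, and the "navigate toward a known dirty square" step is left as a placeholder comment since it requires path planning over the visited map.

```python
from enum import Enum

class Action(Enum):
    # Assumed to mirror the notebook's Action enum.
    UP, DOWN, LEFT, RIGHT, VACUUM, STAY = range(6)

class ModelBasedAgent:
    """Sketch of a model-based reflex agent (illustrative only)."""

    def __init__(self):
        self.visited = set()       # positions the agent has been to
        self.known_clean = set()   # positions confirmed clean
        self.known_dirty = set()   # positions confirmed dirty

    def update_state(self, percept):
        """Record the current location and its cleanliness status."""
        pos = percept.location
        self.visited.add(pos)
        if percept.is_dirty:
            self.known_dirty.add(pos)
        else:
            self.known_dirty.discard(pos)  # may have just been vacuumed
            self.known_clean.add(pos)

    def select_action(self, percept):
        self.update_state(percept)
        if percept.is_dirty:
            return Action.VACUUM
        # Prefer an adjacent, unblocked, unvisited square (exploration).
        row, col = percept.location
        neighbors = {
            Action.UP: ((row - 1, col), 'up'),
            Action.DOWN: ((row + 1, col), 'down'),
            Action.LEFT: ((row, col - 1), 'left'),
            Action.RIGHT: ((row, col + 1), 'right'),
        }
        for action, (pos, wall_key) in neighbors.items():
            if not percept.walls[wall_key] and pos not in self.visited:
                return action
        # Full version: plan a path toward a square in self.known_dirty here.
        return Action.STAY
```

Unlike the simple reflex agent, this agent's behavior depends on its history: the same percept can yield different actions depending on which squares are already in `self.visited`.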
Part 4: Running the Comparison
Part 4 is provided for you. After implementing Parts 2 and 3, run the comparison cells. You will see:
- A side-by-side visualization of both agents navigating the same grid
- A performance chart showing cleaning efficiency over time (dirty squares cleaned per step)
- Summary statistics: total steps taken, dirty squares cleaned, and efficiency score
Expected results: The model-based agent should achieve higher efficiency on grids larger than 3×3. On a 5×5 grid, the simple reflex agent may miss dirty squares it passed by; the model-based agent will systematically navigate toward uncleaned areas.
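The efficiency score mentioned above (dirty squares cleaned per step) is straightforward to compute from the summary statistics; a hypothetical helper, assuming the notebook defines it this way:

```python
def efficiency(squares_cleaned, steps_taken):
    """Dirty squares cleaned per step (hypothetical helper -- the
    notebook's exact scoring formula may differ)."""
    if steps_taken == 0:
        return 0.0
    return squares_cleaned / steps_taken

print(efficiency(8, 40))  # 0.2
```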
Part 5: Analysis Questions
After running both agents, answer these three questions in the notebook. Write 3–5 sentences per question.
1. Architectural difference: In your own words, explain the key difference between the simple reflex agent and the model-based agent. What specific capability does the model-based agent have that the simple reflex agent lacks, and how did that show up in the performance results?
2. Environment connection: This environment is classified as partially observable and sequential. How did each property specifically affect agent performance? Would the simple reflex agent perform just as well in a fully observable environment?
3. Scaling to the real world: This grid world is much simpler than a real house or a real warehouse. Name two specific challenges you would encounter if you tried to deploy your simple reflex agent in a real Roomba-style vacuum robot, and explain how a more sophisticated architecture would address them.
Grading Criteria
| Component | Points | Criteria |
|---|---|---|
| Simple Reflex Agent — correct implementation | 20 | Implements the Part 2 logic: vacuums dirty squares, moves in unblocked directions, stays when boxed in |
| Model-Based Agent — `update_state` | 20 | Internal state (`visited`, `known_clean`, `known_dirty`) is correctly updated on each step |
| Model-Based Agent — `select_action` | 20 | Uses internal state to prefer unvisited squares; navigates toward known dirty squares |
| Comparison results — runs without errors | 10 | Part 4 executes cleanly and produces visualization and metrics |
| Analysis Question 1 | 10 | Accurate explanation of architectural difference with connection to observed results |
| Analysis Question 2 | 10 | Correctly identifies how partial observability and sequential structure affected behavior |
| Analysis Question 3 | 10 | Two realistic challenges identified with plausible architectural solutions |
| Total | 100 | |
Submission Instructions
- Complete all required parts of the notebook (Parts 2, 3, and 5)
- Ensure Part 4 runs without errors (the comparison code should produce output)
- Download your completed notebook: File → Download → Download .ipynb
- Upload the `.ipynb` file to the Unit 2 Lab assignment in Brightspace
- Due date: see course schedule
Want to explore further?
The grid-world simulation in this lab is inspired by the classic "vacuum world" from the AI literature. For more complex agent implementations, explore:
- aima-python on GitHub — the official Python code companion to the Russell & Norvig AI textbook. Contains production-quality implementations of all four agent architectures. MIT License.
- Berkeley CS 188 Agents Chapter — the primary reference for this unit. Contains additional examples and exercises.
Lab code structure inspired by the aima-python agents module, MIT License, copyright 2016 aima-python contributors.
Based on the UC Berkeley CS 188 Online Textbook by Nikhil Sharma, Josh Hug, Jacky Liang, and Henry Zhu, licensed under CC BY-SA 4.0.
This work is licensed under CC BY-SA 4.0.