Lab: Programming Intelligent Agents

Unit 2: Intelligent Agents — Lab

Lab Overview

In this lab you will implement two intelligent agents — a simple reflex agent and a model-based reflex agent — in a Python grid-world simulation. Both agents navigate a 2D grid, cleaning dirty squares and avoiding walls. Because they share the same environment, sensors, and actuators, any performance difference comes entirely from their architectures.

By the end of the lab you will be able to:

  1. Translate agent theory into working Python code

  2. Observe firsthand how internal state (model-based) outperforms pure reaction (simple reflex) in a partially observable environment

  3. Analyze agent behavior quantitatively using performance metrics

  4. Write reflections that connect your implementation experience to the theoretical framework from Sections 2.1–2.4

Platform: Google Colab (no local installation required)

Estimated Time: 2–3 hours

Before starting the lab, make sure you can answer these questions from prior sections:

  • What is the difference between a simple reflex agent and a model-based agent? (Section 2.4)

  • What does "partially observable" mean? (Section 2.3)

  • How does internal state help an agent handle partial observability? (Section 2.4)

  • What is PEAS? How would you write a PEAS description for this lab’s environment? (Section 2.2)

The Grid World Environment

The lab uses a grid world — a classic AI benchmark environment that is simple enough to implement in one lab session but rich enough to demonstrate meaningful architectural differences.

Environment Description

The grid is a rectangular array of squares. Each square is either clean, dirty, or a wall. The agent occupies one square at a time and can move in four directions (up, down, left, right) or vacuum the current square.

| Symbol | Meaning |
| --- | --- |
| White square | Clean floor — no action needed |
| Brown square | Dirty floor — agent should vacuum |
| Black square | Wall — agent cannot enter |
| Blue square (agent icon) | Current agent location |
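A grid like this can be represented very compactly. The encoding below is purely illustrative ('C'/'D'/'W' characters in nested lists); the notebook's actual GridEnvironment class may store the grid differently:

```python
# Hypothetical grid encoding: 'C' = clean, 'D' = dirty, 'W' = wall.
# (Illustrative only — not the notebook's actual representation.)
grid = [
    ['C', 'D', 'W'],
    ['C', 'C', 'D'],
    ['W', 'C', 'C'],
]

def count_dirty(grid):
    """Count the dirty squares remaining in the grid."""
    return sum(row.count('D') for row in grid)

print(count_dirty(grid))  # → 2
```

A helper like count_dirty is handy for checking when the environment is fully clean.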

PEAS Description

Before writing code, define the environment formally:

| Component | Specification |
| --- | --- |
| Performance | Number of dirty squares cleaned per step (efficiency); total steps to clean all reachable dirty squares (completeness) |
| Environment | 2D grid containing clean squares, dirty squares, and walls |
| Actuators | Movement (up, down, left, right); vacuum action; stay (do nothing) |
| Sensors | Current square status (clean or dirty); presence of walls in each cardinal direction |
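One lightweight way to capture a PEAS description before coding is as a plain dictionary. This is purely illustrative; the notebook does not require any particular structure:

```python
# PEAS description of the grid world as a dictionary (illustrative only).
peas = {
    "Performance": "dirty squares cleaned per step; total steps to clean all reachable dirt",
    "Environment": "2D grid of clean squares, dirty squares, and walls",
    "Actuators": "move up/down/left/right; vacuum; stay",
    "Sensors": "current-square status (clean/dirty); walls in each cardinal direction",
}

for component, spec in peas.items():
    print(f"{component}: {spec}")
```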

Environment Classification

| Dimension | Classification | Reason |
| --- | --- | --- |
| Observable | Partially observable | Agent only perceives its current square and adjacent walls, not the full grid |
| Deterministic | Deterministic | Moving right always moves right if no wall is present |
| Episodic | Sequential | Current movement choices affect which squares are reachable in the future |
| Static | Static | The grid does not change while the agent is deciding |
| Discrete | Discrete | Finite grid positions and a finite set of actions |
| Agents | Single-agent | One agent operating alone |

Getting Started

Setting Up Google Colab:

  1. Download Week2_Programming_Assignment.ipynb using the link below

  2. Go to Google Colab

  3. Click File → Upload notebook

  4. Select the .ipynb file you downloaded

  5. Run the first cell (all import statements) by clicking the play button or pressing Shift+Enter

  6. Verify you see: ✓ All imports successful!

Week 2 Programming Assignment

Download the Jupyter notebook for this lab:

Requires: Python 3.8+, numpy, matplotlib (all pre-installed in Google Colab)

If you prefer to work locally, you can install the dependencies with pip install numpy matplotlib jupyter and open the notebook with jupyter notebook.

Notebook Structure

The notebook is organized into six parts. Parts 1 and 4 are provided for you. You implement Parts 2, 3, and 5. Part 6 is an optional extension.

| Part | Topic | Your Task | Status |
| --- | --- | --- | --- |
| 1 | GridEnvironment class and visualization | Read and understand the code | Provided |
| 2 | Simple Reflex Agent | Implement select_action() | You implement |
| 3 | Model-Based Reflex Agent | Implement update_state() and select_action() | You implement |
| 4 | Performance comparison | Run provided comparison code; observe results | Provided |
| 5 | Analysis questions | Write three short reflections | You write |
| 6 | Extensions (optional) | Goal-based agent; multi-agent system | Optional |

Part 2: Simple Reflex Agent

The simple reflex agent selects actions based only on the current percept — no memory of the past is used.

What you receive from percept:

  • percept.location — current grid position as (row, col)

  • percept.is_dirty — True if the current square is dirty

  • percept.walls — dictionary {'up': bool, 'down': bool, 'left': bool, 'right': bool}

Implementation Logic:

  1. If the current square is dirty (percept.is_dirty == True): return Action.VACUUM

  2. If not dirty, check whether the preferred direction is open: if percept.walls[preferred_direction] is False (no wall), return that direction

  3. If preferred direction is blocked, try alternative directions in order

  4. If all directions are blocked, return Action.STAY

The percept.walls dictionary uses string keys ('up', 'down', 'left', 'right'), but Action values are enum members (Action.UP, Action.DOWN, etc.).

You will need a mapping:

direction_map = {
    Action.UP: 'up',
    Action.DOWN: 'down',
    Action.LEFT: 'left',
    Action.RIGHT: 'right'
}

Use percept.walls[direction_map[Action.UP]] to check if moving up is blocked.
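Putting the four rules together, one possible sketch looks like the following. The Action enum, its values, and the SimpleNamespace stand-in for the notebook's Percept object are assumptions here; match them to the actual definitions in Part 1 of the notebook:

```python
from enum import Enum
from types import SimpleNamespace

class Action(Enum):          # assumed to mirror the notebook's Action enum
    UP = 'up'
    DOWN = 'down'
    LEFT = 'left'
    RIGHT = 'right'
    VACUUM = 'vacuum'
    STAY = 'stay'

# Map enum members to the string keys used by percept.walls.
direction_map = {
    Action.UP: 'up',
    Action.DOWN: 'down',
    Action.LEFT: 'left',
    Action.RIGHT: 'right',
}

def select_action(percept):
    # Rule 1: vacuum a dirty square immediately.
    if percept.is_dirty:
        return Action.VACUUM
    # Rules 2–3: try directions in a fixed preference order,
    # skipping any that are blocked by a wall.
    for action in (Action.RIGHT, Action.DOWN, Action.LEFT, Action.UP):
        if not percept.walls[direction_map[action]]:
            return action
    # Rule 4: boxed in on all sides.
    return Action.STAY

# Quick check with a hand-built percept (dirty square → vacuum):
dirty = SimpleNamespace(location=(1, 1), is_dirty=True,
                        walls={'up': False, 'down': False,
                               'left': False, 'right': False})
print(select_action(dirty))  # Action.VACUUM
```

Note that the preference order here is arbitrary; any fixed order satisfies the rules, and you may want to experiment with how the order affects coverage.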

Part 3: Model-Based Reflex Agent

The model-based agent maintains internal state — a record of what it has observed — to make smarter decisions.

Internal state to implement:

  • self.visited — a set of grid positions the agent has visited

  • self.known_clean — positions confirmed clean

  • self.known_dirty — positions confirmed dirty

Two methods to implement:

update_state(percept) — called first on each step; updates the internal state based on the new percept.

select_action(percept) — selects the best action given the current percept and accumulated internal state.

Model-Based Logic:

  1. Call update_state(percept) to record the current location and its cleanliness status

  2. If current square is dirty: return Action.VACUUM

  3. If not dirty: look for an adjacent unvisited square and move there (exploration strategy)

  4. If all adjacent squares are visited: use the known map to navigate toward a known dirty square

  5. If no dirty squares are known and all accessible squares are visited: return Action.STAY

The key difference: when the simple reflex agent reaches a dead end (walls on all sides), it stays forever. The model-based agent can consult its internal map of visited squares and reason that it has already cleaned everything reachable — avoiding wasted steps.
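The update/select structure above can be sketched as follows. This is a minimal sketch, not the full solution: the Action enum and the SimpleNamespace percept are assumptions, and step 4 (navigating toward a known dirty square) is left as a comment for you to implement:

```python
from enum import Enum
from types import SimpleNamespace

class Action(Enum):  # assumed to mirror the notebook's Action enum
    UP = 'up'
    DOWN = 'down'
    LEFT = 'left'
    RIGHT = 'right'
    VACUUM = 'vacuum'
    STAY = 'stay'

class ModelBasedAgent:
    def __init__(self):
        self.visited = set()       # every position the agent has occupied
        self.known_clean = set()   # positions confirmed clean
        self.known_dirty = set()   # positions confirmed dirty

    def update_state(self, percept):
        """Record the current location and its cleanliness status."""
        loc = percept.location
        self.visited.add(loc)
        if percept.is_dirty:
            self.known_dirty.add(loc)
            self.known_clean.discard(loc)
        else:
            self.known_clean.add(loc)
            self.known_dirty.discard(loc)

    def select_action(self, percept):
        self.update_state(percept)
        if percept.is_dirty:
            return Action.VACUUM
        # Exploration: prefer an adjacent, unblocked, unvisited square.
        r, c = percept.location
        moves = {Action.UP: (r - 1, c), Action.DOWN: (r + 1, c),
                 Action.LEFT: (r, c - 1), Action.RIGHT: (r, c + 1)}
        for action, target in moves.items():
            if not percept.walls[action.value] and target not in self.visited:
                return action
        # TODO (step 4): use self.known_dirty to navigate toward known dirt.
        return Action.STAY
```

Because each Action's value matches the corresponding percept.walls key, this sketch indexes walls with action.value instead of a separate direction_map; either approach works.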

Part 4: Running the Comparison

Part 4 is provided for you. After implementing Parts 2 and 3, run the comparison cells. You will see:

  • A side-by-side visualization of both agents navigating the same grid

  • A performance chart showing cleaning efficiency over time (dirty squares cleaned per step)

  • Summary statistics: total steps taken, dirty squares cleaned, and efficiency score

Expected results: The model-based agent should achieve higher efficiency on grids larger than 3×3. On a 5×5 grid, the simple reflex agent may miss dirty squares it passed by; the model-based agent will systematically navigate toward uncleaned areas.
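The efficiency score reported in Part 4 is just cleaned-per-step; a one-line version (the numbers below are hypothetical, not results you should expect):

```python
def efficiency(cleaned, steps):
    """Cleaning efficiency: dirty squares cleaned per step taken."""
    return cleaned / steps if steps else 0.0

# Hypothetical example: both agents clean 6 squares, but the
# model-based agent needs fewer steps, so it scores higher.
print(efficiency(6, 40))  # 0.15  (simple reflex, hypothetical)
print(efficiency(6, 24))  # 0.25  (model-based, hypothetical)
```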

Part 5: Analysis Questions

After running both agents, answer these three questions in the notebook. Write 3–5 sentences per question.

  1. Architectural difference: In your own words, explain the key difference between the simple reflex agent and the model-based agent. What specific capability does the model-based agent have that the simple reflex agent lacks, and how did that show up in the performance results?

  2. Environment connection: This environment is classified as partially observable and sequential. How did each property specifically affect agent performance? Would the simple reflex agent perform just as well in a fully observable environment?

  3. Scaling to the real world: This grid world is much simpler than a real house or a real warehouse. Name two specific challenges you would encounter if you tried to deploy your simple reflex agent in a real Roomba-style vacuum robot, and explain how a more sophisticated architecture would address them.

Grading Criteria

| Component | Points | Criteria |
| --- | --- | --- |
| Simple Reflex Agent — correct implementation | 20 | select_action() correctly checks dirty status, respects wall constraints, returns valid actions |
| Model-Based Agent — update_state() correct | 20 | Internal state (visited, known_clean, known_dirty) is correctly updated on each step |
| Model-Based Agent — select_action() correct | 20 | Uses internal state to prefer unvisited squares; navigates toward known dirty squares |
| Comparison results — runs without errors | 10 | Part 4 executes cleanly and produces visualization and metrics |
| Analysis Question 1 | 10 | Accurate explanation of architectural difference with connection to observed results |
| Analysis Question 2 | 10 | Correctly identifies how partial observability and sequential structure affected behavior |
| Analysis Question 3 | 10 | Two realistic challenges identified with plausible architectural solutions |
| Total | 100 | |

Submission Instructions

  1. Complete all required parts of the notebook (Parts 2, 3, and 5)

  2. Ensure Part 4 runs without errors (the comparison code should produce output)

  3. Download your completed notebook: File → Download → Download .ipynb

  4. Upload the .ipynb file to the Unit 2 Lab assignment in Brightspace

  5. Due date: see course schedule

Want to explore further?

The grid-world simulation in this lab is inspired by the classic "vacuum world" from the AI literature. For more complex agent implementations, explore:

aima-python on GitHub

The official Python code companion to the Russell & Norvig AI textbook. Contains production-quality implementations of all four agent architectures. MIT License.

Berkeley CS 188 Agents Chapter

The primary reference for this unit. Contains additional examples and exercises.


Lab code structure inspired by the aima-python agents module, MIT License, copyright 2016 aima-python contributors.

Based on the UC Berkeley CS 188 Online Textbook by Nikhil Sharma, Josh Hug, Jacky Liang, and Henry Zhu, licensed under CC BY-SA 4.0.

This work is licensed under CC BY-SA 4.0.