Lab: Programming Intelligent Agents
Unit 2: Intelligent Agents — Lab
Lab Overview
In this lab you will implement two intelligent agents — a simple reflex agent and a model-based reflex agent — in a Python grid-world simulation. Both agents navigate a 2D grid, cleaning dirty squares and avoiding walls. Because they share the same environment, sensors, and actuators, any performance difference comes entirely from their architectures.
By the end of the lab you will be able to:
- Translate agent theory into working Python code
- Observe firsthand how internal state (model-based) outperforms pure reaction (simple reflex) in a partially observable environment
- Analyze agent behavior quantitatively using performance metrics
- Write reflections that connect your implementation experience to the theoretical framework from Sections 2.1–2.4
Platform: Google Colab (no local installation required)
Estimated Time: 2–3 hours
Before starting the lab, make sure you can answer these questions from prior sections:
- What is the difference between a simple reflex agent and a model-based agent? (Section 2.4)
- What does "partially observable" mean? (Section 2.3)
- How does internal state help an agent handle partial observability? (Section 2.4)
- What is PEAS? How would you write a PEAS description for this lab's environment? (Section 2.2)
The Grid World Environment
The lab uses a grid world — a classic AI benchmark environment that is simple enough to implement in one lab session but rich enough to demonstrate meaningful architectural differences.
Environment Description
The grid is a rectangular array of squares. Each square is either clean, dirty, or a wall. The agent occupies one square at a time and can move in four directions (up, down, left, right) or vacuum the current square.
| Symbol | Meaning |
|---|---|
| White square | Clean floor — no action needed |
| Brown square | Dirty floor — agent should vacuum |
| Black square | Wall — agent cannot enter |
| Blue square (agent icon) | Current agent location |
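As an illustration of how such a grid could be represented (this is *not* the notebook's `GridEnvironment` class, which Part 1 provides ready-made), each cell can be encoded as a single character:

```python
# Illustrative sketch only -- the lab's GridEnvironment (Part 1) defines
# the real representation. Here: '.' = clean, 'D' = dirty, '#' = wall.
grid = [
    list("..D.."),
    list(".#.#."),
    list("D...D"),
]

def count_dirty(grid):
    """Count dirty squares -- a handy completeness check for the lab."""
    return sum(row.count("D") for row in grid)

print(count_dirty(grid))  # 3
```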
PEAS Description
Before writing code, define the environment formally:
| Component | Specification |
|---|---|
| Performance | Number of dirty squares cleaned per step (efficiency); total steps to clean all reachable dirty squares (completeness) |
| Environment | 2D grid containing clean squares, dirty squares, and walls |
| Actuators | Movement (up, down, left, right); vacuum action; stay (do nothing) |
| Sensors | Current square status (clean or dirty); presence of walls in each cardinal direction |
Environment Classification
| Dimension | Classification | Reason |
|---|---|---|
| Observability | Partially observable | Agent only perceives its current square and adjacent walls, not the full grid |
| Determinism | Deterministic | Moving right always moves right if no wall is present |
| Episodic vs. sequential | Sequential | Current movement choices affect which squares are reachable in the future |
| Static vs. dynamic | Static | The grid does not change while the agent is deciding |
| Discrete vs. continuous | Discrete | Finite grid positions and a finite set of actions |
| Number of agents | Single-agent | One agent operating alone |
Getting Started
Setting Up Google Colab:
- Download `Week2_Programming_Assignment.ipynb` using the link below
- Go to Google Colab
- Click File → Upload notebook
- Select the `.ipynb` file you downloaded
- Run the first cell (all `import` statements) by clicking the play button or pressing Shift+Enter
- Verify you see: `✓ All imports successful!`
Week 2 Programming Assignment
Download the Jupyter notebook for this lab:
Requires: Python 3.8+, numpy, matplotlib (all pre-installed in Google Colab)
> If you prefer to work locally, you can install the dependencies with `pip install numpy matplotlib`.
Notebook Structure
The notebook is organized into six parts. Parts 1 and 4 are provided for you. You implement Parts 2, 3, and 5. Part 6 is an optional extension.
| Part | Topic | Your Task | Status |
|---|---|---|---|
| 1 | GridEnvironment class and visualization | Read and understand the code | Provided |
| 2 | Simple Reflex Agent | Implement `select_action` | You implement |
| 3 | Model-Based Reflex Agent | Implement `update_state` and `select_action` | You implement |
| 4 | Performance comparison | Run provided comparison code; observe results | Provided |
| 5 | Analysis questions | Write three short reflections | You write |
| 6 | Extensions (optional) | Goal-based agent; multi-agent system | Optional |
Part 2: Simple Reflex Agent
The simple reflex agent selects actions based only on the current percept — no memory of the past is used.
What you receive from percept:
- `percept.location` — current grid position as `(row, col)`
- `percept.is_dirty` — `True` if the current square is dirty
- `percept.walls` — dictionary `{'up': bool, 'down': bool, 'left': bool, 'right': bool}`
Implementation Logic:
- If the current square is dirty (`percept.is_dirty == True`): return `Action.VACUUM`
- If not dirty, check whether the preferred direction is blocked: if `percept.walls[preferred_direction]` is `False`, return that direction
- If the preferred direction is blocked, try alternative directions in order
- If all directions are blocked, return `Action.STAY`
The `percept.walls` dictionary uses string keys (`'up'`, `'down'`, `'left'`, `'right'`), but `Action` values are enum members (`Action.UP`, `Action.DOWN`, etc.).
You will need a mapping:
```python
direction_map = {
    Action.UP: 'up',
    Action.DOWN: 'down',
    Action.LEFT: 'left',
    Action.RIGHT: 'right',
}
```

Use `percept.walls[direction_map[Action.UP]]` to check whether moving up is blocked.
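Putting the rules above together, a minimal sketch of the simple reflex policy might look like the following. This is a self-contained illustration, not the notebook's skeleton: the `Action` enum here is assumed to mirror the notebook's, and the fixed preference order (up, right, down, left) is just one reasonable choice.

```python
from enum import Enum

class Action(Enum):
    # Assumed to mirror the notebook's Action enum.
    UP, DOWN, LEFT, RIGHT, VACUUM, STAY = range(6)

# Map movement actions to the string keys used by percept.walls.
direction_map = {
    Action.UP: 'up',
    Action.DOWN: 'down',
    Action.LEFT: 'left',
    Action.RIGHT: 'right',
}

def select_action(percept):
    """Simple reflex policy: react to the current percept only."""
    if percept.is_dirty:
        return Action.VACUUM
    # Try directions in a fixed preference order; take the first unblocked one.
    for action in (Action.UP, Action.RIGHT, Action.DOWN, Action.LEFT):
        if not percept.walls[direction_map[action]]:
            return action
    return Action.STAY  # boxed in on all four sides
```

Note that this function has no memory: called twice with the same percept, it always returns the same action, which is exactly what makes it a simple reflex agent.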
Part 3: Model-Based Reflex Agent
The model-based agent maintains internal state — a record of what it has observed — to make smarter decisions.
Internal state to implement:
- `self.visited` — a set of grid positions the agent has visited
- `self.known_clean` — positions confirmed clean
- `self.known_dirty` — positions confirmed dirty
Two methods to implement:
`update_state(percept)` — called first on each step; updates the internal state based on the new percept.
`select_action(percept)` — selects the best action given the current percept and the accumulated internal state.
Model-Based Logic:
- Call `update_state(percept)` to record the current location and its cleanliness status
- If the current square is dirty: return `Action.VACUUM`
- If not dirty: look for an adjacent unvisited square and move there (exploration strategy)
- If all adjacent squares are visited: use the known map to navigate toward a known dirty square
- If no dirty squares are known and all accessible squares are visited: return `Action.STAY`
The key difference: when the simple reflex agent reaches a dead end (walls on all sides), it stays forever. The model-based agent can consult its internal map of visited squares and reason that it has already cleaned everything reachable — avoiding wasted steps.
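The state update and a simplified exploration step can be sketched as follows. This is an illustrative, hedged version, not the notebook's exact skeleton: the `Action` enum is assumed to mirror the notebook's, and the "navigate toward a known dirty square" step is left as a placeholder comment since it requires path planning over the visited map.

```python
from enum import Enum

class Action(Enum):
    # Assumed to mirror the notebook's Action enum.
    UP, DOWN, LEFT, RIGHT, VACUUM, STAY = range(6)

class ModelBasedAgent:
    """Sketch of a model-based reflex agent (illustrative only)."""

    def __init__(self):
        self.visited = set()       # positions the agent has been to
        self.known_clean = set()   # positions confirmed clean
        self.known_dirty = set()   # positions confirmed dirty

    def update_state(self, percept):
        """Record the current location and its cleanliness status."""
        pos = percept.location
        self.visited.add(pos)
        if percept.is_dirty:
            self.known_dirty.add(pos)
        else:
            self.known_dirty.discard(pos)  # may have just been vacuumed
            self.known_clean.add(pos)

    def select_action(self, percept):
        self.update_state(percept)
        if percept.is_dirty:
            return Action.VACUUM
        # Prefer an adjacent, unblocked, unvisited square (exploration).
        row, col = percept.location
        neighbors = {
            Action.UP: ((row - 1, col), 'up'),
            Action.DOWN: ((row + 1, col), 'down'),
            Action.LEFT: ((row, col - 1), 'left'),
            Action.RIGHT: ((row, col + 1), 'right'),
        }
        for action, (pos, wall_key) in neighbors.items():
            if not percept.walls[wall_key] and pos not in self.visited:
                return action
        # Full version: plan a path toward a square in self.known_dirty here.
        return Action.STAY
```

Unlike the simple reflex agent, this agent's behavior depends on its history: the same percept can yield different actions depending on which squares are already in `self.visited`.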
Part 4: Running the Comparison
Part 4 is provided for you. After implementing Parts 2 and 3, run the comparison cells. You will see:
- A side-by-side visualization of both agents navigating the same grid
- A performance chart showing cleaning efficiency over time (dirty squares cleaned per step)
- Summary statistics: total steps taken, dirty squares cleaned, and efficiency score
Expected results: The model-based agent should achieve higher efficiency on grids larger than 3×3. On a 5×5 grid, the simple reflex agent may miss dirty squares it passed by; the model-based agent will systematically navigate toward uncleaned areas.
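The efficiency score mentioned above (dirty squares cleaned per step) is straightforward to compute from the summary statistics; a hypothetical helper, assuming the notebook defines it this way:

```python
def efficiency(squares_cleaned, steps_taken):
    """Dirty squares cleaned per step (hypothetical helper -- the
    notebook's exact scoring formula may differ)."""
    if steps_taken == 0:
        return 0.0
    return squares_cleaned / steps_taken

print(efficiency(8, 40))  # 0.2
```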
Part 5: Analysis Questions
After running both agents, answer these three questions in the notebook. Write 3–5 sentences per question.
1. Architectural difference: In your own words, explain the key difference between the simple reflex agent and the model-based agent. What specific capability does the model-based agent have that the simple reflex agent lacks, and how did that show up in the performance results?
2. Environment connection: This environment is classified as partially observable and sequential. How did each property specifically affect agent performance? Would the simple reflex agent perform just as well in a fully observable environment?
3. Scaling to the real world: This grid world is much simpler than a real house or a real warehouse. Name two specific challenges you would encounter if you tried to deploy your simple reflex agent in a real Roomba-style vacuum robot, and explain how a more sophisticated architecture would address them.
Grading Criteria
| Component | Points | Criteria |
|---|---|---|
| Simple Reflex Agent — correct implementation | 20 | Implements the Part 2 logic: vacuums dirty squares, moves in unblocked directions, stays when boxed in |
| Model-Based Agent — `update_state` | 20 | Internal state (`visited`, `known_clean`, `known_dirty`) is correctly updated on each step |
| Model-Based Agent — `select_action` | 20 | Uses internal state to prefer unvisited squares; navigates toward known dirty squares |
| Comparison results — runs without errors | 10 | Part 4 executes cleanly and produces visualization and metrics |
| Analysis Question 1 | 10 | Accurate explanation of architectural difference with connection to observed results |
| Analysis Question 2 | 10 | Correctly identifies how partial observability and sequential structure affected behavior |
| Analysis Question 3 | 10 | Two realistic challenges identified with plausible architectural solutions |
| Total | 100 | |
Submission Instructions
- Complete all required parts of the notebook (Parts 2, 3, and 5)
- Ensure Part 4 runs without errors (the comparison code should produce output)
- Download your completed notebook: File → Download → Download .ipynb
- Upload the `.ipynb` file to the Unit 2 Lab assignment in Brightspace
- Due date: see course schedule
Want to explore further?
The grid-world simulation in this lab is inspired by the classic "vacuum world" from the AI literature. For more complex agent implementations, explore:
- aima-python on GitHub — the official Python code companion to the Russell & Norvig AI textbook. Contains production-quality implementations of all four agent architectures. MIT License.
- Berkeley CS 188 Agents Chapter — the primary reference for this unit. Contains additional examples and exercises.
Lab code structure inspired by the aima-python agents module, MIT License, copyright 2016 aima-python contributors.
Based on the UC Berkeley CS 188 Online Textbook by Nikhil Sharma, Josh Hug, Jacky Liang, and Henry Zhu, licensed under CC BY-SA 4.0.
This work is licensed under CC BY-SA 4.0.