Supplementary — Decision Theory and Expected Utility

So far in Unit 7, you have focused on inference — computing probabilities of hypotheses given evidence. But AI agents do not just reason; they act. Decision theory bridges the gap between probabilistic beliefs and rational action by adding one more ingredient: preferences.

Learn how AI agents combine probability and utility to make optimal decisions under uncertainty.

Beyond Belief: Making Decisions

Probabilities alone do not tell you what to do. Consider a doctor who has diagnosed:

  • 60% probability of flu

  • 30% probability of COVID-19

  • 10% probability of bacterial infection

What should the doctor prescribe?

  • An antiviral for flu/COVID? (ineffective against bacteria)

  • An antibiotic for bacterial infection? (useless against viruses, has side effects)

  • Wait and see? (patient may get worse)

Notice that the right action depends not just on the probabilities, but also on the costs and benefits of being wrong in each direction.

Decision theory answers this question by introducing utility — a numerical measure of how much an agent values each possible outcome.

Utility: Quantifying Preferences

Utility

A numerical score assigned to an outcome that represents how desirable that outcome is to the agent. Higher utility means more preferred. Utility functions are subjective — different agents have different values.

Examples of Utility Functions

Money (diminishing returns):

Outcome     Utility
$0                0
$1,000           10
$10,000          50
$100,000         90

Each additional dollar provides less marginal utility. In the table above, the first $1,000 is worth 10 utility, but the far larger jump from $10,000 to $100,000 adds only 40 more.
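This diminishing-returns shape is often modeled with a concave utility function. A short sketch using a logarithmic curve (the log form and scale factor are illustrative assumptions; they reproduce the shape of the table, not its exact values):

```python
import math

def utility_of_money(dollars):
    """Concave (logarithmic) utility: each extra dollar adds less utility.
    The log form and the scale factor 20 are arbitrary illustrative choices."""
    return 20 * math.log10(1 + dollars)

for amount in (0, 1_000, 10_000, 100_000):
    print(f"${amount:>7,}: utility {utility_of_money(amount):6.1f}")

# Diminishing returns: the first $1,000 adds about 60 utility here,
# while the next $9,000 adds only about 20 more.
```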

Health outcomes:

Outcome          Utility
Full health          100
Minor illness         70
Major illness         30
Death                  0

Game outcomes:

  • U(win) = 1

  • U(draw) = 0

  • U(lose) = -1

Utilities are subjective. A risk-averse person assigns much higher disutility to losses than a risk-seeking person does. Decision theory does not tell you what to value — it tells you how to act consistently with your values.

Expected Utility: The Fundamental Principle

When outcomes are uncertain, we cannot know in advance which outcome will occur. Instead, we compute the expected utility of each action — the probability-weighted average of utilities across all possible outcomes.

Expected Utility Formula

EU(Action) = Σ P(Outcome | Action) × U(Outcome)

Sum over all possible outcomes, weighting each utility by its probability given the action.

Maximum Expected Utility (MEU)

The principle that a rational agent should choose the action that maximizes expected utility. This is the foundation of rational decision-making under uncertainty.
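The MEU rule is mechanical once probabilities and utilities are tabulated. A minimal Python sketch (the dictionary representation and the two strategies' outcome probabilities are illustrative assumptions, not from the text):

```python
def expected_utility(action, outcome_dist, utility):
    """EU(a) = sum over outcomes of P(outcome | a) * U(outcome)."""
    return sum(p * utility[o] for o, p in outcome_dist[action].items())

def best_action(actions, outcome_dist, utility):
    """MEU principle: choose the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, outcome_dist, utility))

# The game utilities from earlier, with two hypothetical strategies
# whose outcome probabilities are assumed for illustration.
utility = {"win": 1, "draw": 0, "lose": -1}
outcome_dist = {
    "aggressive": {"win": 0.5, "draw": 0.1, "lose": 0.4},  # EU = 0.1
    "cautious":   {"win": 0.3, "draw": 0.6, "lose": 0.1},  # EU = 0.2
}
print(best_action(["aggressive", "cautious"], outcome_dist, utility))  # cautious
```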

Worked Example: To Test or Not to Test?

Medical Testing Decision

Situation: A patient might have a serious disease (10% prior probability).

Options:

  • Test: Costs money, 95% accurate

  • Don’t test: Free, but may miss disease

Utilities:

  • Healthy, no unnecessary treatment: U = 100

  • Sick, treated early (test caught it): U = 80

  • Sick, treated late (no test): U = 30

  • Healthy, unnecessary treatment (false positive): U = 70

Expected Utility of TESTING:

EU(Test) = P(Sick) × [P(+|Sick) × U(treat) + P(-|Sick) × U(miss)]
         + P(Healthy) × [P(+|Healthy) × U(false alarm) + P(-|Healthy) × U(correct negative)]

= 0.10 × [0.95 × 80 + 0.05 × 30]
  + 0.90 × [0.05 × 70 + 0.95 × 100]

= 0.10 × 77.5 + 0.90 × 98.5
= 7.75 + 88.65
= 96.4

Expected Utility of NOT TESTING:

EU(No Test) = P(Sick) × U(late treatment) + P(Healthy) × U(healthy)
= 0.10 × 30 + 0.90 × 100
= 3 + 90
= 93.0

Decision: TEST (EU = 96.4 > 93.0)

Even though testing has false positives and costs resources, catching the disease early provides enough value to make testing the rational choice.
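The arithmetic above can be checked in a few lines. A sketch reproducing the calculation (variable names are mine; `sens` and `spec` both encode the stated 95% accuracy):

```python
p_sick = 0.10       # prior probability of disease
sens = spec = 0.95  # P(+ | Sick) and P(- | Healthy): the 95% accuracy

U = {"early": 80, "late": 30, "false_alarm": 70, "healthy": 100}

# EU(Test): sum over sick/healthy and over positive/negative results.
eu_test = (p_sick * (sens * U["early"] + (1 - sens) * U["late"])
           + (1 - p_sick) * ((1 - spec) * U["false_alarm"] + spec * U["healthy"]))

# EU(No Test): any disease is only treated late.
eu_no_test = p_sick * U["late"] + (1 - p_sick) * U["healthy"]

print(round(eu_test, 2), round(eu_no_test, 2))  # 96.4 93.0
```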

Decision Networks (Influence Diagrams)

We can extend Bayesian networks to represent full decision problems by adding two new node types.

Decision Network (Influence Diagram)

A graphical model that combines three types of nodes:

  • Chance nodes (circles) — random variables with associated probabilities

  • Decision nodes (squares) — actions the agent can choose

  • Utility nodes (diamonds) — the numeric value of outcomes

The agent evaluates the expected utility of each combination of decisions to find the optimal action.

Umbrella Decision Network

Chance node: Weather (Rain or No Rain), P(Rain) = 0.4

Decision node: Take Umbrella? (Yes or No)

Utility node values:

  • Rain + Umbrella: U = 100 (dry!)

  • Rain + No Umbrella: U = 0 (wet!)

  • No Rain + Umbrella: U = 70 (carrying unnecessary weight)

  • No Rain + No Umbrella: U = 100 (perfect!)

Computing expected utilities:

EU(Take Umbrella) = 0.4 × 100 + 0.6 × 70 = 40 + 42 = 82
EU(No Umbrella)   = 0.4 × 0   + 0.6 × 100 = 0  + 60 = 60

Decision: Take umbrella. Even though it is inconvenient when dry, the risk of being caught in rain without one is worse.
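Evaluating this decision network amounts to computing the expected utility of each setting of the decision node, summing over the chance node. A sketch (the dictionary layout is my own):

```python
p_rain = 0.4  # chance node: P(Rain)

# Utility node: U(weather, take_umbrella)
utility = {
    ("rain", True): 100, ("rain", False): 0,
    ("dry",  True): 70,  ("dry",  False): 100,
}

def eu(take_umbrella):
    """Expected utility of the decision, summing over the chance node."""
    return (p_rain * utility[("rain", take_umbrella)]
            + (1 - p_rain) * utility[("dry", take_umbrella)])

print(eu(True), eu(False))   # 82.0 60.0
print(eu(True) > eu(False))  # True: take the umbrella
```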

Value of Information

How much is it worth to gather additional evidence before deciding?

Value of Information (VoI)

The expected improvement in utility from observing an additional piece of evidence before making a decision.

VoI(Evidence) = EU(decision with evidence) − EU(decision without evidence)

VoI is never negative — a rational agent can always ignore information it does not find useful, so knowing more can only help or leave the agent equally well off.

Value of the Medical Test

From the earlier example:

  • EU(best action without test) = 93.0

  • EU(best action with test) = 96.4

VoI(Test) = 96.4 − 93.0 = 3.4 utility units

If the cost of testing is less than 3.4 utility units, the test is worthwhile.
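The comparison against the cost of testing can be made explicit. A sketch using the numbers above (the `test_cost` value is hypothetical):

```python
eu_with_test = 96.4     # best action's EU when the test result will be observed
eu_without_test = 93.0  # best action's EU acting on the prior alone

voi = eu_with_test - eu_without_test
test_cost = 2.0  # hypothetical cost of the test, in the same utility units

print(round(voi, 1))    # 3.4
print(test_cost < voi)  # True: at this cost, the test is worthwhile
```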

Applications of Value of Information

  • Should we run a medical test before prescribing treatment?

  • Is market research worth the cost before a product launch?

  • Should a robot explore an unknown area before committing to a path?

  • When should an autonomous vehicle use an expensive sensor versus a cheaper one?

Real-World Applications

Decision theory under the MEU principle is used across many AI application domains:

Autonomous vehicles — Balance expected utility of safety, speed, and comfort when planning maneuvers. The cost of a collision is extremely high, so the vehicle takes wide safety margins even at some cost to speed.

Clinical decision support — Recommend diagnostic tests and treatments by weighing expected benefit against cost and side-effect risk.

Game AI — Choose moves that maximize expected score given uncertain opponent responses.

Robotics — Plan paths through uncertain terrain balancing speed, safety, and energy consumption.

Business analytics — Model investment decisions, pricing strategies, and product launch timing as expected-utility problems.

Key Takeaways

Decision theory extends probabilistic reasoning to rational action. By assigning utilities to outcomes and computing expected utilities, an agent can choose actions that are best given its beliefs and its values — even when outcomes are uncertain.

The Principle of Maximum Expected Utility (MEU) is the normative standard for rational agents: always choose the action with the highest expected utility.

Utility

A numerical measure of the desirability of an outcome; higher is better.

Expected Utility (EU)

The probability-weighted average utility of an action: EU(a) = Σ P(outcome|a) × U(outcome).

Maximum Expected Utility (MEU)

The principle that a rational agent chooses the action with the highest expected utility.

Decision Network

A graphical model (also called an influence diagram) that combines chance nodes, decision nodes, and utility nodes to represent and solve a decision problem.

Value of Information (VoI)

The expected gain in utility from observing additional evidence before making a decision; always ≥ 0.


Based on the UC Berkeley CS 188 Online Textbook by Nikhil Sharma, Josh Hug, Jacky Liang, and Henry Zhu, licensed under CC BY-SA 4.0.

This work is licensed under CC BY-SA 4.0.