From "Is It True?" to "How Likely?"
Unit 7: Probability and Uncertainty in AI — Section 7.3
In Units 5 and 6, every statement in the agent’s knowledge base was either true or false. Resolution proved conclusions with certainty. Forward chaining and backward chaining drew definite inferences. Now we are asking AI to operate in domains where certainty is the exception, not the rule. This section explores the conceptual shift this requires — not just a change in mathematical tools, but a change in how we think about what an AI "knows."
In Unit 5 you learned that a proposition is a statement with a definite truth value: "It is raining" is either TRUE or FALSE. In Unit 6 you used modus ponens and resolution to chain facts together into guaranteed conclusions. Keep that framework in mind as you read this section. Probability does not replace logic — it generalizes it for situations where we cannot know the truth value for certain.
Two Ways of Asking the Same Question
Consider the question: "Does this patient have influenza?"
A logical agent answers: "TRUE" or "FALSE" — and if it cannot determine which, it simply has no answer.
A probabilistic agent answers: "There is a 73% chance the patient has influenza given their symptoms and test results."
Neither answer is wrong in its own framework. But notice which answer is useful for a doctor deciding whether to prescribe antiviral medication. The probabilistic answer guides action even under uncertainty. The logical answer is useful only if we are certain — which in medicine, we almost never are.
Logic is a closed-world system: what is not known to be true is assumed to be false (the Closed World Assumption). Probability is an open-world system: what is not known may be true with some positive probability.
Real environments are open worlds. This is why probabilistic reasoning is essential for AI that interacts with the real world.
Side-by-Side: Logic and Probability on the Same Scenario
The contrast between logical and probabilistic reasoning becomes clearest when we apply both frameworks to identical scenarios.
Scenario 1: Medical Diagnosis
Given: A patient has fever and cough.
| Logic-Based Reasoning | Probabilistic Reasoning |
|---|---|
| Facts: Fever = TRUE, Cough = TRUE | Evidence: fever and cough observed |
| Rules: IF Fever AND Cough THEN Flu (plus similar rules for other illnesses) | Prior: P(Flu), P(COVID), P(Strep) from population base rates |
| Conclusion: Flu = TRUE | Model: likelihoods P(symptoms \| disease) for each candidate disease |
| Problem: same rules also conclude COVID, strep… | Posterior: P(Flu \| fever, cough) = 0.48 |
| Cannot rank or choose between conclusions | Conclusion: 48% chance of flu, an actionable ranking |
The probabilistic agent cannot prove the patient has flu, but it can tell the doctor that flu is the most likely single explanation — and how confident to be. That is enough to guide the decision "should I prescribe Tamiflu?"
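How the probabilistic agent gets to a number like 48% is the subject of Section 7.4, but the arithmetic is simple enough to preview here. This is a minimal sketch: the prior and likelihood values are illustrative assumptions (not real epidemiological figures), chosen so the posterior lands near the table's 48%.

```python
# Bayes' rule sketch for Scenario 1. All three input numbers below are
# assumed for illustration, not taken from real medical data.
p_flu = 0.10                 # assumed prior: base rate of flu
p_sym_given_flu = 0.90       # assumed: P(fever & cough | flu)
p_sym_given_not_flu = 0.108  # assumed: P(fever & cough | no flu)

# P(flu | symptoms) = P(symptoms | flu) * P(flu) / P(symptoms)
numerator = p_sym_given_flu * p_flu
evidence = numerator + p_sym_given_not_flu * (1 - p_flu)
posterior = numerator / evidence
print(f"P(flu | fever, cough) = {posterior:.2f}")  # prints 0.48
```

Note that the same three lines of arithmetic, run with likelihoods for COVID or strep, would produce the ranking the logical agent cannot.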
Scenario 2: Spam Filtering
Given: An email contains the word "FREE" in the subject line.
| Logic-Based Reasoning | Probabilistic Reasoning |
|---|---|
| Rule: IF subject contains "FREE" THEN Spam | Prior: P(Spam) = 0.30 from historical data |
| Conclusion: Spam = TRUE; block the email | Likelihood: P("FREE" \| Spam) is high, but P("FREE" \| Ham) is not zero |
| Problem: "FREE" also appears in legitimate emails ("Free parking validation attached") | False positives: traded off against missed spam by choosing the threshold |
| Blocks legitimate email, which is unacceptable | Posterior: P(Spam \| "FREE" in subject) is high but below 0.90 |
| No way to tune the tradeoff | Threshold: flag if P(Spam) > 0.90 → this email passes |
The probabilistic filter correctly identifies this specific email as not quite meeting the spam threshold, even though "FREE" is suspicious. The logical rule would block it outright.
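The threshold decision can be sketched in a few lines. The 30% prior comes from this section's text; the two word likelihoods are assumptions for illustration, and with them the posterior lands under the 0.90 threshold, so the email is delivered.

```python
# Spam-filter sketch for Scenario 2. The prior matches the ~30% spam rate
# mentioned in the text; the likelihoods are illustrative assumptions.
p_spam = 0.30             # prior: historical fraction of email that is spam
p_free_given_spam = 0.20  # assumed: P("FREE" in subject | spam)
p_free_given_ham = 0.02   # assumed: P("FREE" in subject | ham)

numerator = p_free_given_spam * p_spam
evidence = numerator + p_free_given_ham * (1 - p_spam)
p_spam_given_free = numerator / evidence  # ≈ 0.81 with these numbers

THRESHOLD = 0.90  # tunable: raise to avoid false positives, lower to catch more spam
action = "flag as spam" if p_spam_given_free > THRESHOLD else "deliver"
print(f"P(spam | FREE) = {p_spam_given_free:.2f} -> {action}")
```

The tunable `THRESHOLD` constant is exactly the knob the logical rule lacks: one number controls the tradeoff between false positives and missed spam.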
Scenario 3: Weather Forecasting
Given: The barometric pressure dropped 10 mbar in the last 6 hours.
| Logic-Based Reasoning | Probabilistic Reasoning |
|---|---|
| Rule: IF PressureDrop THEN Rain | Prior: P(Rain tomorrow) from the climatological base rate |
| Conclusion: Rain = TRUE, with certainty | Update: pressure drop increases P(Rain) to 0.68 |
| Cannot express "rain is more likely but not certain" | Forecast: "68% chance of rain tomorrow" |
| All-or-nothing forecast is often wrong | Gradual update tracks real atmospheric complexity |
The three scenarios above all show the same pattern: logic concludes too strongly from weak evidence, while probability provides a calibrated conclusion.
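The weather update in Scenario 3 can be sketched in odds form, a convenient way to apply evidence that Section 7.4 will justify. Both input numbers here are illustrative assumptions, chosen so the posterior lands near the 68% in the table.

```python
# Odds-form Bayesian update for Scenario 3. Both numbers are assumed
# for illustration, not real meteorological statistics.
p_rain = 0.30           # assumed prior: climatological chance of rain
likelihood_ratio = 5.0  # assumed: a sharp pressure drop is 5x as likely
                        # on days before rain as on days before no rain

prior_odds = p_rain / (1 - p_rain)             # 0.30 -> odds of 3:7
posterior_odds = prior_odds * likelihood_ratio  # evidence scales the odds
p_rain_posterior = posterior_odds / (1 + posterior_odds)
print(f"P(rain | pressure drop) = {p_rain_posterior:.2f}")  # prints 0.68
```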
Think of a fourth scenario from everyday life (not medical, not email, not weather). Describe it in the two-column format: what would a logical system conclude, and what would a probabilistic system conclude? Which answer would be more useful?
What Probability Adds to the Toolkit
Probability is not simply "logic with fuzziness." It is a complete alternative framework for representing knowledge that offers several capabilities logic cannot provide:
1. Expressing Prior Knowledge
Before any evidence is observed, a probabilistic agent can encode prior probabilities — background rates, base frequencies, and domain knowledge.
A doctor knows that, in the general population, influenza is more common than exotic tropical diseases. A spam filter knows that, historically, about 30% of all email is spam. A self-driving car knows that pedestrians are more likely to appear at crosswalks than in the middle of a highway.
Logic has no equivalent of priors. You either have a rule or you do not.
2. Updating Beliefs Incrementally
As evidence accumulates, a probabilistic agent updates its beliefs using Bayes' theorem (Section 7.4). Each new observation nudges the probability estimate up or down — a natural model of how rational beings actually learn.
A logical agent, by contrast, either derives a conclusion or it does not. There is no notion of "mostly convinced and getting more so with each additional clue."
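The "nudging" picture can be made concrete with a short sketch. Each piece of evidence is summarized by an assumed likelihood ratio: values above 1 support the hypothesis, values below 1 count against it, and the belief drifts up and down accordingly.

```python
# Sequential updating sketch. The likelihood ratios are made-up evidence
# strengths for illustration: LR > 1 supports the hypothesis, LR < 1 opposes it.
def update(p, likelihood_ratio):
    """One Bayesian update, done in odds form."""
    odds = p / (1 - p)
    odds *= likelihood_ratio
    return odds / (1 + odds)

belief = 0.10                    # assumed prior
for lr in [4.0, 2.5, 0.5, 3.0]:  # assumed stream of evidence
    belief = update(belief, lr)
    print(f"after evidence (LR={lr}): P = {belief:.2f}")
```

Notice that the third observation (LR = 0.5) pulls the belief back down rather than breaking anything: conflicting evidence is handled by the same rule as supporting evidence.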
3. Ranking and Choosing Under Uncertainty
When multiple hypotheses are possible, probability provides a ranking: hypothesis A is most probable, followed by B, then C. An agent can choose the most probable hypothesis or, in decision-theoretic contexts, can compute the expected utility of each possible action and choose the action with the highest expected payoff.
Logic provides no basis for choosing among equally consistent conclusions.
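The decision-theoretic idea can be sketched with Scenario 1's 48% posterior. The utility numbers below are made up for illustration; the point is the mechanism, which is to weight each action's payoffs by the probabilities and pick the maximum.

```python
# Expected-utility sketch. The posterior comes from Scenario 1; the
# utility values are illustrative assumptions, not medical guidance.
p_flu = 0.48  # posterior probability of flu

# utilities[action] = (payoff if flu, payoff if not flu) -- assumed values
utilities = {
    "prescribe Tamiflu": (10.0, -2.0),  # helps if flu, minor cost if not
    "wait and retest":   (-5.0,  1.0),  # risky if flu, fine if not
}

def expected_utility(u_flu, u_not_flu, p):
    return p * u_flu + (1 - p) * u_not_flu

best = max(utilities, key=lambda a: expected_utility(*utilities[a], p_flu))
print(best)  # with these numbers: prescribe Tamiflu
```

Even though 0.48 is below 50%, the asymmetric payoffs make prescribing the better bet, which is something no true/false conclusion could express.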
4. Graceful Degradation
When a probabilistic model receives conflicting evidence, the conflicting factors pull probabilities in opposite directions — the model finds a middle ground.
When a logical agent receives a contradiction (both P and ¬P are derivable), the entire knowledge base becomes inconsistent and anything can be derived (the principle of explosion).
- **Probability Distribution**: A complete specification of the probabilities of all possible values of a random variable (or of all possible combinations of values for multiple variables). A distribution always sums (or integrates, for continuous variables) to 1.
- **Prior Probability**: The probability of a hypothesis before observing any evidence. Represents background knowledge or historical frequency. Written P(H).
- **Posterior Probability**: The probability of a hypothesis after incorporating observed evidence. Written P(H | E). The posterior is computed from the prior using Bayes' theorem.
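The "sums to 1" property is easy to check and enforce in code. The numbers below are illustrative; `normalize` shows the standard trick for turning arbitrary non-negative scores into a valid distribution.

```python
# A discrete distribution over one random variable, Weather.
# The values are illustrative; the defining property is that they sum to 1.
weather_dist = {"sun": 0.6, "rain": 0.25, "fog": 0.1, "snow": 0.05}
assert abs(sum(weather_dist.values()) - 1.0) < 1e-9  # must sum to 1

def normalize(scores):
    """Rescale non-negative scores so they form a valid distribution."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}

# Hypothetical disease scores -> a distribution that sums to 1.
print(normalize({"flu": 3.0, "covid": 2.0, "strep": 1.0}))
```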
The Full Picture: Logic as a Special Case of Probability
There is a beautiful relationship between logic and probability that is worth understanding.
Logical truth is probability 1: if a statement is a theorem of a knowledge base, its probability is 1. Logical falsehood is probability 0: if a statement is provably false, its probability is 0. Every logical rule "IF A THEN B" corresponds to the probabilistic statement P(B | A) = 1.
Probability subsumes logic: every logically certain fact is a special case of a probabilistic fact with degree of belief = 1. But probability can also represent all the messy, uncertain, real-world knowledge that logic cannot capture.
Logic and probability are not competitors. Logic is the special case of probability where all degrees of belief are 0 or 1. For idealized, fully-observable, deterministic worlds, logic suffices. For the real world — with noise, incomplete information, and inherent randomness — probability is the right tool. Modern AI uses logic for what logic is good at (formal constraints, symbolic reasoning) and probability for what probability is good at (uncertain inference from data).
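The "special case" claim can be checked with a tiny calculation: when degrees of belief are restricted to 0 and 1, the total-probability rule reproduces modus ponens.

```python
# Modus ponens as a degenerate probability calculation.
p_A = 1.0              # A is certain: P(A) = 1
p_B_given_A = 1.0      # the rule "IF A THEN B": P(B | A) = 1
p_B_given_not_A = 0.0  # irrelevant here, since P(not A) = 0

# Total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_B = p_B_given_A * p_A + p_B_given_not_A * (1 - p_A)
print(p_B)  # prints 1.0: B is certain, exactly what modus ponens concludes
```

Replace the 1.0 values with anything between 0 and 1 and the same formula keeps working, which is the sense in which probability subsumes logic.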
Preparing for Bayes' Theorem
The remaining sections of this unit put these ideas to work. Section 7.4 introduces Bayes' theorem, which is the formula for computing posterior probabilities from priors and likelihoods. Section 7.5 extends Bayesian ideas to networks of variables and text classification.
Before moving on, make sure you can answer:
- What is the difference between a prior and a posterior?
- What does it mean for a probability to be 0.5 vs. 0.95?
- Why does logic fail when applied to scenarios with inherent uncertainty?
- Can you apply the logic vs. probability contrast to a new scenario?
Based on the UC Berkeley CS 188 Online Textbook by Nikhil Sharma, Josh Hug, Jacky Liang, and Henry Zhu, licensed under CC BY-SA 4.0.
This work is licensed under CC BY-SA 4.0.