Bayes' Theorem

Bayes' theorem is a fundamental result in probability theory that describes how to update the probability of a hypothesis in light of new evidence. It is named after Thomas Bayes, an 18th-century British mathematician. Bayes' theorem is built on the concepts of conditional probability and prior probability.

Understanding Conditional Probability

Before diving into Bayes' theorem, it's important to understand conditional probability. The conditional probability of an event A given that event B has occurred is denoted by P(A|B) and is calculated as:

$$ P(A|B) = \frac{P(A \cap B)}{P(B)} $$

where:

  • ( P(A \cap B) ) is the probability that both events A and B occur.
  • ( P(B) ) is the probability that event B occurs.
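To make the definition concrete, here is a minimal Python sketch (the events and numbers are illustrative, not taken from the text above) that computes ( P(A|B) ) from a joint and a marginal probability:

```python
# Conditional probability: P(A|B) = P(A and B) / P(B)
# Illustrative events from a standard 52-card deck:
#   A = "the card is an ace", B = "the card is red"

p_a_and_b = 2 / 52   # two red aces (hearts, diamonds)
p_b = 26 / 52        # half the deck is red

p_a_given_b = p_a_and_b / p_b
print(f"P(A|B) = {p_a_given_b:.4f}")  # 0.0769, i.e. 2/26
```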

Bayes' Theorem Formula

Bayes' theorem relates the conditional and marginal probabilities of two events A and B. It is expressed as:

$$ P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} $$

where:

  • ( P(A|B) ) is the probability of A given B (posterior probability).
  • ( P(B|A) ) is the probability of B given A.
  • ( P(A) ) is the probability of A (prior probability).
  • ( P(B) ) is the probability of B.
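As a quick sketch, the formula translates directly into a small Python helper (the name `bayes_posterior` and the sample numbers are illustrative assumptions, not part of the theorem itself):

```python
def bayes_posterior(prior, likelihood, evidence):
    """Return P(A|B) given P(A), P(B|A), and P(B) via Bayes' theorem."""
    return likelihood * prior / evidence

# Arbitrary illustrative values: P(A) = 0.3, P(B|A) = 0.8, P(B) = 0.5
print(bayes_posterior(prior=0.3, likelihood=0.8, evidence=0.5))  # 0.48
```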

Law of Total Probability

To calculate ( P(B) ), we often use the law of total probability, which states that if ( {A_1, A_2, \ldots, A_n} ) is a partition of the sample space, then:

$$ P(B) = \sum_{i=1}^{n} P(B|A_i) \cdot P(A_i) $$
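In code, this sum is a one-line reduction over the partition; the sketch below uses made-up priors and likelihoods purely for illustration:

```python
def total_probability(priors, likelihoods):
    """P(B) = sum_i P(B|A_i) * P(A_i) over a partition A_1, ..., A_n."""
    return sum(l * p for l, p in zip(likelihoods, priors))

# Illustrative three-event partition (priors sum to 1).
priors      = [0.5, 0.3, 0.2]   # P(A_1), P(A_2), P(A_3)
likelihoods = [0.9, 0.6, 0.1]   # P(B|A_1), P(B|A_2), P(B|A_3)
print(total_probability(priors, likelihoods))  # 0.9*0.5 + 0.6*0.3 + 0.1*0.2 = 0.65
```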

Bayes' Theorem Expanded

Using the law of total probability, Bayes' theorem can be expanded to:

$$ P(A_k|B) = \frac{P(B|A_k) \cdot P(A_k)}{\sum_{i=1}^{n} P(B|A_i) \cdot P(A_i)} $$

where ( A_1, A_2, \ldots, A_n ) are mutually exclusive and exhaustive events.
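Putting the two pieces together, a posterior can be computed for every event in the partition at once; the following sketch reuses the illustrative numbers from the previous block:

```python
def posteriors(priors, likelihoods):
    """Return [P(A_k|B) for each k] given P(A_i) and P(B|A_i) over a partition."""
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

priors      = [0.5, 0.3, 0.2]
likelihoods = [0.9, 0.6, 0.1]
print(posteriors(priors, likelihoods))
# [0.6923..., 0.2769..., 0.0307...] -- note that the posteriors sum to 1
```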

Key Terms and Important Points

Here's a table summarizing the key terms and important points related to Bayes' theorem:

| Point | Description |
| --- | --- |
| Prior Probability | ( P(A) ) is the initial probability of an event before new evidence is considered. |
| Posterior Probability | ( P(A\|B) ) is the updated probability of the event A after the evidence B has been observed. |
| Likelihood | ( P(B\|A) ) is the probability of observing the evidence B given that A is true. |
| Evidence Probability | ( P(B) ) is the probability of the evidence under all possible hypotheses. |
| Partition | The events ( A_1, A_2, \ldots, A_n ) must be mutually exclusive and collectively exhaustive. |

Examples

Example 1: Medical Diagnosis

Suppose a certain disease affects 1% of a population. A test for the disease is 99% accurate: it gives a positive result for 99% of people who have the disease and a negative result for 99% of people who do not have the disease. If a person tests positive, what is the probability they actually have the disease?

Let's define the events:

  • ( A ) = having the disease.
  • ( B ) = testing positive.

We are given:

  • ( P(A) = 0.01 ) (prior probability of having the disease).
  • ( P(B|A) = 0.99 ) (probability of testing positive given the disease).
  • ( P(B|\neg A) = 0.01 ) (probability of testing positive given no disease).

Using Bayes' theorem:

$$ P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B|A) \cdot P(A) + P(B|\neg A) \cdot P(\neg A)} $$

$$ P(A|B) = \frac{0.99 \cdot 0.01}{0.99 \cdot 0.01 + 0.01 \cdot 0.99} $$

$$ P(A|B) = \frac{0.0099}{0.0099 + 0.0099} $$

$$ P(A|B) = \frac{0.0099}{0.0198} $$

$$ P(A|B) = 0.5 $$

So, despite the high accuracy of the test, a person who tests positive has only a 50% chance of actually having the disease due to the low prevalence of the disease in the population.
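The arithmetic above is easy to verify with a few lines of Python that simply transcribe the numbers from this example:

```python
p_disease = 0.01            # P(A): prevalence of the disease
p_pos_given_disease = 0.99  # P(B|A): positive rate among the diseased
p_pos_given_healthy = 0.01  # P(B|not A): false-positive rate

p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))            # P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(f"P(disease | positive) = {p_disease_given_pos:.4f}")  # 0.5000
```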

Example 2: Spam Email Filter

An email filter is designed to identify spam emails. Let's say:

  • 2% of all emails received are spam.
  • The filter correctly identifies 98% of spam emails (true positives).
  • The filter incorrectly marks 1% of non-spam emails as spam (false positives).

What is the probability that an email marked as spam is actually spam?

Let's define the events:

  • ( A ) = the email is spam.
  • ( B ) = the email is marked as spam.

We are given:

  • ( P(A) = 0.02 ) (prior probability of an email being spam).
  • ( P(B|A) = 0.98 ) (probability of an email being marked as spam given it is spam).
  • ( P(B|\neg A) = 0.01 ) (probability of an email being marked as spam given it is not spam).

Using Bayes' theorem:

$$ P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B|A) \cdot P(A) + P(B|\neg A) \cdot P(\neg A)} $$

$$ P(A|B) = \frac{0.98 \cdot 0.02}{0.98 \cdot 0.02 + 0.01 \cdot 0.98} $$

$$ P(A|B) = \frac{0.0196}{0.0196 + 0.0098} $$

$$ P(A|B) = \frac{0.0196}{0.0294} $$

$$ P(A|B) \approx 0.6667 $$

Thus, there is approximately a 66.67% chance that an email marked as spam by the filter is actually spam.
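The same style of check, transcribed for the spam-filter numbers:

```python
p_spam = 0.02             # P(A): fraction of emails that are spam
p_flag_given_spam = 0.98  # P(B|A): true-positive rate of the filter
p_flag_given_ham = 0.01   # P(B|not A): false-positive rate

p_flag = p_flag_given_spam * p_spam + p_flag_given_ham * (1 - p_spam)  # P(B)
p_spam_given_flag = p_flag_given_spam * p_spam / p_flag
print(f"P(spam | flagged) = {p_spam_given_flag:.4f}")  # 0.6667
```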

Bayes' theorem is a powerful tool for updating probabilities in light of new information and is widely used in fields such as statistics, machine learning, and medical testing.