Probability is a numerical measure of the likelihood that an event will occur.
A probability function (or probability measure) is a function that assigns a number between 0 and 1 to each event in the sample space, where:
Interpretation:
Examples with a fair six-sided die:
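Non-negativity: For any event $A$, $P(A) \geq 0$.
Total Probability: The probability of the sample space is 1, $P(\Omega) = 1$, where $\Omega$ denotes the sample space.
Additivity: For mutually exclusive events $A$ and $B$, $P(A \cup B) = P(A) + P(B)$.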
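The three axioms can be checked mechanically on the fair-die example. The sketch below is illustrative only: the names `SAMPLE_SPACE` and `prob` are assumptions introduced here, and events are modeled as Python sets of outcomes with every outcome equally likely.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
SAMPLE_SPACE = frozenset({1, 2, 3, 4, 5, 6})


def prob(event):
    """Probability of an event (a subset of the sample space) for a fair die."""
    event = frozenset(event)
    if not event <= SAMPLE_SPACE:
        raise ValueError("event must be a subset of the sample space")
    # Every outcome is equally likely, so count favorable outcomes.
    return Fraction(len(event), len(SAMPLE_SPACE))


even = {2, 4, 6}
one = {1}  # shares no outcome with `even`, so the two are mutually exclusive

# Non-negativity: no event has negative probability.
assert prob(even) >= 0
# Total probability: the whole sample space has probability 1.
assert prob(SAMPLE_SPACE) == 1
# Additivity: for mutually exclusive events, probabilities add.
assert even.isdisjoint(one)
assert prob(even | one) == prob(even) + prob(one)

print(prob(even), prob(one), prob(even | one))  # prints: 1/2 1/6 2/3
```

Exact fractions are used instead of floats so that the equality checks are not affected by rounding.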
Definition: A random variable is a function that assigns a value (numeric or categorical) to each outcome in the sample space.
Examples:
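Die roll - Parity: If $\Omega = \{1, 2, 3, 4, 5, 6\}$, the random variable $X$ maps each outcome to its parity: $X(\omega) = \text{even}$ for $\omega \in \{2, 4, 6\}$ and $X(\omega) = \text{odd}$ for $\omega \in \{1, 3, 5\}$.
Die roll - Numerical: Same sample space, but random variable $Y$ maps each outcome to its face value, $Y(\omega) = \omega$.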
Range and Events:
A Probability Mass Function (PMF) assigns probabilities to the values of a discrete random variable.
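Definition: For a discrete random variable $X$, the PMF gives the probability of each value $x$ that $X$ can take: $p_X(x) = P(X = x)$.
Requirements: A valid PMF must satisfy: $p_X(x) \geq 0$ for every value $x$, and $\sum_x p_X(x) = 1$.
Example: For our die parity random variable $X$: $p_X(\text{even}) = P(\{2, 4, 6\}) = \tfrac{1}{2}$ and $p_X(\text{odd}) = P(\{1, 3, 5\}) = \tfrac{1}{2}$.
Verification: Both values are non-negative, and $\tfrac{1}{2} + \tfrac{1}{2} = 1$.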
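As a rough illustration of how a PMF arises from a random variable, the sketch below treats the die-parity random variable as an ordinary function on outcomes and sums outcome probabilities per value. The names `parity`, `pmf`, and `P_OUTCOME` are assumptions made for this example, not notation from the notes.

```python
from collections import defaultdict
from fractions import Fraction

SAMPLE_SPACE = [1, 2, 3, 4, 5, 6]
# Fair die: each outcome has probability 1/6.
P_OUTCOME = {outcome: Fraction(1, 6) for outcome in SAMPLE_SPACE}


def parity(outcome):
    """Random variable mapping a die outcome to 'even' or 'odd'."""
    return "even" if outcome % 2 == 0 else "odd"


def pmf(random_variable):
    """Build a PMF by adding up the probability of every outcome per value."""
    table = defaultdict(Fraction)  # missing values start at Fraction(0)
    for outcome, p in P_OUTCOME.items():
        table[random_variable(outcome)] += p
    return dict(table)


parity_pmf = pmf(parity)

# Check the two PMF requirements: non-negative entries that sum to 1.
assert all(p >= 0 for p in parity_pmf.values())
assert sum(parity_pmf.values()) == 1

for value, p in sorted(parity_pmf.items()):
    print(value, p)
```

The same `pmf` helper works for the numerical random variable by passing the identity function, for example `pmf(lambda outcome: outcome)`.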
Consider a medical scenario where we observe a single patient:
Experiment: Examine one patient and record their disease status and fever status
Sample Space:
Disease Random Variable (
Fever Random Variable (
The joint probability distribution represents the probability of two events, described by different random variables, happening together:
So this could also be written:
Notice that the rows sum to 1
We can learn a joint probability distribution from data by counting occurrences and computing relative frequencies.
Example: Suppose we examine 10 patients and observe:
Data
Count each combination:
Compute probabilities:
Estimated Joint Distribution
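The count-then-normalize procedure can be sketched in a few lines. The ten `(disease, fever)` records below are hypothetical values invented purely for illustration; they are not the observations from the example above, and `estimate_joint` is a helper name assumed here.

```python
from collections import Counter
from fractions import Fraction

# HYPOTHETICAL (disease, fever) records for 10 patients, made up for this sketch.
records = [
    (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False),
    (False, False), (False, False), (False, False), (False, False),
]


def estimate_joint(observations):
    """Estimate a joint distribution as relative frequencies of each combination."""
    counts = Counter(observations)   # step 1: count each combination
    total = len(observations)
    # Step 2: divide each count by the number of observations.
    return {combo: Fraction(n, total) for combo, n in counts.items()}


joint = estimate_joint(records)

# All entries of a joint distribution together sum to 1.
assert sum(joint.values()) == 1

for (disease, fever), p in sorted(joint.items()):
    print(f"disease={disease!s:5} fever={fever!s:5} P={p}")
```

With so few observations the estimates are coarse, and any combination that never appears in the data is simply absent from the result, which amounts to an estimated probability of zero.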
As we move from specific examples to general probability identities, we shift our notation:
Event-based notation:
Variable-based notation:
This shift allows us to write general identities like:
Marginalization: $P(X) = \sum_{y} P(X, Y = y)$
Chain rule: $P(X, Y) = P(X \mid Y)\, P(Y)$
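Note that $P(Y) = \sum_{x} P(X = x, Y) = \sum_{x} P(Y \mid X = x)\, P(X = x)$
(by combining marginalization with the chain rule).
So Bayes rule can be expressed as: $P(X \mid Y) = \dfrac{P(Y \mid X)\, P(X)}{\sum_{x} P(Y \mid X = x)\, P(X = x)}$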
We can use Bayes rule to build a classifier:
Where
There is a serious problem with this! What is it?
P(x) = 0.2, P(~x) = 0.8
P(c | x) = 0.2, P(f | x) = 0.7, P(h | x) = 0.1
P(c | ~x) = 0.1, P(f | ~x) = 0.05, P(h | ~x) = 0.85
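One way to work with these numbers is to apply Bayes rule directly. The sketch below assumes that `x` and `~x` are the two hypotheses and that `c`, `f`, and `h` are mutually exclusive observations (each conditional row sums to 1, which is consistent with that reading). The dictionary layout and the `posterior` function are illustrative choices, not part of the notes.

```python
from fractions import Fraction as F

# Prior over the two hypotheses, taken from the numbers above.
prior = {"x": F(2, 10), "~x": F(8, 10)}

# Likelihood of each observation given each hypothesis.
likelihood = {
    "x": {"c": F(2, 10), "f": F(7, 10), "h": F(1, 10)},
    "~x": {"c": F(1, 10), "f": F(5, 100), "h": F(85, 100)},
}


def posterior(observation):
    """Bayes rule: P(hypothesis | observation) for every hypothesis."""
    # Chain rule: P(observation, hypothesis) = P(observation | hypothesis) P(hypothesis)
    joint = {h: likelihood[h][observation] * prior[h] for h in prior}
    # Marginalization: P(observation) is the sum of the joint over hypotheses.
    evidence = sum(joint.values())
    return {h: p / evidence for h, p in joint.items()}


for obs in ("c", "f", "h"):
    print(obs, posterior(obs)["x"])
# prints:
# c 1/3
# f 7/9
# h 1/35
```

Even though `c` is twice as likely under `x` as under `~x`, the posterior for `x` given `c` is only 1/3, because the prior favors `~x` four to one.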