The joint probability distribution represents the probability of two events, described by different random variables, happening together:
|
So This could also be written: |
|
|
Note that
(by combining marginalization with the chain rule)
So Bayes rule can be expressed as:
We can use Bayes rule to build a classifier:
Where
There is a serious problem with this! What is it?
P(x) = .2 P(~x) = .8 P(c | x) = .2 P(f | x) = .7 P(h | x) = .1 P(c | ~x) = .1 P(f | ~x) = .05 P(h | ~x) = .85