Introduction to Probability for ML
“Training” a Naive Bayes Classifier
- Recall the naive Bayes classifier:
- \[P(Y \mid X_1, X_2, \ldots, X_d) \propto P(Y)\prod_{i=1}^d P(X_i \mid Y)\]
- To perform classification we need:
- Class priors: \(P(Y)\)
- Class-conditional attribute probabilities: \(P(X_i \mid Y)\) for all \(i\).
- These were the tallies from our in-class exercise:
| \(Spy\) | \(Golfer\) | \(Fedora\) | Count |
|---|---|---|---|
| T | T | T | 1 |
| T | T | F | 3 |
| T | F | T | 1 |
| T | F | F | 0 |
| F | T | T | 4 |
| F | T | F | 3 |
| F | F | T | 6 |
| F | F | F | 2 |
- From this we can easily estimate our priors:
- \(P(Spy=True) = 5/20 = .25\)
- \(P(Spy=False) = 15/20 = .75\)
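
As a minimal sketch (plain Python; the `tallies` dictionary is a hypothetical encoding of the table above, not part of the notes), the priors fall out of the class counts:

```python
# Hypothetical encoding of the in-class tallies:
# (spy, golfer, fedora) -> count.
tallies = {
    (True,  True,  True):  1,
    (True,  True,  False): 3,
    (True,  False, True):  1,
    (True,  False, False): 0,
    (False, True,  True):  4,
    (False, True,  False): 3,
    (False, False, True):  6,
    (False, False, False): 2,
}

total = sum(tallies.values())                      # 20 observations
spy_count = sum(c for (spy, _, _), c in tallies.items() if spy)

prior_spy = spy_count / total                      # 5/20 = 0.25
prior_not_spy = 1 - prior_spy                      # 15/20 = 0.75
print(prior_spy, prior_not_spy)
```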
- We can also calculate the (full) class-conditional probability distributions:
| \(Golfer\) | \(Fedora\) | \(P(Golfer, Fedora \mid Spy = True)\) |
|---|---|---|
| T | T | 1/5 = .2 |
| T | F | 3/5 = .6 |
| F | T | 1/5 = .2 |
| F | F | 0/5 = 0 |
| \(Golfer\) | \(Fedora\) | \(P(Golfer, Fedora \mid Spy = False)\) |
|---|---|---|
| T | T | 4/15 \(\approx\) .27 |
| T | F | 3/15 = .2 |
| F | T | 6/15 = .4 |
| F | F | 2/15 \(\approx\) .13 |
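
Continuing the sketch above (reusing the hypothetical `tallies`), each full class-conditional entry is just a row's count divided by its class total:

```python
from collections import defaultdict

# Per-class totals: {True: 5, False: 15}.
class_totals = defaultdict(int)
for (spy, _, _), c in tallies.items():
    class_totals[spy] += c

# Full joint class-conditionals P(Golfer, Fedora | Spy).
joint_cond = {
    (golfer, fedora, spy): c / class_totals[spy]
    for (spy, golfer, fedora), c in tallies.items()
}
print(joint_cond[(True, False, True)])             # P(G=T, F=F | Spy=T) = 3/5
```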
- However, for naive Bayes classification we instead need the per-attribute conditionals:
| \(Golfer\) | \(P(Golfer \mid Spy = True)\) |
|---|---|
| T | 4/5 = .8 |
| F | 1/5 = .2 |

| \(Golfer\) | \(P(Golfer \mid Spy = False)\) |
|---|---|
| T | 7/15 \(\approx\) .47 |
| F | 8/15 \(\approx\) .53 |

| \(Fedora\) | \(P(Fedora \mid Spy = True)\) |
|---|---|
| T | 2/5 = .4 |
| F | 3/5 = .6 |
| \(Fedora\) | \(P(Fedora \mid Spy = False)\) |
|---|---|
| T | 10/15 \(\approx\) .67 |
| F | 5/15 \(\approx\) .33 |
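
Putting it together (still a sketch, reusing `tallies`, `class_totals`, and the priors from the snippets above), we can marginalize the tallies per attribute and score a new example with the naive Bayes rule:

```python
def attr_cond(attr_index, value, spy):
    """P(attribute = value | Spy = spy), estimated from the tallies.

    attr_index: 1 for Golfer, 2 for Fedora (positions in the tally key).
    """
    hits = sum(c for key, c in tallies.items()
               if key[0] == spy and key[attr_index] == value)
    return hits / class_totals[spy]

def posterior_scores(golfer, fedora):
    """Unnormalized P(Spy | Golfer, Fedora) under the naive assumption."""
    scores = {}
    for spy, prior in [(True, prior_spy), (False, prior_not_spy)]:
        scores[spy] = (prior
                       * attr_cond(1, golfer, spy)    # P(Golfer | Spy)
                       * attr_cond(2, fedora, spy))   # P(Fedora | Spy)
    return scores

# Example: a fedora-wearing golfer.
print(posterior_scores(golfer=True, fedora=True))
# {True: 0.25 * .8 * .4 = 0.08, False: 0.75 * 7/15 * 10/15 ~= 0.233}
```

Dividing each score by their sum recovers the posterior implied by the \(\propto\) in the formula at the top.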