Maximum Likelihood Learning and Logistic Regression

Bayesian Learning

Maximum a Posteriori Learning

Maximum Likelihood Learning

Turning our single layer network into a probability distribution…

Aside: Derivative of the Sigmoid/Logistic Function

The logistic function has a simple derivative:

\[\sigma'(x) = \sigma(x)(1 - \sigma(x))\]
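
The identity above can be sanity-checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `sigmoid`, `sigmoid_grad`, and `central_difference` are hypothetical, and the test points and step size `h` are arbitrary choices made for illustration.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Closed-form derivative: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Numerical derivative estimate, used only as a sanity check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Compare the closed form with the numerical estimate at a few
# arbitrarily chosen points (assumed values, for illustration only).
for x in (-3.0, 0.0, 2.5):
    assert abs(sigmoid_grad(x) - central_difference(sigmoid, x)) < 1e-8
```

The closed form reuses the value of \(\sigma(x)\) already computed in the forward pass, so the derivative costs only one extra multiplication.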

One Weird Trick…

Maximum Likelihood for Logistic Regression:

Why not just use MSE?

Minimizing Cross Entropy Loss for Logistic Regression
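
As one possible illustration of this step, here is a small sketch of batch gradient descent on the mean cross-entropy loss for a one-feature logistic regression. Everything in it is an assumption made for the example rather than the notes' own method: the function names `cross_entropy` and `fit`, the toy data, the learning rate, and the step count are all hypothetical. It relies on the standard result that the gradient of the mean cross-entropy with respect to the weight is the average of \((\sigma(wx + b) - y)\,x\).

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy(w: float, b: float, xs, ys) -> float:
    """Mean binary cross-entropy, i.e. negative log-likelihood per example."""
    eps = 1e-12  # guards against log(0); an implementation detail of this sketch
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(xs)

def fit(xs, ys, lr: float = 0.1, steps: int = 2000):
    """Batch gradient descent; lr and steps are assumed values, not tuned."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors sigma(wx + b) - y for every example.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy, deliberately non-separable data (made up for this example).
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
```

Calling `fit(xs, ys)` returns a weight and bias whose loss, as measured by `cross_entropy`, is lower than the loss at the zero initialization. Because the data are not linearly separable, the weights stay finite instead of growing without bound.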

Pros and Cons of Logistic Regression