CS 445 Machine Learning Exam 1 Review

Fall 2020

Exam Logistics

This exam is open book and open notes, but closed Internet. Complete it on your own, with no collaboration with others. Once you start the exam, you must finish it within 2 hours. The exam questions will be supplied in Canvas. Anything that is easier for you to draw can be drawn by hand (or on your computer) and attached to your Canvas submission.

Exam Topics

In general, the topics that are fair game for this exam are:
  • Any topic within the learning objectives from Week 1 to Week 5
  • Any topic or concept from the many labs we have been doing in class
  • Any topic from the assigned reading in the book
  • Any NumPy or Python idea from the assigned review and NumPy videos
The list below is a little more specific, and highlights topics that may have been discussed only in the assigned reading.
  • Optimistic and pessimistic error estimates (section 3.5.2 in IDD). We covered most of this material in class and in the labs, but we did not specifically use the formulas.
  • Feature selection -- challenges, estimates, applicability to classifiers
  • Min-max normalization
  • Curse of dimensionality
  • Classifier strengths/weaknesses with respect to:
    • online learning
    • linear decision boundaries
    • sensitivity to noise
    • sensitivity to outliers
    • features with different magnitudes
    • high dimensional data
    • memory requirement for building the classifier
    • memory requirement for running predict
    • CPU/run time performance for building the classifier, expressed in big-O notation
    • CPU/run time performance for running predict on the classifier, expressed in big-O notation
    • Ability to explain the model to a person not familiar with machine learning
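
As a refresher on min-max normalization and on why features with different magnitudes matter, here is a minimal NumPy sketch. It rescales each column to the range [0, 1] using (x - min) / (max - min). The function name min_max_normalize and the sample array are made up for illustration and are not taken from the labs or the book.

```python
import numpy as np


def min_max_normalize(X):
    """Rescale each column of X to [0, 1] (hypothetical helper for review)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    span = col_max - col_min
    # Assumption: a constant column has span 0; use 1 to avoid dividing
    # by zero, which maps every value in that column to 0.
    span[span == 0] = 1.0
    return (X - col_min) / span


# Two features on very different scales (made-up data).
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])
X_scaled = min_max_normalize(X)
print(X_scaled)
```

After scaling, both columns span the same range, so a distance-based classifier is no longer dominated by the feature with the larger raw values.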
Questions from the textbook that might be helpful for the exam:
  • 2.1 - 2.3
  • 3.1 - 3.8 (only worry about entropy, skip anything with Gini)
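
Since the textbook questions above focus on entropy, here is a small sketch for checking hand calculations. It computes the entropy of a list of class labels and the information gain of a split, using the standard definitions: entropy is the negative sum of p * log2(p) over the classes, and gain is the parent entropy minus the size-weighted child entropies. The function names and the toy labels are made up for illustration and are not taken from the book's exercises.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy in bits of a sequence of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Made-up split: 10 records, evenly balanced at the parent node.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

gain = information_gain(parent, [left, right])
print(f"parent entropy = {entropy(parent):.4f}")
print(f"information gain = {gain:.4f}")
```

A balanced two-class node has entropy 1 bit, the maximum for two classes, so any split that produces purer children yields a positive gain.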