The problem of determining whether a person is present in an image has important practical applications in many areas, including video surveillance, autonomous vehicles, and digital photography. For this project, you will use machine learning algorithms to build an image classifier that determines whether or not a particular image contains a person.
The data for this task is taken from CIFAR-100, a widely used data set whose training split contains 50,000 small (32x32 pixel) color images drawn from 100 different categories. Five of these categories represent people: men, women, boys, girls, and babies. The positive examples for our classification task are all drawn from these five categories. The negative examples are randomly selected from all of the remaining categories. The training data includes 2,500 images of people and 2,500 images that do not contain people. The testing data set contains 1,000 unlabeled images.
Here are some examples of images that contain people:
Here are some examples of images that do not contain people:
The data can be downloaded from the project Kaggle page. That page also describes the data format in more detail, and provides some sample Python code to get you started. If you prefer, here are all of the necessary files in a single .zip file: sample_solution.zip.
For full credit, you must apply at least three different learning algorithms to this problem and provide a comparison of the results. You do not need to implement all three algorithms from scratch: there are a number of mature machine learning libraries available for Python that you may use.
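As a rough illustration of such a comparison, the sketch below runs three library classifiers through 5-fold cross-validation and reports the mean accuracy of each. It assumes scikit-learn is one of the libraries you choose. The three models and their settings are arbitrary picks, not a recommendation, and the arrays X and y are random placeholders standing in for the real training images and labels, whose format is described on the Kaggle page.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (an assumption): swap in the real image features and labels.
rng = np.random.default_rng(0)
y = np.array([0] * 100 + [1] * 100)
X = rng.normal(size=(200, 20)) + y[:, None]  # shift class 1 so the toy task is learnable

# Three candidate algorithms; the hyperparameters are illustrative only.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}

results = {}
for name, model in models.items():
    # Each model is trained and scored on the same five stratified folds.
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    results[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Scoring every model on the same folds keeps the comparison fair, and the per-fold spread gives a sense of whether a difference between two models is meaningful.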
You do need to provide your own implementation of at least one learning algorithm for this problem. You are welcome to use the single layer neural network that we worked on as an in-class exercise, or you may implement something else if you prefer.
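For orientation, here is a minimal sketch of what a from-scratch single-layer network might look like: one sigmoid unit trained with batch gradient descent on the cross-entropy loss. It is not the in-class exercise itself. The function names, learning rate, and epoch count are assumptions chosen for illustration, and the toy data merely stands in for flattened image features.

```python
import numpy as np


def sigmoid(z):
    # Clip the input so np.exp does not overflow on large-magnitude values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))


def train_single_layer(X, y, lr=0.1, epochs=200, seed=0):
    """Fit one sigmoid unit with batch gradient descent on cross-entropy loss."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])  # small random initial weights
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)   # predicted probability of the positive class
        error = p - y            # gradient of the loss w.r.t. the pre-activation
        w -= lr * (X.T @ error) / len(y)
        b -= lr * error.mean()
    return w, b


def predict(X, w, b):
    # Threshold the predicted probability at 0.5 to get a 0/1 label.
    return (sigmoid(X @ w + b) >= 0.5).astype(int)


# Toy, well-separated data (an assumption) in place of real image features.
rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(-2.0, 1.0, size=(50, 4)),
                    rng.normal(2.0, 1.0, size=(50, 4))])
y_demo = np.array([0] * 50 + [1] * 50)

w, b = train_single_layer(X_demo, y_demo)
demo_accuracy = (predict(X_demo, w, b) == y_demo).mean()
print(f"training accuracy on toy data: {demo_accuracy:.2f}")
```

On real images you would likely want to scale pixel values to a small range such as [0, 1] before training, since large raw inputs can make gradient descent unstable at this learning rate.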
For full credit, you must achieve a classification rate above 70%.
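Because the test images are unlabeled, you cannot compute this rate on them yourself. One way to estimate it locally is to hold out part of the labeled training data, as sketched below. The helper names and the 20% holdout fraction are assumptions made for illustration; the 5,000 figure is the size of the training set described above.

```python
import numpy as np


def classification_rate(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float((y_true == y_pred).mean())


def holdout_split(n_examples, holdout_fraction=0.2, seed=0):
    """Return shuffled (train_idx, val_idx) index arrays with no overlap."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_examples)
    n_val = int(n_examples * holdout_fraction)
    return order[n_val:], order[:n_val]


# Split the 5,000 labeled training images into 4,000 for fitting and 1,000 for validation.
train_idx, val_idx = holdout_split(5000)

# Tiny worked example: three of the four predictions are correct.
rate = classification_rate([1, 0, 1, 1], [1, 0, 0, 1])
print(rate, rate > 0.70)
```

A rate measured on held-out data is only an estimate, so it is worth leaving some margin above 70% rather than aiming to clear the threshold exactly.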
Along with your completed Python code, you must also submit a short (2-3 page) report describing your approach to the problem and your results. Your report will be graded on the basis of content as well as style. Your writing should be clear, concise, well-organized, and grammatically correct.