Decision Tree Programming Assignment

Learning Objectives

After completing this assignment, students should be able to:

Starter Code

Download the following files:

Part 1: Warm Up

Complete the following nbgrader-based Jupyter notebook activity:

Part 2: Implementation

Complete the stubbed-out decision tree classifier in decision_tree.py so that all public methods and attributes correspond to the provided docstring comments.

You may find it helpful to use draw_tree.py to confirm that your finished classifier is working correctly.
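As a rough illustration of the kind of helper a decision tree classifier typically relies on, the sketch below computes Gini impurity and the size-weighted impurity of a candidate split. The function names gini_impurity and weighted_split_impurity are hypothetical and are not part of the starter code, and the use of Gini impurity is an assumption; the docstrings in decision_tree.py determine which criterion and interface your implementation must actually use.

```python
from collections import Counter


def gini_impurity(labels):
    """Return the Gini impurity of a sequence of class labels.

    Hypothetical helper for illustration; not part of the starter code.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    # 1 minus the sum of squared class proportions.
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def weighted_split_impurity(left_labels, right_labels):
    """Return the impurity of a binary split, weighted by child size."""
    n_left = len(left_labels)
    n_right = len(right_labels)
    total = n_left + n_right
    if total == 0:
        return 0.0
    return ((n_left / total) * gini_impurity(left_labels)
            + (n_right / total) * gini_impurity(right_labels))
```

A split that separates the classes perfectly has a weighted impurity of zero, while a split that leaves both children as mixed as the parent yields no improvement.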

Part 3: Analysis

For this part of the assignment you will apply your decision tree implementation to the problem of predicting the number of bike rentals on the basis of time of day, time of year, and weather conditions.

Submit a Jupyter notebook that satisfies the following requirements:

Each step of the notebook must be accompanied by text explaining the point of the provided Python code and discussing the results.

Data Set

The following files contain the training and test data you must use for your analysis:

The class labels are integers in the range 0-3 where

The data set includes the following 12 attributes:

This data is a processed version of the Seoul Bike Sharing Demand Data Set.
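Because the class labels are integers in the range 0-3, one way to summarize test-set results in the Part 3 notebook is a confusion matrix alongside overall accuracy. The sketch below is a minimal pure-Python version; the function names confusion_matrix and accuracy are hypothetical, and the code assumes that labels and predictions are already available as equal-length sequences of integers.

```python
def confusion_matrix(y_true, y_pred, num_classes=4):
    """Return a num_classes x num_classes table of counts.

    Rows are true labels, columns are predicted labels.
    Hypothetical helper for illustration only.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy of an empty label list")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)
```

Off-diagonal entries show which classes are being confused with one another, which is often more informative than accuracy alone when discussing results.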

Partners

Parts 1 and 2 must be completed individually.

Part 3 may be completed individually or in pairs. My expectation for pairs is that both members are actively involved and take full responsibility for all aspects of the project. In other words, I expect you to sit down and work together, not to split up tasks to be completed separately.

If you intend to work with a partner, you must inform me no later than Tuesday 2/6.

Grading

Grades will be calculated according to the following distribution.