After completing this assignment, students should be able to:
Download the following files:
Complete the following nbgrader-based Jupyter notebook activity:
Complete the stubbed-out decision tree classifier in decision_tree.py so that all public methods and attributes correspond to the provided docstring comments.
You may find it helpful to use draw_tree.py to confirm that your finished classifier is working correctly.
For this part of the assignment you will apply your decision tree implementation to the problem of predicting the number of bike rentals on the basis of time of day, time of year, and weather conditions.
For this part of the assignment you should submit a Jupyter notebook that satisfies the following requirements:
Each step of the notebook must be accompanied by text explaining the point of the provided Python code and discussing the results.
The following files contain the training and test data you must use for your analysis:
The class labels are integers in the range 0-3 where
The data set includes the following 12 attributes:
This data is a processed version of the Seoul Bike Sharing Demand Data Set.
Parts 1 and 2 must be completed individually.
Part 3 may be completed individually or in pairs. My expectation for pairs is that both members are actively involved and take full responsibility for all aspects of the project. In other words, I expect that you are sitting down together to work, not that you are splitting up tasks to be completed separately.
If you intend to work with a partner, you must inform me no later than Tuesday 2/6.
Grades will be calculated according to the following distribution.
Readability/Style 10%
Your code should follow PEP 8 conventions. It should be well documented and well organized.
Part 1 Submission 6%
Part 2 Reference Tests 54%
Part 2 Efficiency 10%
Our main concern with this assignment is clarity and correctness. That said, your implementation must be efficient enough to execute the provided testing code in no more than a second or two. This means you should avoid Python loops where possible.
Part 3 Submission 20%
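
As an illustration of the efficiency advice above, here is a minimal sketch that compares a loop-based computation with a NumPy equivalent. It makes two assumptions that the assignment does not state: that NumPy is available, and that Gini impurity is a useful example. Gini impurity is used here only because it is a common impurity measure. The function names gini_loop and gini_vectorized are hypothetical and are not part of decision_tree.py; the provided docstrings determine what your classifier must actually compute.

```python
import numpy as np


def gini_loop(labels):
    """Gini impurity using an explicit Python loop (illustrative, slower)."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def gini_vectorized(labels):
    """Gini impurity using NumPy array operations (no Python-level loop)."""
    # np.bincount counts occurrences of each non-negative integer label.
    counts = np.bincount(labels)
    proportions = counts / counts.sum()
    return 1.0 - np.sum(proportions ** 2)


# Hypothetical labels in the range 0-3, matching the class label range.
labels = np.array([0, 0, 1, 1, 2, 3, 3, 3])
print(gini_loop(labels), gini_vectorized(labels))
```

Both functions return the same value. The vectorized version pushes the counting into compiled NumPy code, which tends to matter when the same computation is repeated for many candidate splits.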