CS 445 -- Machine Learning

Semester Specifics


Course Description

Can computers learn? And if so, how can a computer learn? In this course, you will be introduced to machine learning, a technique which applies algorithms that enable systems to "learn by example". Google Search uses it to complete your query as you type, and it enables Netflix to make good movie recommendations. Machine learning is prevalent in many fields: autonomous driving, detecting credit card fraud and cyber attacks, and organizing/searching through the ever-growing set of photos on your phone. Using machine learning forgoes writing very complex hand-coded programs and instead utilizes one of a set of core learning algorithms. For example, imagine writing a traditional program to decode a zip code from a picture of an envelope. Machine learning instead takes examples of handwritten zip codes (digits) and "learns" from these examples to recognize/decode zip codes on millions of pieces of mail each day.

This course focuses on a balance of theoretical and practical knowledge. Popular methods such as neural networks, deep learning, and support vector machines will be covered. Small projects will allow students to apply these techniques and showcase their results in a quantitative manner. Data science, which includes the field of machine learning, is the fastest growing subfield in computer science (with the highest predicted job growth over the next 10 years), so I hope you will join me for this exciting course.

The math and statistics required for this course are reviewed in appendices A through D.2 of the textbook and are available here for previewing. This material will be briefly discussed in class, but in general, students will be required to study this information outside of classroom time and seek help as needed through office hours.
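To make the "learn by example" idea from the description above a little more concrete, here is a minimal sketch in Python. It is only an illustration, not course material: the 3x3 "images", the labels, and the function names are all made up for this example, and the nearest-example rule it uses is just one simple way to classify by example, chosen here for brevity.

```python
# Toy illustration of "learning by example" (all data here is hypothetical).
# Each image is a 3x3 grid flattened into 9 pixels: 1 = ink, 0 = blank.

def hamming_distance(a, b):
    """Count the pixel positions where two images differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(training_examples, image):
    """Return the label of the stored example closest to the given image."""
    _, nearest_label = min(
        training_examples,
        key=lambda example: hamming_distance(example[0], image),
    )
    return nearest_label


# Labeled examples: no rule for what a "0" or "1" looks like is ever written.
TRAINING_EXAMPLES = [
    ((1, 1, 1,
      1, 0, 1,
      1, 1, 1), "0"),
    ((0, 1, 0,
      0, 1, 0,
      0, 1, 0), "1"),
]

# A sloppy "0" with one pixel missing and a sloppy "1" with one extra pixel.
noisy_zero = (1, 1, 1,
              1, 0, 1,
              1, 1, 0)
noisy_one = (0, 1, 0,
             0, 1, 0,
             1, 1, 0)

print(predict(TRAINING_EXAMPLES, noisy_zero))
print(predict(TRAINING_EXAMPLES, noisy_one))
```

The program never states what makes a digit a "0" or a "1"; it only compares a new image against labeled examples. Real digit recognizers use far more examples and more capable models, but the workflow is the same.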

Classroom time will be dedicated 35% to lecture and 65% to in-class activities to reinforce the material.

This course will have 4 programming projects (3 small and 1 medium-sized).

Prerequisites: A grade of "C-" or better in CS 327.

The learning objectives/topics covered in this course are:

Course Material/Textbook

Introduction to Data Mining -- second edition

This course will use the 2nd edition of Introduction to Data Mining by Tan, Steinbach, Karpatne, and Kumar. Other material, videos, and papers will be assigned.

Programming Languages

All programming work for this course will utilize the Python 3 programming language, which is included on the Ubuntu Mint virtual machine image maintained by JMU's own Unix Users Group (UUG). This class will also utilize Keras for implementing neural networks. You will be expected to learn Python on your own (which is not that difficult given that you already know at least Java). Usually, the first programming assignment will be something simple, just to make sure you are familiar with the language.

Here are a few references to familiarize yourself