The goal of this course is to provide a broad introduction to the key ideas in machine learning. The emphasis will be on intuition and practical examples rather than theoretical results. Through a variety of lecture examples and programming projects, you will learn how to apply powerful machine-learning techniques to new problems, how to run evaluations and interpret results, and how to think about scaling up from thousands of data points to billions.
This class meets for one 90-minute class period each week. It includes three guided programming projects and one more open-ended final project.
All materials in this course are posted on GitHub in the form of Jupyter notebooks.
Course Prerequisites
Core data science courses: research design, storing and retrieving data, exploring and analyzing data.
Undergraduate-level probability and statistics. Linear algebra is recommended.
Programming Prerequisites
Python (v3). We will primarily be using numpy and scikit-learn.
Jupyter and JupyterLab notebooks. You can install them on your computer using pip or Anaconda. More information here.
Git(Hub), including clone/commit/push from the command line. You can sign up for an account here.
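One way to confirm that the Python prerequisites above are installed is a quick import check. The sketch below is a minimal example, not an official course script: the helper name `missing_packages` and the `REQUIRED` list are hypothetical, and the list uses import names (`sklearn` for scikit-learn), which differ from the names used with pip.

```python
import importlib.util

# Import names (not pip names) of the tools listed in the prerequisites.
# This list is an assumption for illustration; adjust it to your setup.
REQUIRED = ["numpy", "sklearn", "jupyterlab"]


def missing_packages(names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If anything is reported missing, install it with pip or Anaconda and run the check again.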
OS
macOS, Windows, and Linux are all acceptable.
Textbook
Assignments
Final Project
Week | Lecture | Lecture Materials | Deadlines (6:30 pm PT) |
---|---|---|---|
Supervised Learning | |||
05/04 - 05/10 | Introduction | Week 1 | |
05/11 - 05/17 | Nearest neighbors | Week 2 | |
05/18 - 05/24 | Naive Bayes | Week 3 | Project 1 (Part 1-5) |
05/25 - 05/31 | Decision trees | Week 4 | |
06/01 - 06/07 | Cross-validation and Ensemble learning | Week 5 | Project 1 (Part 6-11) |
06/08 - 06/14 | Regression analysis | Week 6 | Final project: group and dataset |
06/15 - 06/21 | Neural networks | Week 7 | |
06/22 - 06/28 | Support vector machines | Week 8 | |
Unsupervised Learning | |||
06/29 - 07/05 | Cluster analysis | Week 9 | Project 2 |
07/06 - 07/12 | Gaussian mixture models | Week 10 | Final project: baseline presentation |
07/13 - 07/19 | Dimensionality reduction | Week 11 | |
Other Topics | |||
07/20 - 07/26 | Network analysis | Week 12 | |
07/27 - 08/02 | Recommender systems | Week 13 | Project 3 |
08/03 - 08/09 | Wrap-up | Week 14 | Final project: code and presentation |
For the final project, you will form a group: 3-4 people is ideal and 2-5 people are allowed, but one-person groups are not allowed (DON'T ASK).
Your group can only include members from the section in which you are enrolled.
You will pick your own dataset, but I will also provide some suggestions.
Deadlines to remember:
A few project ideas:
PowerPoint slide ideas (feel free to use Jupyter Notebook Slides or other presentation tools):
A GitHub Classroom link will be provided for each project. When you click on the link, you will create a private repo (I already have admin rights).
Once you are ready to submit, commit and push changes to your private repo.
In ISVC, submit only the link to your private repo (DO NOT upload the Jupyter Notebook).
Links:
Participation | 5% |
Programming projects | 20% (x3) |
Final project | 35% |
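The weights above add up to 100% (5 + 3 × 20 + 35). As a rough illustration of how they might combine, the sketch below computes a weighted sum. It assumes each component is scored on a 0-100 scale and that the overall grade is a plain weighted average; the syllabus states only the percentages, and the function name, dictionary keys, and sample scores are hypothetical.

```python
# Weights taken from the grading table; the key names are illustrative.
WEIGHTS = {
    "participation": 0.05,
    "project_1": 0.20,
    "project_2": 0.20,
    "project_3": 0.20,
    "final_project": 0.35,
}


def course_grade(scores):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Made-up scores, used only to show the arithmetic.
    example = {
        "participation": 100,
        "project_1": 90,
        "project_2": 80,
        "project_3": 85,
        "final_project": 88,
    }
    print(round(course_grade(example), 2))
```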