This course provides a practical introduction to the rapidly growing field of machine learning: training predictive models that generalize to new data. We start with linear and logistic regression and implement gradient descent, the core engine for training these algorithms. With these building blocks in place, we work our way up to widely used neural network architectures, focusing on intuition and implementation with TensorFlow/Keras. While the course centers on neural networks, we also cover key ideas in unsupervised learning and nonparametric modeling.
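To give a flavor of what we build in the first weeks, here is a minimal sketch of gradient descent for linear regression in plain NumPy (an illustration only, not course code; the data, variable names, and learning rate are mine):

```python
import numpy as np

# Synthetic data: one feature, true weight 3 and bias 1, no noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + 1.0

w, b = 0.0, 0.0   # parameters to learn
lr = 0.1          # learning rate

for _ in range(500):
    y_hat = w * X[:, 0] + b              # predictions
    err = y_hat - y
    grad_w = 2 * np.mean(err * X[:, 0])  # d(MSE)/dw
    grad_b = 2 * np.mean(err)            # d(MSE)/db
    w -= lr * grad_w                     # step downhill
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # converges near (3.0, 1.0)
```

In the course we implement the same loop, then let TensorFlow/Keras compute the gradients for us.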
Along the way, weekly short coding assignments and a midterm exam will connect lectures with concrete data and real applications. A more open-ended final project will tie together crucial concepts in experimental design and analysis with models and training.
This class meets for one 90-minute class period each week.
All materials for this course are posted on GitHub in the form of Jupyter notebooks.
Course Prerequisites
Core data science courses: research design, storing and retrieving data, exploring and analyzing data.
Undergraduate-level probability and statistics. Linear algebra is recommended.
Programming Prerequisites
Python (v3).
Jupyter and JupyterLab notebooks. You can install them on your computer using pip or Anaconda. More information here.
Git(Hub), including clone/commit/push from the command line. You can sign up for an account here.
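A typical setup looks like the following (a sketch only: the pip route is one of the two options above, and the repository URL and file names are placeholders, not the real course repo):

```shell
# Install JupyterLab (includes the classic notebook interface):
pip install jupyterlab

# Clone the course materials, then commit and push your own work.
# <org>/<course-repo> and assignment1.ipynb are placeholders:
git clone https://github.com/<org>/<course-repo>.git
cd <course-repo>
git add assignment1.ipynb
git commit -m "Complete assignment 1"
git push origin main
```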
If you have a Mac with an Apple M1 chip, this .sh script will install everything for you (credit goes to one of my former students, Kevin Stallone).
OS
Mac/Windows/Linux are all acceptable to use.
Textbook
Assignments
Midterm exam
Final Project
| Week | Lecture | Lecture Materials | Readings | Deadlines (Sunday of the week, 11:59 pm PT) |
|---|---|---|---|---|
| **Supervised and Unsupervised Learning** | | | | |
| Aug 28-Sept 03 | Introduction and Framing | Week 01 | | |
| Sept 04-10 | Linear Regression - Gradient Descent | Week 02 | RM (10, 13 - intro to TensorFlow only), feature scaling, more math (1) | Assignment 1 |
| Sept 11-17 | Linear Regression - Feature Engineering | Week 03 | RM (4, 2), Ilin et al. (2021) | Assignment 2 |
| Sept 18-24 | Logistic Regression - Binary | Week 04 | RM (3, 6 (pp. 211-219)), more math (2) | Assignment 3; group, question, and dataset for final project |
| Sept 25-Oct 01 | Logistic Regression - Multiclass | Week 05 | RM (3, 6 (pp. 211-219)), more intuition | Assignment 4 |
| Oct 02-08 | Feedforward Neural Networks | Week 06 | RM (12, 13, 14), activation functions, regularization | Assignment 5 |
| Oct 09-15 | KNN, Decision Trees, and Ensembles | Week 07 | RM (3, 7), Psaltos et al. (2022) | Assignment 6; midterm exam |
| Oct 16-22 | Unsupervised Learning: K-Means and PCA; Project: baseline presentation | Week 08 | RM (11) | Assignment 7; baseline presentation: slides |
| Oct 23-29 | Embeddings for Text | Week 09 | RM (8, 16) | Assignment 8 |
| Oct 30-Nov 05 | Convolutional Neural Networks | Week 10 | RM (15), 1D CNN intuition, Yoon Kim (2014) | Assignment 9 |
| Nov 06-12 | Fall Break | | | |
| Nov 13-19 | Network Architecture and Debugging ML Algorithms | Week 11 | Andrew Ng's advice for Applying ML | Assignment 10 |
| Nov 20-26 | Thanksgiving Break | | | |
| Nov 27-Dec 03 | Fairness in ML | Week 12 | Suresh and Guttag (2021) | |
| Dec 04-10 | Advanced Topics: RNNs/LSTMs, Transformers, BERT | Week 13 | Raschka et al., ch. 16 (2022) | |
| Dec 11-17 | Project: final presentation | | | Final presentation: slides and code |
How do I take the exam?
What is the best way to prepare for the exam?
Can I use ChatGPT?
How can I see my grade?
What is the best way to access the exam solutions?
For the final project you will form a group (3-4 people are ideal). Grades will be calibrated by group size and individual contributions. Your group can only include members from the section in which you are enrolled.
Do not just re-run an existing code repository; at a minimum, you must demonstrate the ability to perform thoughtful data preprocessing and analysis (e.g., data cleaning, model training, hyperparameter selection, model evaluation).
The topic of your project is flexible (see some project ideas below).
Deadlines to remember:
A few project ideas (from my Summer 2022 students):
Baseline presentation. Your slides should include:
Final presentation. Your slides should include:
| Component | Weight |
|---|---|
| Participation | 5% |
| Assignments | 45% |
| Midterm | 20% |
| Final project | 30% |
Integrating a diverse set of experiences is important for a more comprehensive understanding of machine learning. I will make an effort to read papers by, and hear from, a diverse group of practitioners; still, limits exist on this diversity in the field of machine learning. I acknowledge that there may be both overt and covert biases in the material due to the lens with which it was created. I would like to nurture a learning environment that supports a diversity of thoughts, perspectives, and experiences, and honors your identities (including race, gender, class, sexuality, religion, ability, veteran status, etc.) in the spirit of the UC Berkeley Principles of Community.
To help accomplish this, please contact me or submit anonymous feedback through I School channels if you have any suggestions to improve the quality of the course. If you have a name and/or set of pronouns that you prefer I use, please let me know. If something was said in class (by anyone) or you experience anything that makes you feel uncomfortable, please talk to me about it. If you feel like your performance in the class is being impacted by experiences outside of class, please don't hesitate to come and talk with me. I want to be a resource for you. Also, anonymous feedback is always an option, and may lead me to make a general announcement to the class, if necessary, to address your concerns.
As a participant in teamwork and course discussions, you should also strive to honor the diversity of your classmates.
If you prefer to speak with someone outside of the course, MICS Academic Director Lisa Ho, I School Assistant Dean of Academic Programs Catherine Cronquist Browning, and the UC Berkeley Office for Graduate Diversity are excellent resources. Also see the following link.