Estimation, Machine Learning, and Testing

This course will provide you with the intuition and skills required to design, implement, test and validate a variety of models for supervised learning. To introduce the course, we will cover the basics of statistical learning, including modelling for prediction versus inference, the trade-off between prediction accuracy and model interpretability, and the all-important bias-variance trade-off. Each section of this course will cover a unique set of methods used for supervised learning on a data set.

Requirements

This course is designed for those who have a degree in something other than Computer Science/Statistics and are looking to enhance their data science skills for their career.

Learning Outcomes

  • The ability to understand, implement and interpret the results from several supervised learning approaches for regression and classification
  • The ability to utilize resampling methods when appropriate to extract more information from a data set and to choose the best model
  • The ability to perform exploratory data analysis for unsupervised learning
  • The ability to understand what is required for reproducible machine learning
  • The ability to appreciate the uncertainties associated with model results and the ethical consequences of acting on these results
  • The ability to recognize who matters in our models

Delivery Format and Schedule

Online for 7 hours/week for 3 weeks (21 hours in total).

2023 Dates

  • Tuesday 21 February, 6pm-8pm: Introduction; linear regression (simple and multiple, others)
  • Thursday 23 February, 6pm-8pm: Classification (logistic regression, generative models, etc.)
  • Saturday 25 February, 9am-noon: Resampling methods; linear model selection and regularization
  • Monday 27 February, 6pm-8pm: Beyond linearity (regressions, step functions, generalized additive models)
  • Thursday 2 March, 6pm-8pm: Tree-based methods (decision tree; bagging, random forest, etc.)
  • Saturday 4 March, 9am-noon: Support vector machines (SVM); survival analysis (censored data)
  • Monday 6 March, 6pm-8pm: Principal components analysis; reproducibility; ethics; inequity
  • Thursday 9 March, 6pm-8pm: Professional skills: Industry case study
  • Saturday 11 March, 9am-noon: Estimation, Testing, and Learning: Review and Practice