Bayesian methods are powerful tools for data science applications, complementing traditional statistical and machine learning methods. Importantly, Bayesian models generate predictions and inferences that fully account for uncertainty. The main computational tool for conducting Bayesian analysis is Markov chain Monte Carlo (MCMC), a computationally intensive numerical approach that allows a wide variety of models to be estimated. MCMC algorithms are available in several Python libraries, including PyMC3. Using real-world examples, I will teach a practical, effective workflow for applying Bayesian statistics with MCMC in PyMC3.
This course is intended for analysts, data scientists, and machine learning practitioners. Anyone looking for effective ways of making predictions and drawing inferences from datasets should find it useful. The material assumes an intermediate level of Python familiarity; ideally, attendees should also be familiar with NumPy and Jupyter. No statistical background is expected.
The key learnings from the course will be:
- What PyMC3 is for
- What MCMC is and why you should care
- Enough Theano to not be scared by it
- How to diagnose model convergence and judge whether your model is any good
- An introduction to multilevel models, the secret sauce of Bayesian machine learning
- Four real-world case studies: sports analytics, A/B testing, policy analysis, and predicting process anomalies
- Exclusive walkthroughs of other Bayesian machine learning tools such as ArviZ, Rainier, and Pyro
Having completed the course, students should be able to build basic Bayesian statistical models using their own data, validate those models, and interpret their output.
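To give a flavour of what MCMC does under the hood, here is a minimal sketch of a random-walk Metropolis sampler written with only the Python standard library. It is not course material and it is not how PyMC3 samples in practice; the coin-flip data (7 heads in 10 flips), the flat prior, the step size, and the function names `log_posterior` and `metropolis` are all assumptions made for illustration.

```python
import math
import random
import statistics


def log_posterior(theta, heads, flips):
    """Unnormalised log posterior of a coin's bias under a flat Beta(1, 1) prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # no posterior mass outside (0, 1)
    return heads * math.log(theta) + (flips - heads) * math.log(1.0 - theta)


def metropolis(heads, flips, n_samples=20000, step=0.3, seed=42):
    """Random-walk Metropolis sampler; returns every visited value of theta."""
    rng = random.Random(seed)
    theta = 0.5  # arbitrary starting point inside (0, 1)
    current_lp = log_posterior(theta, heads, flips)
    samples = []
    for _ in range(n_samples):
        # Propose a nearby value by adding Gaussian noise.
        proposal = theta + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, heads, flips)
        # Accept with probability min(1, posterior ratio), computed on the log scale.
        # 1.0 - rng.random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            theta, current_lp = proposal, proposal_lp
        samples.append(theta)
    return samples


# Hypothetical data: 7 heads in 10 flips.
samples = metropolis(heads=7, flips=10)
kept = samples[2000:]  # drop the early warm-up draws
posterior_mean = statistics.mean(kept)
print(f"Estimated posterior mean of theta: {posterior_mean:.3f}")
# For comparison, the exact posterior here is Beta(8, 4), whose mean is 8/12.
```

This toy problem has a known analytic answer, which makes it easy to check that the sampler is behaving. Libraries such as PyMC3 earn their keep on models where no such closed form exists, and they use considerably more efficient samplers than this one.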