Machine Learning and Nonparametric Bayes Tutorial
This tutorial took place as part of the short courses at the ISBA 2016 World Meeting, held at the University of Cagliari in Sardinia, Italy. See this link for the latest versions and videos of the Nonparametric Bayes parts of this tutorial.
Sunday, June 12
Machine Learning (Peter Orbanz): 9:00 AM–12:00 PM
Nonparametric Bayes Part I (Tamara Broderick): 2:00–3:00 PM
Nonparametric Bayes Part II (Tamara Broderick): 3:15–5:00 PM
Instructors:
Professor Peter Orbanz and Professor Tamara Broderick
Description
This tutorial consists of two parts: one on Machine Learning and one on Bayesian Nonparametrics.
The machine learning part of the tutorial will give an overview of some widely used methods, such as neural networks, support vector machines (SVMs), and random forests, and of how they relate to one another.
Nonparametric Bayesian methods make use of infinite-dimensional mathematical structures to allow the practitioner to learn more from their data as the size of their data set grows. What does that mean, and how does it work in practice? In this part of the tutorial, we'll cover why machine learning and statistics need more than just parametric Bayesian inference. We'll introduce such foundational nonparametric Bayesian models as the Dirichlet process and Chinese restaurant process and touch on the wide variety of models available in nonparametric Bayes. Along the way, we'll see what exactly nonparametric Bayesian methods are and what they accomplish.
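To make the Chinese restaurant process concrete, here is a minimal simulation sketch (not part of the tutorial materials; the function name, the NumPy dependency, and the concentration parameter alpha are assumptions for illustration). Each arriving customer joins an occupied table with probability proportional to that table's current occupancy, or starts a new table with probability proportional to alpha:

```python
import numpy as np

def crp_sample(n, alpha, rng=None):
    """Seat n customers via a Chinese restaurant process with
    concentration alpha; return each customer's table index."""
    rng = np.random.default_rng() if rng is None else rng
    counts = []        # counts[k] = customers currently at table k
    assignments = []
    for i in range(n):
        # Existing table k w.p. counts[k] / (i + alpha);
        # new table w.p. alpha / (i + alpha).
        probs = np.array(counts + [alpha]) / (i + alpha)
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)   # open a new table
        else:
            counts[k] += 1
        assignments.append(k)
    return assignments
```

The number of occupied tables keeps growing with n (roughly like alpha log n), which is one precise sense in which such models let you learn more structure as the data set grows.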
Materials
Machine Learning
Nonparametric Bayes
- README for demos
- [Slides for Part I]
- Demo 1 [code]: Beta random variable and random distribution intuition
- Demo 2 [code]: Dirichlet random variable and random distribution intuition
- Demo 3 [code]: K large relative to N intuition; empty components
- Demo 4 [code]: K large relative to N intuition; growth of number of clusters
- [Slides for Part II]
- Demo 5 [code]: GEM random distribution intuition (a stick-breaking sketch appears after this list)
- Demo 6 [code]: An exact DPMM simulator
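As a companion to Demo 5, the following is a minimal stick-breaking sketch (not one of the linked demos; the function name, the truncation level k_max, and the NumPy dependency are assumptions). It draws the first k_max weights of a GEM(alpha) random distribution:

```python
import numpy as np

def gem_weights(alpha, k_max, rng=None):
    """Draw the first k_max GEM(alpha) weights by stick-breaking:
    beta_k ~ Beta(1, alpha), w_k = beta_k * prod_{j<k} (1 - beta_j)."""
    rng = np.random.default_rng() if rng is None else rng
    betas = rng.beta(1.0, alpha, size=k_max)
    # Stick length remaining before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    return betas * remaining
```

The returned weights sum to just under one; the missing mass is the unbroken remainder of the stick, which would be split among the infinitely many remaining components.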
Prerequisites
- Know what a prior, likelihood, and posterior are.
- Know how to use Bayes' Theorem to calculate a posterior for both discrete and continuous parametric distributions (a worked Beta-Bernoulli example appears after this list).
- Understand what a generative model is.
- Have a basic idea of what Gibbs sampling is and when it is useful (at least check out the Wikipedia article in advance; a toy sampler sketch also appears below).
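For the Bayes' Theorem prerequisite, a short conjugate-update self-check may help; the Beta prior hyperparameters and the toy coin-flip data below are invented for illustration:

```python
import numpy as np

# Beta-Bernoulli conjugacy: with prior theta ~ Beta(a, b) and flips
# x_1, ..., x_n, the posterior is Beta(a + sum(x), b + n - sum(x)).
a, b = 2.0, 2.0                    # assumed prior hyperparameters
x = np.array([1, 0, 1, 1, 0, 1])   # toy data: six coin flips
a_post = a + x.sum()
b_post = b + len(x) - x.sum()
print(f"posterior: Beta({a_post}, {b_post})")
print("posterior mean:", a_post / (a_post + b_post))
```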
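For the Gibbs sampling prerequisite, here is a toy sampler sketch under an assumed target (a zero-mean, unit-variance bivariate normal with correlation rho; none of this comes from the tutorial materials). Each coordinate is resampled from its full conditional, which is itself normal:

```python
import numpy as np

def gibbs_bivariate_normal(n_iter, rho, rng=None):
    """Gibbs sampler for a zero-mean bivariate normal with unit
    variances and correlation rho, alternating full conditionals."""
    rng = np.random.default_rng() if rng is None else rng
    x, y = 0.0, 0.0
    cond_sd = np.sqrt(1.0 - rho**2)
    samples = np.empty((n_iter, 2))
    for t in range(n_iter):
        x = rng.normal(rho * y, cond_sd)  # x | y ~ N(rho*y, 1-rho^2)
        y = rng.normal(rho * x, cond_sd)  # y | x ~ N(rho*x, 1-rho^2)
        samples[t] = (x, y)
    return samples
```

Gibbs sampling is useful exactly when, as here, the joint distribution is hard to sample directly but each full conditional is easy.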
What we won't cover
Gaussian processes are an important branch of nonparametric Bayesian modeling, but we won't have time to cover them here. We'll be focusing on the discrete, or Poisson point process, side of nonparametric Bayesian inference.