6.7830 Bayesian Modeling and Inference
Times: Tuesday, Thursday 2:30–4:00 PM
First class: Tuesday, February 7
Professor Tamara Broderick
Office Hours: Thursdays, 4–5pm
Office Hours: Tuesdays, 4–5pm
As both the number and size of data sets grow, practitioners are interested in learning increasingly complex information and interactions from data. Probabilistic modeling in general, and Bayesian approaches in particular, provide a unifying framework for flexible modeling that includes prediction, estimation, and coherent uncertainty quantification. In this course, we will cover modern challenges of Bayesian inference, including (but not limited to) model construction, handling large or complex data sets, and the speed and quality of approximate inference. We will study Bayesian nonparametric methods, wherein model complexity grows with the size of the data; these methods allow us to learn, e.g., a greater diversity of topics as we read more documents from Wikipedia, identify more friend groups as we process more of Facebook's network structure, etc.
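As a rough illustration of "model complexity grows with the size of the data," the following is a minimal sketch of a Chinese restaurant process simulation, in which the number of occupied tables (clusters) can only increase as more customers (data points) arrive. The function name `crp_num_tables` and the parameters `alpha` and `seed` are illustrative assumptions, not part of the course materials.

```python
import random


def crp_num_tables(n_customers, alpha=1.0, seed=0):
    """Simulate a Chinese restaurant process and return the number of occupied tables.

    alpha is the (assumed) concentration parameter; larger values favor new tables.
    """
    rng = random.Random(seed)
    table_sizes = []  # table_sizes[k] = number of customers seated at table k
    for i in range(n_customers):
        # With i customers already seated, the next customer joins table k with
        # probability table_sizes[k] / (i + alpha) and opens a new table with
        # probability alpha / (i + alpha).
        u = rng.uniform(0, i + alpha)
        cumulative = 0.0
        for k, size in enumerate(table_sizes):
            cumulative += size
            if u < cumulative:
                table_sizes[k] += 1
                break
        else:
            # No existing table was chosen (or none exist yet): open a new one.
            table_sizes.append(1)
    return len(table_sizes)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, crp_num_tables(n))
```

Because the same seed is reused, each longer run extends the shorter one, so the table count is nondecreasing in the number of customers; under this process it is expected to grow roughly logarithmically.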
Our course Piazza page is here: https://piazza.com/mit/spring2023/67830
All communication about the course will be through Piazza, so make sure to sign up there as soon as possible.
Note that this class is heavily based on discussion, reading research papers, and active student participation.
Nothing will be formally due or graded during the first week of class.
This course will cover Bayesian modeling and inference at an advanced graduate level. A tentative list of topics (which may change depending on our interests) is as follows:
- Introduction to Bayesian inference; motivations from de Finetti, decision theory, etc.
- Hierarchical modeling, including popular models such as latent Dirichlet allocation
- Approximate posterior inference
  - Variational inference, mean-field, stochastic variational inference, challenges/limitations of VI, etc.
  - Monte Carlo, avoiding random-walk behavior, Hamiltonian Monte Carlo/NUTS/Stan, etc.
- Evaluation, sensitivity, robustness
- Bayesian nonparametrics: why and how
  - Mixture models, admixtures, Dirichlet process, Chinese restaurant process
  - Learning functions, Gaussian processes
- Probabilistic numerics
- Bayesian optimization
Requirements: Graduate-level familiarity with machine learning/statistics and probability is required; e.g., at MIT, 6.7800 or 6.7810 or [6.7900 and 6.7700]. (In the old numbering scheme, these courses would be 6.437 or 6.438 or [6.867 and 6.436].) Past students have found Algorithms for Inference, 6.7810 (formerly 6.438), particularly useful as a prerequisite.
We will assume familiarity with graphical models, exponential families, finite-dimensional Gaussian mixture models, expectation maximization, linear and logistic regression, and hidden Markov models. You can find a "problem set 0" on the Piazza page to help you gauge your background; it is not graded, but you should be very comfortable solving its questions before you take this course.