Introduction
A typical machine learning task can be viewed as a three-stage process.

(1) Collect data and transform it into a computationally manageable resource.
(2) Learn a representation or model of the process that generated the data.
(3) Use the learned model to perform prediction or inference tasks.

As a concrete example, consider a classification problem: suppose we are given a collection of emails, each of which is labeled as spam or not spam. That is, our data consists of correctly labeled emails. Our goal, given an unseen email, is to predict whether or not it is spam. To make this prediction, we will first use the labeled data to build a model of what it means for an email to be spam. Then, we will use this model to predict the label of a new, unseen email.
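The three stages above can be sketched in a few lines of code. This is a minimal, illustrative naive Bayes spam classifier; the tiny data set, the word-level features, and add-one smoothing are assumptions made for the sketch, not part of the course material.

```python
import math
from collections import Counter

# (1) Data: emails tokenized into words, each labeled spam or ham (not spam).
# This toy training set is made up purely for illustration.
train = [
    ("buy cheap pills now", "spam"),
    ("cheap meds buy now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday sounds good", "ham"),
]

# (2) Learn: estimate per-class word frequencies with add-one smoothing.
counts = {"spam": Counter(), "ham": Counter()}
labels = Counter()
for text, label in train:
    labels[label] += 1
    counts[label].update(text.split())
vocab = {w for c in counts.values() for w in c}

def log_posterior(text, label):
    """Unnormalized log posterior of `label` for the given email text."""
    total = sum(counts[label].values())
    lp = math.log(labels[label] / sum(labels.values()))
    for w in text.split():
        lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
    return lp

# (3) Predict: label a new email with the higher-scoring class.
def classify(text):
    return max(("spam", "ham"), key=lambda lab: log_posterior(text, lab))
```

For instance, `classify("buy cheap pills")` returns `"spam"` on this toy data, while `classify("monday meeting agenda")` returns `"ham"`.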

In this collection of notes, we will study graphical models, a compact way to represent distributions over certain kinds of structured data. To learn graphical models efficiently in a variety of settings, we will need a solid understanding of both exact and approximate inference techniques.
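To make the compactness claim concrete, here is a minimal numeric sketch of a chain-structured distribution p(a, b, c) = p(a) p(b | a) p(c | b) over three binary variables; the particular numbers in the tables are invented for illustration.

```python
import itertools

# Conditional probability tables for the chain a -> b -> c.
# All values here are arbitrary, chosen only so each row sums to one.
p_a = {0: 0.6, 1: 0.4}                       # p(a)
p_b_given_a = {0: {0: 0.7, 1: 0.3},          # p(b | a)
               1: {0: 0.2, 1: 0.8}}
p_c_given_b = {0: {0: 0.9, 1: 0.1},          # p(c | b)
               1: {0: 0.5, 1: 0.5}}

def joint(a, b, c):
    """Joint probability from the factored (chain) representation."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

# The factored form has 1 + 2 + 2 = 5 free parameters versus the
# 2**3 - 1 = 7 of a full joint table; the gap grows exponentially
# with the number of variables.
total = sum(joint(a, b, c)
            for a, b, c in itertools.product((0, 1), repeat=3))
```

Summing `joint` over all eight assignments confirms the factored distribution is properly normalized (`total` is 1).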

These notes were compiled as part of "CS 6347: Statistical Methods in Artificial Intelligence and Machine Learning" at the University of Texas at Dallas. The aim was to post a set of comprehensive notes that, although terse in parts, would better align with the presentation of the course material than the currently available textbooks. The text has been augmented with interactive exercises and examples, and the reader is encouraged to attempt the solutions before viewing the answers.

Our hope is that this resource will grow and be refined as the course is taught in subsequent years. Questions, comments, and corrections are welcome and should be directed to the author at nicholas.ruozzi (at) utdallas (dot) edu.

Creative Commons License