In class on Wednesday I briefly described a two-component mixture model and I “live-coded” an R implementation of a Gibbs sampler for performing posterior inference. In this post I’m just going to revisit these same points.
A Gaussian mixture model has a density function that is a weighted sum of Gaussian density functions, with weights that are probabilities summing to one; the components can be multivariate Gaussians, but here we will consider univariate random variables. For a two-component mixture, the density function is

$$f(y) = \pi \, \mathcal{N}(y \mid \mu_1, \sigma_1^2) + (1 - \pi) \, \mathcal{N}(y \mid \mu_2, \sigma_2^2),$$

where $\mathcal{N}(y \mid \mu_k, \sigma_k^2)$ is the Gaussian pdf with mean $\mu_k$ and variance $\sigma_k^2$. Each of the individual density functions is referred to as a “component” of the mixture.
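The post's implementation is in R, but as a quick sketch of this density (here in Python with NumPy/SciPy, and with made-up parameter values), one might write:

```python
import numpy as np
from scipy.stats import norm

def mixture_pdf(y, pi, mu1, s1, mu2, s2):
    """Two-component Gaussian mixture density: a convex combination of
    two normal pdfs (s1 and s2 are variances, not standard deviations)."""
    return (pi * norm.pdf(y, mu1, np.sqrt(s1))
            + (1 - pi) * norm.pdf(y, mu2, np.sqrt(s2)))

# Illustrative (made-up) parameter values: weight 0.3 on component one.
y = np.linspace(-4.0, 10.0, 7)
dens = mixture_pdf(y, pi=0.3, mu1=0.0, s1=1.0, mu2=5.0, s2=4.0)
```

Because the weights sum to one, the result is itself a proper density (nonnegative, integrating to one).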
One can (should?) think of the random variable as being drawn compositionally, first drawing a (“latent”) indicator $\Delta_i \in \{0, 1\}$ and then drawing $Y_i$ from the Gaussian distribution whose parameters are selected by $\Delta_i$. That is,

$$Y_i = \Delta_i X_{1i} + (1 - \Delta_i) X_{2i},$$

where $\Delta_i \sim \text{Bernoulli}(\pi)$, $X_{1i} \sim \mathcal{N}(\mu_1, \sigma_1^2)$, and $X_{2i} \sim \mathcal{N}(\mu_2, \sigma_2^2)$.
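This compositional story doubles as a simulation recipe. A minimal sketch (Python rather than the post's R, with hypothetical parameter values):

```python
import numpy as np

def sample_mixture(n, pi, mu1, s1, mu2, s2, rng):
    """Draw n mixture variates compositionally: the indicator first, then
    the corresponding component (s1, s2 are variances). For simplicity this
    draws from both components and keeps one per observation."""
    delta = rng.random(n) < pi             # Delta_i ~ Bernoulli(pi)
    x1 = rng.normal(mu1, np.sqrt(s1), n)   # X_1i ~ N(mu1, s1)
    x2 = rng.normal(mu2, np.sqrt(s2), n)   # X_2i ~ N(mu2, s2)
    return np.where(delta, x1, x2), delta

rng = np.random.default_rng(1)
y, delta = sample_mixture(10_000, 0.3, 0.0, 1.0, 5.0, 4.0, rng)
```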
Note: beware that this is not the same as $W_i = \pi X_{1i} + (1 - \pi) X_{2i}$, which is normally distributed, while $Y_i$ has the mixture density $f$ above.
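One concrete way to see the difference: writing $X_{1i} \sim \mathcal{N}(\mu_1, \sigma_1^2)$ and $X_{2i} \sim \mathcal{N}(\mu_2, \sigma_2^2)$ for the two component draws, the convex combination $W_i = \pi X_{1i} + (1 - \pi) X_{2i}$ has the same mean as the mixture draw $Y_i$, namely $\pi \mu_1 + (1 - \pi) \mu_2$, but the variances differ:

$$\operatorname{Var}(Y_i) = \pi \sigma_1^2 + (1 - \pi) \sigma_2^2 + \pi (1 - \pi) (\mu_1 - \mu_2)^2,$$

$$\operatorname{Var}(W_i) = \pi^2 \sigma_1^2 + (1 - \pi)^2 \sigma_2^2.$$

The extra $\pi(1-\pi)(\mu_1 - \mu_2)^2$ term is the between-component spread, which is also what lets the mixture density be bimodal while $W_i$'s never is.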
To perform Bayesian inference on the five parameters in the model, $\theta = (\pi, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2)$, we must specify priors and then compute the posterior distribution. For priors we will use conditionally conjugate specifications, allowing us to use things we’ve already learned about normal models, inverse-gamma priors, beta-binomial models, etc.
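Concretely, a conditionally conjugate specification might look like the following (the hyperparameter symbols $a, b, m, v, \alpha, \beta$ are placeholders of my choosing, not necessarily the values used in class):

$$\pi \sim \text{Beta}(a, b), \qquad \mu_k \sim \mathcal{N}(m, v), \qquad \sigma_k^2 \sim \text{Inv-Gamma}(\alpha, \beta), \qquad k = 1, 2,$$

with the parameters independent a priori. Each prior is conjugate to its piece of the likelihood once the other parameters (and the latent indicators) are held fixed, which is exactly what a Gibbs sampler needs.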
The posterior distribution will not be in closed form, so we will resort (gladly) to posterior sampling using a Gibbs sampler. Recall that a Gibbs sampler cycles through the parameters, one at a time, and draws each from its “full conditional” — the posterior distribution of that parameter given all the other parameters (fixed at the values of their most recent draws from the Markov chain).
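The post's sampler is an R script; purely as an illustration of the cycle, here is a Python sketch under a Beta prior on the weight, normal priors on the means, and inverse-gamma priors on the variances. All hyperparameter defaults are illustrative choices of mine, and the initialization is deliberately crude — a sketch, not the post's exact implementation:

```python
import numpy as np
from scipy.stats import norm

def gibbs_mixture(y, n_iter=2000, a=1.0, b=1.0, m=0.0, v=100.0,
                  alpha=2.0, beta=1.0, seed=0):
    """Gibbs sampler for a two-component Gaussian mixture, assuming
    pi ~ Beta(a, b), mu_k ~ N(m, v), sigma_k^2 ~ Inv-Gamma(alpha, beta).
    Returns an (n_iter, 5) array of draws (pi, mu1, mu2, s1, s2)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = len(y)

    def draw_mu(ys, s):
        # Conjugate normal update for a component mean, given its variance s.
        prec = 1.0 / v + len(ys) / s
        return rng.normal((m / v + ys.sum() / s) / prec, np.sqrt(1.0 / prec))

    def draw_s(ys, mu):
        # Conjugate inverse-gamma update for a variance, drawn as 1 / Gamma.
        shape = alpha + 0.5 * len(ys)
        rate = beta + 0.5 * ((ys - mu) ** 2).sum()
        return 1.0 / rng.gamma(shape, 1.0 / rate)

    # Crude initialization: means at the data extremes, one big shared variance.
    pi, mu1, mu2 = 0.5, y.min(), y.max()
    s1 = s2 = y.var()
    draws = np.empty((n_iter, 5))
    for t in range(n_iter):
        # 1) Latent indicators from their full conditional (Bayes rule).
        p1 = pi * norm.pdf(y, mu1, np.sqrt(s1))
        p2 = (1.0 - pi) * norm.pdf(y, mu2, np.sqrt(s2))
        delta = rng.random(n) < p1 / (p1 + p2)
        y1, y2 = y[delta], y[~delta]
        # 2) Mixture weight: beta-binomial update given the indicators.
        pi = rng.beta(a + len(y1), b + len(y2))
        # 3) Component means, then 4) component variances.
        mu1, mu2 = draw_mu(y1, s1), draw_mu(y2, s2)
        s1, s2 = draw_s(y1, mu1), draw_s(y2, mu2)
        draws[t] = pi, mu1, mu2, s1, s2
    return draws
```

With well-separated components this chain settles quickly; with overlapping components one would also have to worry about label switching, which I'm ignoring here.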
The key to this strategy will be to actually sample the latent indicator variables $\Delta = (\Delta_1, \ldots, \Delta_n)$. With the $\Delta$ vector in hand, all of the other updates are from standard exponential family models with known closed forms (available in convenient form on my favorite web site). Similarly, the update(s) for $\Delta_i$ arise as a straightforward application of Bayes rule. The probability that observation $Y_i$ arose from component number one (with parameters $\mu_1, \sigma_1^2$) is just:

$$P(\Delta_i = 1 \mid Y_i, \theta) = \frac{\pi \, \mathcal{N}(Y_i \mid \mu_1, \sigma_1^2)}{\pi \, \mathcal{N}(Y_i \mid \mu_1, \sigma_1^2) + (1 - \pi) \, \mathcal{N}(Y_i \mid \mu_2, \sigma_2^2)}.$$

Observe that in this expression $\Delta_i$ is conditionally independent of all the other $\Delta_j$, $j \neq i$, given the component parameters.
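In code this probability is essentially a one-liner; computing it on the log scale guards against underflow for observations far out in both tails. This is a numerical nicety of my own, sketched in Python rather than taken from the post's R script:

```python
import numpy as np
from scipy.stats import norm

def prob_component_one(y, pi, mu1, s1, mu2, s2):
    """P(Delta_i = 1 | y_i, theta) for each y_i, computed on the log scale
    so the ratio stays stable even when both weighted densities are tiny
    (s1 and s2 are variances)."""
    log_w1 = np.log(pi) + norm.logpdf(y, mu1, np.sqrt(s1))
    log_w2 = np.log1p(-pi) + norm.logpdf(y, mu2, np.sqrt(s2))
    # p1 / (p1 + p2) rewritten as 1 / (1 + exp(log_w2 - log_w1))
    return 1.0 / (1.0 + np.exp(log_w2 - log_w1))
```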
See this R script for a simple implementation of the Gibbs sampler, where you can find the finer details. (As always, let me know by email if you spot typos.)
