Gaussian factor models

By a Gaussian factor model, I refer to the following specification:

X_i = \mathbf{B}f_i + \epsilon_i; \epsilon_i \sim N(0, \Psi); f_i \sim N(0, I_k).

Each observation X_i is a p-dimensional column vector, \mathbf{B} is a p-by-k real-valued matrix of “factor loadings”, and the “factor scores” f_i are k-dimensional column vectors. The “idiosyncratic” errors \epsilon_i are p-dimensional vectors of independent noise (i.e., \Psi is diagonal).

Factor models are, in my view, a method for doing covariance estimation in which the covariance matrix is constrained to have the form

\mathbf{B}\mathbf{B}^\top + \Psi.
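To make the decomposition concrete, here is a quick simulation check (in Python rather than R, with dimensions I picked arbitrarily for illustration): draws from the model above should have a sample covariance that approaches \mathbf{B}\mathbf{B}^\top + \Psi as n grows.

```python
import numpy as np

rng = np.random.default_rng(0)
p, k, n = 6, 2, 200_000

# Arbitrary loadings and diagonal idiosyncratic variances
B = rng.normal(size=(p, k))
psi = rng.uniform(0.5, 1.5, size=p)
Psi = np.diag(psi)

# Simulate X_i = B f_i + eps_i with f_i ~ N(0, I_k), eps_i ~ N(0, Psi)
F = rng.normal(size=(n, k))
eps = rng.normal(size=(n, p)) * np.sqrt(psi)
X = F @ B.T + eps

implied = B @ B.T + Psi
empirical = np.cov(X, rowvar=False)
max_err = np.max(np.abs(empirical - implied))
print(max_err)  # small; shrinks at the usual 1/sqrt(n) rate
```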

See my MathOverflow digression on the topic of how factor models compare to principle component analysis (PCA).

Computationally, this decomposition has the effect of recasting the covariance estimation problem as a series of latent linear regressions, meaning that, procedurally, we can estimate these models using the same essential updates as linear regression. My go-to reference for the sampling steps in a Gibbs sampler for a Gaussian factor model is Carlos Carvalho’s dissertation, chapter 6.
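As a sketch of what those updates look like (in Python rather than R; the function name and prior hyperparameters below are my own illustrative choices, not Carvalho’s), each Gibbs pass is three conjugate draws: factor scores given loadings, loadings row by row as latent linear regressions, and the idiosyncratic variances.

```python
import numpy as np

def gibbs_factor_model(X, k, n_iter=500, a0=2.0, b0=1.0, c0=10.0, seed=0):
    """Sketch of a Gibbs sampler for X_i = B f_i + eps_i, eps_i ~ N(0, Psi),
    f_i ~ N(0, I_k), with N(0, c0) priors on each loading and Gamma(a0, b0)
    priors on the precisions 1/psi_j. Returns the final draw of (B, psi)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    B = rng.normal(scale=0.1, size=(p, k))
    psi = np.ones(p)

    for _ in range(n_iter):
        # 1. Factor scores: f_i | B, Psi, x_i ~ N(V B' Psi^{-1} x_i, V)
        V = np.linalg.inv(np.eye(k) + B.T @ (B / psi[:, None]))
        mean_F = X @ (B / psi[:, None]) @ V
        F = mean_F + rng.normal(size=(n, k)) @ np.linalg.cholesky(V).T

        # 2. Loadings, one row at a time: a conjugate linear regression
        #    of column x_j on the current factor scores F.
        FtF = F.T @ F
        for j in range(p):
            C = np.linalg.inv(FtF / psi[j] + np.eye(k) / c0)
            m = C @ (F.T @ X[:, j]) / psi[j]
            B[j] = m + np.linalg.cholesky(C) @ rng.normal(size=k)

        # 3. Idiosyncratic variances: conjugate gamma update on 1/psi_j
        resid = X - F @ B.T
        shape = a0 + n / 2.0
        rate = b0 + 0.5 * (resid ** 2).sum(axis=0)
        psi = 1.0 / rng.gamma(shape, 1.0 / rate)

    return B, psi
```

One caveat worth keeping in mind: the loadings are only identified up to rotation (and sign), so individual draws of \mathbf{B} should not be interpreted directly without some post-hoc rotation or a constraint such as a lower-triangular loadings matrix.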

Here is an R script for fitting a Gaussian latent factor model using conditionally conjugate priors.