Confidence Intervals

3 replies [Last post]
a9mike's picture
Joined: 06/28/2013

I'm not sure the best way to get confidence intervals for my estimates. I'm doing a bifactor model with a large dataset (HRS) with lots of missing data, and the mxCI are taking impossibly long (days long).

Anyone have suggestions? Would bootstrapping be faster? If so, what would that script look like? (I've never bootstrapped before)

refined CSS Model 2.R4.8 KB
neale's picture
Joined: 07/31/2009
Check identification

I assume you have not been able to obtain conventional standard errors? If they look reasonable, they can be used to derive confidence intervals. If these are not available, then take a look at identification. Although the model is likely identified in principle, it is possible that it suffers empirical under-identification, because two factors are essentially identical in structure. A signature of this would be paths estimated at or close to zero. Possibly, fixing one or two of them to zero would clear up the standard error calculation, and alleviate the need to estimate likelihood-based or bootstrap CIs

With factor models, there is a risk that factor loadings from a factor could all flip sign, and provide identical fit. This type of under-identification could mess up all three types of confidence interval estimation, but I think you are immune to that potential problem here because you have fixed a factor loading to 1 and estimated factor variances instead.

bwiernik's picture
Joined: 01/30/2014
Definitely bootstrap

Bifactor models typically have issues with Likelihood based confidence intervals, especially if one of your group factors shouldn't really be there (if it's entirely defined by the specific variance of one or two items because the items that load on the group factor really only load onto the general factor).

You're right that you should use bootstrapping. If you're not familiar with bootstrapping, the idea is that you create a sampling distribution for a statistic by repeatedly drawing samples with replacement from your data and then computing the statistic for each redrawn sample.

You have two major choices when it comes to bootstrapping. If you have raw data, you can use empirical bootstrapping, where you draw samples from your actual data and run models on them. As an alternative, if you are willing to accept distributional assumptions, you can combine bootstrapping with Monte Carlo simulation and generate simulated samples based on the expected or observed covariance matrix from your model. This is your only option if you don't have raw data, and it has also received quite a bit of support in simulation studies.

Here is a set of functions to generate bootstrapped confidence intervals for your model.

First, this is a helper function to extract the expected covariance matrix from an mxModel:

umxCov <- function(model,latent=TRUE, manifest=TRUE){
  mA <- mxEval(A,model)
  mS <- mxEval(S,model)
  mI <- diag(1, nrow(mA))
  mE <- solve(mI - mA)
  mCov <- (mE) %*% mS %*% t(mE) # The model-implied covariance matrix
  mV <- NULL
  if(latent) mV <- model@latentVars 
  if(manifest) mV <- c(mV,model@manifestVars)
  return(mCov[mV, mV]) # return only the selected variables

Second, here is the actual bootstrap function. This function has these parameters:
- model is an optimized mxModel
- samp is the raw data matrix used to estimate model
- type is the kind of bootstrap you want to run. "par.expected" and "par.observed" use parametric Monte Carlo bootstrapping based on your expected and observed covariance matrices, respective. "empirical" uses empirical bootstrapping based on samp.
- std specifies whether you want CIs for unstandardized or standardized parameters
- rep is the number of bootstrap samples to compute.
- conf is the confidence value
- dat specifies whether you want to store the bootstrapped data in the output (useful if you want to do multiple different analyses, such as mediation analyses)

umxCIpboot = function(model, samp=NULL, type=c("par.expected", "par.observed", "empirical"), std=TRUE, rep=1000, conf=95, dat=FALSE) {
  if(type=="par.expected") exp = umxCov(model,latent=FALSE)
  if(type=="par.observed") {
    if(model$data@type=="raw") {
      exp = var(mxEval(data,model))
    } else { if(model$data@type=="sscp") {
                        exp = mxEval(data,model)/(model$data@numObs-1)
                      } else {
                          exp = mxEval(data,model)
  N = round(model@data@numObs)
  pard = t(data.frame("mod"=summary(model)$parameters[,5+2*std],row.names=summary(model)$parameters[,1]))
  pb = txtProgressBar(min=0,max=rep,label="Computing confidence intervals",style=3)
  if(type=="empirical") {
    if(length(samp)==0) {
      if(model$data@type=="raw") samp = mxEval(data,model) else stop("No raw data supplied for empirical bootstrap.")
    for(i in 1:rep){
      bsample.i =,size=N,replace=TRUE)
      bsample = var(samp[bsample.i,])
      mod = mxRun(mxModel(model,mxData(observed=bsample,type="cov",numObs=N)),silent=TRUE)
      pard = rbind(pard,summary(mod)$parameters[,5+2*std])
  else {
    for(i in 1:rep){
      bsample = var(mvrnorm(N,rep(0,nrow(exp)),exp))
      mod = mxRun(mxModel(model,mxData(observed=bsample,type="cov",numObs=N)),silent=TRUE)
      pard = rbind(pard,summary(mod)$parameters[,5+2*std])
  LL=apply(pard,2,FUN=quantile,probs=low) #lower limit of confidence interval
  UL=apply(pard,2,FUN=quantile,probs=upp) #upper quantile for confidence interval
  ci = cbind(LL4, UL4)
  colnames(ci) = c(paste((low*100),"%",sep=""),paste((upp*100),"%",sep=""))
  p   = summary(model)$parameters[,c(1,2,3,4,c(5:6+2*std))]
  if(dat) return(list("Type"=type,"bootdat"=data.frame(pard),"CI"=cbind(p,ci)))

umxBootstrapping.R 2.5 KB
tbates's picture
Joined: 07/31/2009
added to library(umx)

That's a nice pair of functions!

I've added them to the umx library with some modifications and crediting this conversation and your name. Comments on all umx package functions welcome.

?umxCI_boot #shows usage example