Confidence Intervals

6 replies [Last post]
a9mike's picture
Offline
Joined: 06/28/2013

I'm not sure the best way to get confidence intervals for my estimates. I'm doing a bifactor model with a large dataset (HRS) with lots of missing data, and the mxCI are taking impossibly long (days long).

Anyone have suggestions? Would bootstrapping be faster? If so, what would that script look like? (I've never bootstrapped before)

AttachmentSize
refined CSS Model 2.R4.8 KB
neale's picture
Offline
Joined: 07/31/2009
Check identification

I assume you have not been able to obtain conventional standard errors? If they look reasonable, they can be used to derive confidence intervals. If these are not available, then take a look at identification. Although the model is likely identified in principle, it is possible that it suffers empirical under-identification, because two factors are essentially identical in structure. A signature of this would be paths estimated at or close to zero. Possibly, fixing one or two of them to zero would clear up the standard error calculation, and alleviate the need to estimate likelihood-based or bootstrap CIs

With factor models, there is a risk that factor loadings from a factor could all flip sign, and provide identical fit. This type of under-identification could mess up all three types of confidence interval estimation, but I think you are immune to that potential problem here because you have fixed a factor loading to 1 and estimated factor variances instead.

bwiernik's picture
Offline
Joined: 01/30/2014
Definitely bootstrap

Bifactor models typically have issues with Likelihood based confidence intervals, especially if one of your group factors shouldn't really be there (if it's entirely defined by the specific variance of one or two items because the items that load on the group factor really only load onto the general factor).

You're right that you should use bootstrapping. If you're not familiar with bootstrapping, the idea is that you create a sampling distribution for a statistic by repeatedly drawing samples with replacement from your data and then computing the statistic for each redrawn sample.

You have two major choices when it comes to bootstrapping. If you have raw data, you can use empirical bootstrapping, where you draw samples from your actual data and run models on them. As an alternative, if you are willing to accept distributional assumptions, you can combine bootstrapping with Monte Carlo simulation and generate simulated samples based on the expected or observed covariance matrix from your model. This is your only option if you don't have raw data, and it has also received quite a bit of support in simulation studies.

Here is a set of functions to generate bootstrapped confidence intervals for your model.

First, this is a helper function to extract the expected covariance matrix from an mxModel:

umxCov <- function(model,latent=TRUE, manifest=TRUE){
  mA <- mxEval(A,model)
  mS <- mxEval(S,model)
  mI <- diag(1, nrow(mA))
  mE <- solve(mI - mA)
  mCov <- (mE) %*% mS %*% t(mE) # The model-implied covariance matrix
  mV <- NULL
  if(latent) mV <- model@latentVars 
  if(manifest) mV <- c(mV,model@manifestVars)
  return(mCov[mV, mV]) # return only the selected variables
}

Second, here is the actual bootstrap function. This function has these parameters:
- model is an optimized mxModel
- samp is the raw data matrix used to estimate model
- type is the kind of bootstrap you want to run. "par.expected" and "par.observed" use parametric Monte Carlo bootstrapping based on your expected and observed covariance matrices, respective. "empirical" uses empirical bootstrapping based on samp.
- std specifies whether you want CIs for unstandardized or standardized parameters
- rep is the number of bootstrap samples to compute.
- conf is the confidence value
- dat specifies whether you want to store the bootstrapped data in the output (useful if you want to do multiple different analyses, such as mediation analyses)

umxCIpboot = function(model, samp=NULL, type=c("par.expected", "par.observed", "empirical"), std=TRUE, rep=1000, conf=95, dat=FALSE) {
  if(type=="par.expected") exp = umxCov(model,latent=FALSE)
  if(type=="par.observed") {
    if(model$data@type=="raw") {
      exp = var(mxEval(data,model))
    } else { if(model$data@type=="sscp") {
                        exp = mxEval(data,model)/(model$data@numObs-1)
                      } else {
                          exp = mxEval(data,model)
                        }
               }
      }  
  }
  N = round(model@data@numObs)
  pard = t(data.frame("mod"=summary(model)$parameters[,5+2*std],row.names=summary(model)$parameters[,1]))
  pb = txtProgressBar(min=0,max=rep,label="Computing confidence intervals",style=3)
  #####
  if(type=="empirical") {
    if(length(samp)==0) {
      if(model$data@type=="raw") samp = mxEval(data,model) else stop("No raw data supplied for empirical bootstrap.")
    }
    for(i in 1:rep){
      bsample.i = sample.int(N,size=N,replace=TRUE)
      bsample = var(samp[bsample.i,])
      mod = mxRun(mxModel(model,mxData(observed=bsample,type="cov",numObs=N)),silent=TRUE)
      pard = rbind(pard,summary(mod)$parameters[,5+2*std])
      rownames(pard)[nrow(pard)]=i
      setTxtProgressBar(pb,i)
    }
  }
  else {
    for(i in 1:rep){
      bsample = var(mvrnorm(N,rep(0,nrow(exp)),exp))
      mod = mxRun(mxModel(model,mxData(observed=bsample,type="cov",numObs=N)),silent=TRUE)
      pard = rbind(pard,summary(mod)$parameters[,5+2*std])
      rownames(pard)[nrow(pard)]=i
      setTxtProgressBar(pb,i)
    }
  }
  low=(1-conf/100)/2
  upp=((1-conf/100)/2)+(conf/100)
  LL=apply(pard,2,FUN=quantile,probs=low) #lower limit of confidence interval
  UL=apply(pard,2,FUN=quantile,probs=upp) #upper quantile for confidence interval
  LL4=round(LL,4)
  UL4=round(UL,4)
  ci = cbind(LL4, UL4)
  colnames(ci) = c(paste((low*100),"%",sep=""),paste((upp*100),"%",sep=""))
  p   = summary(model)$parameters[,c(1,2,3,4,c(5:6+2*std))]
  if(dat) return(list("Type"=type,"bootdat"=data.frame(pard),"CI"=cbind(p,ci)))
  return(list("CI"=cbind(p,ci)))
}

AttachmentSize
umxBootstrapping.R 2.5 KB
tbates's picture
Offline
Joined: 07/31/2009
added to library(umx)

That's a nice pair of functions!

I've added them to the umx library with some modifications and crediting this conversation and your name. Comments on all umx package functions welcome.

library(devtools)
install_github("tbates/umx")
library(umx)
?umxCI_boot #shows usage example
?umxGetExpectedCov

martonandko's picture
Offline
Joined: 02/19/2015
Trouble getting CI for twin ACE model

I'm trying to get bootstrapped CI for the estimated variables (A, C, E) using Neale's posted twin ACE script (http://openmx.psyc.virginia.edu/sites/default/files/UnivariateTwinAnalys...) using the umxCI_boot function, but I get the following warning message:

umxCI_boot(model = twinACEModel, type = "par.observed")
Error in umxCI_boot(model = twinACEModel, type = "par.observed") :
trying to get slot "type" from an object of a basic class ("NULL") with no slots

What am I doing wrong? What is the difference between the type = "par.observed" and type = "par.expected" options?

mhunter's picture
Offline
Joined: 07/31/2009
Update

Make sure you have the most recent versions of umx and OpenMx.

source("http://openmx.psyc.virginia.edu/getOpenMx.R")
library(OpenMx)
 
library(devtools)
install_github("tbates/umx")
library(umx)

martonandko's picture
Offline
Joined: 02/19/2015
Same problem

Thank you for your quick reply on both of my questions! Unfortunately I still get the same error message as before after updating OpenMx and umx. I'm attaching the model and the fitted object(just changed the file extension to .txt, so I could upload it), could someone have a look into it, what might be the problem?

AttachmentSize
Data_bootCI.txt 9.45 KB