Confidence intervals (granted)

31 replies [Last post]
tbates's picture
Offline
Joined: 07/31/2009

Confidence intervals are basic to most results sections now, so implementing Mx's 'Intervals' likelihood-based confidence intervals (Neale & Miller, 1997) (plus standard error-based CIs for speed) would be high on the wish-list

maybe something like

mxInterval(..., percent=95, type="ML") where ... consists of matrix addresses

mxInterval(z.A) would mean all cells of A
mxInterval(z.A[1:2,], type="SE") first two rows of A, standard-error based method

pgseye's picture
Offline
Joined: 10/13/2009
Can I check that I understand

Can I check that I understand this correctly and that CI's are possible (although fiddly) in the multivariate case?

Based on the code Mike wrote above, If I have 4 variables and a best fit AE model, then by specifying A for each variable (a11*a11, a22*a22, etc), I should be able to obtain corresponding CI's, as in;

A <- mxEval(a11*a11, multiCholAEFit)

CIaupper<-mxModel(multiCholAEFit,mxMatrix("Full",values=mxEval(objective, multiCholAEFit),name="oldfit"),mxAlgebra(((oldfit+3.84)-(MZ.objective+DZ.objective+Sib.objective))^2-A,name="upperCIa"),mxAlgebraObjective("upperCIa"))

CIalower<-mxModel(multiCholAEFit,mxMatrix("Full",values=mxEval(objective, multiCholAEFit),name="oldfit"),mxAlgebra(((oldfit+3.84)-(MZ.objective+DZ.objective+Sib.objective))^2+A,name="lowerCIa"),mxAlgebraObjective("lowerCIa"))

runCIalower<-mxRun(CIalower)

runCIalower@algebras$A

runCIaupper<-mxRun(CIaupper)

runCIaupper@algebras$A

If this is right, like Cathy and Marcel's recent post, I keep running into code reds (even when re-running an optimised fit).

neale's picture
Offline
Joined: 07/31/2009
I wouldn't use mxEval() for

I wouldn't use mxEval() for this purpose. Rather I would create a matrix within the multiCholAEFit using an mxAlgebra(). I don't know if there's a problem with mxEval() but I usually think of that as a sort of after-the-fact computation, and I'm not sure if it gets calculated during optimization. If it does, since it will parse any R expression, I would expect it to be slower than using mxAlgebra. If it does not, then I would expect code reds and no change to the value in matrix A.

Also, please be aware that a better interface for requesting & estimating confidence intervals is due to be released shortly.

mspiegel's picture
Offline
Joined: 07/31/2009
That's correct. mxEval()

That's correct. mxEval() evaluates its expression using the current state of the model, and cannot be updated during a run of a model. Use mxAlgebra() inside a model to create an expression that will be re-calculated as the optimizer traverses the search space.

And yes, a confidence interval interface will be released in OpenMx 1.0.

trinewa's picture
Offline
Joined: 11/30/2009
How did you end up solving

How did you end up solving the question of univariate and multivariate confidence intervals? Any link to example scripts which provide CI's in the multivariate case?

Ryne's picture
Offline
Joined: 07/31/2009
If I'm reading Tim's

If I'm reading Tim's suggestion correctly, this works more like mxEval than an mxModel slot. I'm with Steve that we shouldn't add more to mxModel, but would a function that took either (a) parameters and standard errors, or (b) an @output slot work? Is it possible that (a) is already out there in the R community?

tbates's picture
Offline
Joined: 07/31/2009
Hi Ryne, I thought CIs will

Hi Ryne,

I thought CIs will have to be part of mxModel, as asking for confidence intervals causes the optimiser to fit the model, then move each (requested) parameter estimate down (and up), holding the others constant, until fit declines by some number of units related to the confidence requested. But perhaps it could be handled in a separate mxEval-like mxCI() function, which would take the model and current parameters as input?

Steve's picture
Offline
Joined: 07/30/2009
Right you are. It's going to

Right you are. It's going to need to be handed to the optimizer. The question is how should we do that?

tbates's picture
Offline
Joined: 07/31/2009
I would say that if a model

I would say that if a model contains an mxCI statement, then a flag is set and when mxRun is finished running the model (i.e., when it would have returned), that flag is checked, and if true, any mxCI statements are executed by forming all the CIs requested into an array, then looping over them: setting the cell to fixed, and then sweeping the value up and down until the fit degrades to the request chi2 change.

The code for choosing new values and homing in on the point where the fit degrades below the desired CI must be in old mx for something to work off.

neale's picture
Offline
Joined: 07/31/2009
Yes, the Mx1 code is there.

Yes, the Mx1 code is there. What it does is to change the fit function so that there is a constraint that the -2lnL is 3.84 (in the case of 95%) higher than it was when the model was initially fitted, and either + or - the parameter in question. That is, we maximize or minimize the parameter in question, subject to the constraint that the -2lnL is worse by some predetermined amount. All parameters are free as in the original model.

Possibly, these revised fit functions won't be too bad to implement with an Algebra type fit function.

For example, using the UnivariateTwinAnalysis_MatrixRaw.R script from the demo directory, we can obtain the likelihood-based CI's as follows:

CIaupper<-mxModel(twinACEFit,mxMatrix("Full",values=mxEval(objective,twinACEFit),name="oldfit"),mxAlgebra(((oldfit+3.84)-(MZ.objective+DZ.objective))^2-A,name="upperCIa"),mxAlgebraObjective("upperCIa"))

CIalower<-mxModel(twinACEFit,mxMatrix("Full",values=mxEval(objective,twinACEFit),name="oldfit"),mxAlgebra(((oldfit+3.84)-(MZ.objective+DZ.objective))^2+A,name="lowerCIa"),mxAlgebraObjective("lowerCIa"))

then running
runCIalower<-mxRun(CIalower)
runCIalower@algebras$A
runCIaupper<-mxRun(CIaupper)
runCIaupper@algebras$A

which are the lower and upper CI's on the computed matrix element A[1,1]

Obviously, making this an automated function would be much nicer for the user! But it's pretty neat that it can be done without any programing on behalf of the OpenMx team.

tbates's picture
Offline
Joined: 07/31/2009
Thanks mike: those are

Thanks mike: those are helpful user functions!

I think it would still be very well worth while to support this core functionality inside an openMx-package supported mxInterval() function, able to be added to an mxModel, calling itself recursively, or as a parameter of mxRun

t

tbates's picture
Offline
Joined: 07/31/2009
Just a thought too: The

Just a thought too: The univariate example has only a 1*1 matrix to attend too... It would take a lot of each-time programming to scale up to multivariate.

Would be good to have the ability to specify a range of cells to get CIs on.

and to be able to specify CIs on RAM mxPaths by c(name,)

neale's picture
Offline
Joined: 07/31/2009
ML CI's are inherently

ML CI's are inherently computationally intensive, requiring two new optimizations over as many parameters as there are in the model for each interval requested. CI's on a model with 50 parameters (or functions of parameters) would take 100 times as long as the original model. If there are lots of CI's, and raw data are input, then it can be more efficient to use the bootstrap. Here's an example using the dataset from the OpenMx homepage:

require(boot)
mles<-function(dataset,wt){
manifests <- names(dataset)
latents <- c("G")
covwt<-cov.wt(dataset,wt)
mlevals<-mxRun(mxModel("One Factor", type="RAM",
manifestVars = manifests,
latentVars = latents,
mxPath(from=latents, to=manifests),
mxPath(from=manifests, arrows=2),
mxPath(from=latents, arrows=2, free=F, values=1.0),
mxData(covwt$cov, type="cov",
numObs=500)))
return(as.vector(mlevals@output$estimate))}
boot(demoOneFactor,mles,R=1000)

Possibly a wrapper function would facilitate joint use of the boot and OpenMx packages. Note that the mlevals in the above example might be augmented to have the results of algebras included in the return statement.

a9mike's picture
Offline
Joined: 06/28/2013
CIs by bootstrapping

Could you clarify this syntax for me a bit? I'm constructing a bifactor model I need CIs for and mxCI is impossible to run (raw data from HRS, large and lots of missing data). I'm just not sure how to incorporate the boot function.

AttachmentSize
refined CSS Model 2.R 4.68 KB
smedland's picture
Offline
Joined: 08/04/2009
could someone please post a

could someone please post a working CI demo script that includes CIs on both an estimated and a calculated element?
thanks

neale's picture
Offline
Joined: 07/31/2009
Same here. Having svn upped

Same here. Having svn upped this morning, make install, restart R, the CI's are being very coy. OpenMx seems to be doing the work but discarding the CI's from summary and output. See below, non-compliant script attached.

> require(OpenMx)
> data(demoOneFactor)
> factorModel <- mxModel("One Factor",
+ mxMatrix("Full", 5, 1, values=0.2,free=T, name="A"),
+ mxMatrix("Symm", 1, 1, values=1,free=F, name="L"),
+ mxMatrix("Diag", 5, 5, values=1,free=T, name="U"),
+ mxAlgebra(A %*% L %*% t(A) + U, name="R"),
+ mxCI(c('A[1,1]','R[1,1]')),
+ mxMLObjective("R", dimnames = names(demoOneFactor)),
+ mxData(cov(demoOneFactor), type="cov", numObs=500))
> factorModelOuput <- (mxRun(factorModel,intervals=T))
Running One Factor
Warning messages:
1: In model 'One Factor' while computing the lower bound for 'One Factor.A[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
2: In model 'One Factor' while computing the lower bound for 'One Factor.R[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
3: In model 'One Factor' while computing the upper bound for 'One Factor.A[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
4: In model 'One Factor' while computing the upper bound for 'One Factor.R[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)
> factorModelSummary <- summary(factorModelOutput)
> factorModelOutput@output$confidenceIntervals
[,1] [,2]
>

And that's it. Summary is no help either.

AttachmentSize
CI test.R 563 bytes
mspiegel's picture
Offline
Joined: 07/31/2009
Umm, I'm getting confidence

Umm, I'm getting confidence intervals.

confidence intervals:
                     lbound estimates    ubound
One Factor.A[1,1] 0.3679342 0.3971525 0.4290522
One Factor.R[1,1] 0.1758132 0.1985443 0.2253709

R 2.11.0 and OpenMx subversion revision 1297. I'll logon to some other machines and try running it on older versions of R. Works on R 2.9.2 and 2.10.1 also.

neale's picture
Offline
Joined: 07/31/2009
I don't think it's an R

I don't think it's an R version thing - I am at 2.10.1

?OpenMx indicates Package OpenMx version 0.3.3-1264
but I suspect that the version number is only updated with binary releases. It coincides with what I get during the build:
tar: trunk/build/OpenMx_0.3.3-1264.tar: Can't add archive to itself

and I'm pretty sure the building is going according to plan:

Michael-Neales-Macbook-Pro-4% svn up
At revision 1297.
Michael-Neales-Macbook-Pro-4% cd trunk
Michael-Neales-Macbook-Pro-4% sudo make install

I retried this procedure after quitting R, then fired up R again and still no CI's in the summary:

> require(OpenMx)
> data(demoOneFactor)
> factorModel <- mxModel("One Factor",
+ mxMatrix("Full", 5, 1, values=0.2,free=T, name="A"),
+ mxMatrix("Symm", 1, 1, values=1,free=F, name="L"),
+ mxMatrix("Diag", 5, 5, values=1,free=T, name="U"),
+ mxAlgebra(A %*% L %*% t(A) + U, name="R"),
+ mxCI(c('A[1,1]','R[1,1]')),
+ mxMLObjective("R", dimnames = names(demoOneFactor)),
+ mxData(cov(demoOneFactor), type="cov", numObs=500))
> factorModelOuput <- (mxRun(factorModel,intervals=T))
> factorModelSummary <- summary(factorModelOutput)
> factorModelSummary
data:
$`One Factor.data`
$`One Factor.data`$cov
x1 x2 x3 x4 x5
x1 0.1985443 0.1999953 0.2311884 0.2783865 0.3155943
x2 0.1999953 0.2916950 0.2924566 0.3515298 0.4019234
x3 0.2311884 0.2924566 0.3740354 0.4061291 0.4573587
x4 0.2783865 0.3515298 0.4061291 0.5332788 0.5610769
x5 0.3155943 0.4019234 0.4573587 0.5610769 0.6703023

free parameters:
name matrix row col Estimate Std.Error
1 A x1 G 0.39715212 0.015549769
2 A x2 G 0.50366111 0.018232514
3 A x3 G 0.57724141 0.020448402
4 A x4 G 0.70277369 0.024011418
5 A x5 G 0.79624998 0.026669452
6 S x1 x1 0.04081419 0.002812717
7 S x2 x2 0.03801999 0.002805794
8 S x3 x3 0.04082718 0.003152308
9 S x4 x4 0.03938706 0.003408875
10 S x5 x5 0.03628712 0.003678561

observed statistics: 15
estimated parameters: 10
degrees of freedom: 5
-2 log likelihood: -3648.281
saturated -2 log likelihood: -3655.665
number of observations: 500
chi-square: 7.384002
p: 0.1936117
AIC (Mx): -2.615998
BIC (Mx): -11.84452
adjusted BIC:
RMSEA: 0.03088043
timestamp: 2010-07-01 10:24:30
frontend time: 0.08384323 secs
backend time: 0.01688194 secs
independent submodels time: 5.698204e-05 secs
wall clock time: 0.1007822 secs
cpu time: 0.1007822 secs
openmx version number: 0.3.3-1264

Weird, huh?

tbates's picture
Offline
Joined: 07/31/2009
Same errors as Mike. I'm on

Same errors as Mike. I'm on "R version 2.11.0 (2010-04-22)"
svn info tells me: Revision: 1298

> data(demoOneFactor)
> factorModel <- mxModel("One Factor",
+ 	mxMatrix("Full", 5, 1, values=0.2,free=T, name="A"),
+ 	mxMatrix("Symm", 1, 1, values=1,free=F, name="L"),
+ 	mxMatrix("Diag", 5, 5, values=1,free=T, name="U"),
+ 	mxAlgebra(A %*% L %*% t(A) + U, name="R"),
+ 	mxCI(c('A[1,1]','R[1,1]')),
+ 	mxMLObjective("R", dimnames = names(demoOneFactor)),
+ 	mxData(cov(demoOneFactor), type="cov", numObs=500)
+ )
> factorModelOuput <- (mxRun(factorModel,intervals=T))
Running One Factor 
Warning messages:
1: In model 'One Factor' while computing the lower bound for 'One Factor.A[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED) 
2: In model 'One Factor' while computing the lower bound for 'One Factor.R[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED) 
3: In model 'One Factor' while computing the upper bound for 'One Factor.A[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED) 
4: In model 'One Factor' while computing the upper bound for 'One Factor.R[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED) 
> factorModelSummary <- summary(factorModelOutput)
Error: object 'factorModelOutput' not found
Error in summary(factorModelOutput) : 
  error in evaluating the argument 'object' in selecting a method for function 'summary'
> factorModelOutput@output$confidenceIntervals
Error: object 'factorModelOutput' not found

neale's picture
Offline
Joined: 07/31/2009
Er, those seem like different

Er, those seem like different errors. At least I get factorModelOutput to exist. It just lacks the confidence intervals bit. Trying nuclear option:
rm(list = ls())
DOES help.

Now I get the errors reported by Tim! But I am able to see the CI's if I don't go via the summary object. So basically the CI's are causing the creation of a summary object to fail, and my earlier confusion was caused by my inspection of a historical summary object of the same name (created by an earlier version of OpenMx which didn't have this bug).

tbates's picture
Offline
Joined: 07/31/2009
Yes, having something fail

Yes, having something fail and leave you with the _last_ result in place of the new one you requested is such a trap I've wondered if mxRun shouldn't output a simple error string when it fails to at least stamp on the old output.

mspiegel's picture
Offline
Joined: 07/31/2009
Copy/pasting Mike's input

Copy/pasting Mike's input yields an error because of the typo:factorModelOuput <- (mxRun(factorModel,intervals=T)). The "t" is missing in Output.

neale's picture
Offline
Joined: 07/31/2009
Oops! Sorry. Much ado about

Oops! Sorry. Much ado about nothing. However, glad to have it resolved and to be able to attach a working script as a correct response to the inquiry from Sarah for a script with CI's on both free parameter matrix elements and mxAlgebra() computed elements.

AttachmentSize
CI test.R 584 bytes
neale's picture
Offline
Joined: 07/31/2009
Agreed - syntax of the type

Agreed - syntax of the type you suggest would be a boon to the user. Any takers for writing wrapper functions like this?

Second, on the issue of computation time, you are right, likelihood-based CI's are expensive. They can be done in parallel though, so perhaps this is another area where Drs. Wilde & Kenny may be able to help. With a 48 core laptop on the horizon, it's tempting indeed to make this example run locally in parallel.

svrieze's picture
Offline
Joined: 01/03/2010
With the zero influence I

With the zero influence I have, I second the notion of CIs in parallel. I'm waiting for hours and hours to get CIs on a common pathway ACE model. A call to snowfall and a host of independent models to calculate CIs would be absolutely fabulous!

-Scott

mspiegel's picture
Offline
Joined: 07/31/2009
Ah yes the model naming

Ah yes the model naming problem is complicated when independent submodels are used. Here's a function that will take a model and all its submodels and add a prefix to their model names:

modelNameAddPrefix <- function(model, prefix) {
   if (missing(model) || !is(model, "MxModel")) {
      stop("first argument must be a MxModel object")
   }
   if (missing(prefix) || length(prefix) != 1 || !is.character(prefix)) {
      stop("second argument must be a character string")
   }
   model <- modelNameAddPrefixHelper(model, model, prefix) 
   return(model)
}
 
modelNameAddPrefixHelper <- function(job, model, prefix) {
   job <- mxRename(job, paste(prefix, model@name, sep = ''), model@name)
   if (length(model@submodels) > 0) {
      for(i in 1:length(model@submodels))  {
          job <- modelNameAddPrefixHelper(job, model@submodels[[i]], prefix)
      }
   }
   return(job)
}

So when I tested it using foo1 <- modelNameAddPrefix(twinACEModel, "foo1_") the resulting model is named 'foo1_twinACE' and has three submodels 'foo1_common', 'foo1_MZ', 'foo1_DZ'. It's not exactly user-friendly at the moment, but it will help in running the confidence intervals.

Oh, plus I found a bug in the function mxRename(). It will currently not rename the appropriate confidence intervals, because confidence intervals had not been implemented when mxRename() was written. But you should be specifying the confidence intervals manually after creating all the submodels for this particular workflow.

UPDATE: Oops, I forgot that child models could refer to entities in the parent models. I had to change the recursion step to apply to the entire nested model. The code has been updated.

mspiegel's picture
Offline
Joined: 07/31/2009
Ah, that would be useful. In

Ah, that would be useful. In the meantime, you could manually setup independent submodels in order to take advantage of snowfall. Be sure to place only a single confidence interval specification in each submodel. The workflow would be something like:

model <- mxModel(....)
modelOut <- mxRun(model)
library(snowfall)
sfInit(parallel = TRUE, cpus = ???)
sfLibrary(OpenMx)
intervals1 <- mxModel(modelOut, mxCI('a'), name = "intervals1", independent = TRUE)
intervals2 <- mxModel(modelOut, mxCI('b'), name = "intervals2", independent = TRUE)
intervals3 <- mxModel(modelOut, mxCI('c'), name = "intervals3", independent = TRUE)
intervals4 <- mxModel(modelOut, mxCI('d'), name = "intervals4", independent = TRUE)
intervals5 <- mxModel(modelOut, mxCI('e'), name = "intervals5", independent = TRUE)
topModel <- mxModel('topModel', intervals1, intervals2, intervals3, intervals4, intervals5)
results <- mxRun(topModel, intervals = TRUE)
sfStop()

UPDATE: I forgot to give the submodels new names. The name collision would have resulted in only one submodel stored as a child of the topModel.

svrieze's picture
Offline
Joined: 01/03/2010
That's a great idea, thanks!

That's a great idea, thanks! Given that you replied to my post about 3 minutes after I posted it, I was able to try this last night. Unfortunately I ran into the renaming problem.

The renaming of submodels is a problem, to be sure. My ACE common pathway model has several submodels ('ACE', 'DZ','MZ'), and renaming them for each confidence interval is time-consuming. It seems some kind of built-in parallelizing confidence interval function would save a lot of people a lot of time. As others are saying, reporting confidence intervals is (and should) becoming routine. Each submodel could be automatically renamed to make it easier on the user (e.g., for the first confidence interval requested it could have 'ACE1', 'DZ1', 'MZ1', and so on).

Just thinking out loud. THanks for an extremely useful product thusfar!

-Scott

Matifou's picture
Offline
Joined: 12/08/2009
Hey guys, I just discovered

Hey guys, I just discovered now OpenMx so should rather shut up:-)

But is it a goal in OpenMx to keep names that probably come from MX syntax like mxInterval?

If no, you could maybe consider R generic functions intuitively known to users such as confint()?

Steve's picture
Offline
Joined: 07/30/2009
From the standpoint of the

From the standpoint of the mxMatrix, we would now need two other matrices added to the mxMatrix's already lengthy list of matrices, an upper CI and lower CI matrix in the most general case. On top of that, we'd need a slot for the CI type. This needs some discussion in the team meetings.

We also would need to provide pathic versions of the interface.

tbates's picture
Offline
Joined: 07/31/2009
I guess the CI matrices would

I guess the CI matrices would be added only when CIs are requested.

In that case, instead of a type slot, might be best allow both sets of CI matrices to be added: that way people could say

mxCI(z.A, "SE");
mxCI(z.A, "ML");

and get both kinds to compare etc.

neale's picture
Offline
Joined: 07/31/2009
Agreed, and I like the syntax

Agreed, and I like the syntax you suggest. We might make SE the default since it is faster and when statistical power is sufficient it agrees quite well with the likelihood based approach.