Tue, 02/16/2010 - 04:54

Dear OpenMx community,

I'm rather new to the field of SEM, but I am very happy about OpenMx - the learning curve really was steep (... in the sense of quick successes). Congratulations!

In some posts there are already hints about an "mxCompare" function. I tried to implement a likelihood ratio / chi-square test myself:

The function does some minor error checking (e.g., whether all compared models have the same saturated likelihood, which should be an indicator that all models are nested). The function orders models by their dfs and compares them row by row (i.e., each model is compared to its direct predecessor, comparable to the anova approach).

LRtest <- function(..., digits = 3) {
  library(plyr)
  model_list <- list(...)
  if (length(model_list) < 2)
    stop("Error: At least two mx-models have to be provided!")
  res <- ldply(model_list, function(x) {
    summ <- summary(x)
    data.frame(model  = x@name,
               AIC.Mx = summ$AIC.Mx,
               BIC.Mx = summ$BIC.Mx,
               RMSEA  = summ$RMSEA,
               satL   = summ$SaturatedLikelihood,
               m2LL   = summ$Minus2LogLikelihood,
               chisq  = summ$Chi,
               df     = summ$degreesOfFreedom)
  })
  res <- res[order(res$df, decreasing = TRUE), ]
  res$delta.chi <- NA
  res$delta.df  <- NA
  res$p.value   <- NA
  # Check whether the saturated likelihood is the same in all provided
  # models (which should be the case if all are nested)
  if (any(res$satL != res$satL[1]))
    stop("Error: Not all models are nested!")
  for (m in 2:nrow(res)) {
    res$delta.df[m] <- res$df[m - 1] - res$df[m]
    if (res$delta.df[m] > 0) {
      res$delta.chi[m] <- res$chisq[m - 1] - res$chisq[m]
      res$p.value[m]   <- pchisq(res$delta.chi[m], res$delta.df[m],
                                 lower.tail = FALSE)
    }
  }
  res[, -1] <- round(res[, -1], digits)
  return(res)
}

A sample output would look like:

  model  AIC.Mx  BIC.Mx RMSEA     satL     m2LL   chisq df delta.chi delta.df p.value
1    m0 196.378  31.925 0.165 1136.021 1410.398 274.378 39        NA       NA      NA
2    m1 182.128  33.295 0.170 1136.021 1386.149 250.128 34    24.250        5   0.000
3    m2  64.193 -25.672 0.114 1136.021 1268.213 132.193 34        NA        0      NA
4    m3  60.728 -18.909 0.118 1136.021 1254.749 118.728 29    13.464        5   0.019
5    m4  45.251 -19.851 0.113 1136.021 1231.272  95.251 25    23.477        4   0.000
6    m5  47.574 -13.593 0.120 1136.021 1227.595  91.574 22     3.678        3   0.298

In the output, model m2 could not be compared to m1, as both have the same number of dfs. I only work with RAM-style models, so I don't know whether this approach generalizes to matrix-style models. I would be happy if some "pros" could comment on that (and give me a hint whether that approach is reasonable or even correct ;-)

Best, Felix
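For anyone who wants to verify a single row of the table by hand: each comparison is just a chi-square difference test on the two models' chi-square values and dfs. A minimal sketch, using the m0-vs-m1 numbers from the table above:

```r
# Reproduce the m1 row of the table by hand: chi-square difference,
# df difference, and the resulting p-value.
delta.chi <- 274.378 - 250.128   # 24.250
delta.df  <- 39 - 34             # 5
p <- pchisq(delta.chi, delta.df, lower.tail = FALSE)
round(p, 3)                      # rounds to 0.000, as in the table
```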

Just for people searching here for a comparison function: OpenMx comes with an mxCompare function as of March or so 2010 (versions around .3).

Type ?mxCompare to see how to use it, but it's easy.

My wishlist for improvements would be:

1. ability to pass just the base model and get a table back

2. more control over formatting: it's hard to tell at a glance that "1.000000e+00" is p=1 (and not some very much smaller number)

> mxCompare(fit1, c(fit2, fit3, fit4))
         base                  comparison ep minus2LL   df      AIC        diffLL diffdf            p
1 Full_G_by_E                              9 34097.59 5692 22713.59            NA     NA           NA
2 Full_G_by_E              G by E no quad  8 34097.59 5693 22711.59 -6.948540e-09      1 1.000000e+00
3 Full_G_by_E G by E, no means moderation  7 34116.90 5694 22728.90  1.930762e+01      2 6.418066e-05
4 Full_G_by_E   moderation on µ but not ∂  6 34099.06 5695 22709.06  1.466448e+00      3 6.900354e-01

Take a look at ?options and the 'digits' value in the R options. There is a warning about setting the 'digits' value in ?print.default.

Hi Felix

Nice going! I like the way you just throw a list of fitted mxModel objects at it. The checking of nestedness via saturated fit is definitely suitable for covariance matrix data input. However, with FIML the saturated model is not always fitted - it can be very expensive to do so, requiring optimization over all means and covariances. Therefore, I would recommend avoiding this check except for covariance matrices (see this thread http://openmx.psyc.virginia.edu/thread/89#comment-1310). A more general albeit weaker test would be to make sure that the number of observed statistics and the number of observations are the same.
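That weaker screen could be sketched roughly as below. Note this is only an illustration: the summary fields observedStatistics and numObs are an assumption about what summary() returns for a fitted mxModel, not verified against the OpenMx API of the time; the comparison logic itself is plain R.

```r
# Sketch of the weaker nestedness screen: check that all model summaries
# report the same number of observed statistics and observations.
# The field names used here (observedStatistics, numObs) are assumptions.
sameDataCheck <- function(summ_list) {
  stats <- sapply(summ_list, function(s) s$observedStatistics)
  nobs  <- sapply(summ_list, function(s) s$numObs)
  all(stats == stats[1]) && all(nobs == nobs[1])
}

# With fitted models, usage would be something like:
#   sameDataCheck(lapply(model_list, summary))
```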

I suspect the saturated likelihood would be NA for all such models so perhaps the check would pass and this potential limitation may be obscured.