Fri, 10/07/2011 - 20:03

With the release of OpenMx 1.1, I started playing around with the new joint ordinal-continuous feature. We had just done a bivariate ACE model for two phenotypes (one ordinal, 3 levels, and one continuous, which we divided into 10 levels) using Classic Mx.

At first glance the results look good. The script seems to run without any problems, and the resulting variance component estimates are comparable to what we see in the univariates - there is some C effect on the ordinal variables that wasn't there in the univariate analysis, but this appears to be due to the C effect on the continuous variable.

What I'm concerned about is the movement that we see in the thresholds for the ordinal phenotype. In some cases the movement is rather small, nothing that I wouldn't attribute to simple differences between univariate and multivariate models. There are other cases, however, where the thresholds more quite a lot. For example, in one model, the thresholds are estimated at 2.59 for T1 and 4.64 for T2 while in the univariate model these are estimated at 1.04 for T1 and 1.81 for T2.

Since this is a new script for me, I'm not sure how things are working. Are differences like this to be expected with this script, or do we need to reconsider our start values and boundaries?

Also, I noticed (as I ran this same bivariate ACE model for 5 ordinal variables against the same continuous variable) that sometimes it would give me NaN standard errors for my parameters and sometimes it wouldn't. Is that also something expected in the joint script?

My joint ordinal-continuous bivariate ACE script is below (probably a bit kludgy):

bivACEModel <- mxModel("ACE",

# Matrices a, c, and e to store a, c, and e path coefficients

mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=cholASVnv, lbound=-4, ubound=4,labels=AFac, name="a" ),

mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=cholCSVnv, lbound=-4, ubound=4,labels=CFac, name="c" ),

mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=cholESVnv, lbound=-4, ubound=4,labels=EFac, name="e" ),

# Matrices A, C, and E compute variance components

mxAlgebra( expression=a %*% t(a), name="A" ),

mxAlgebra( expression=c %*% t(c), name="C" ),

mxAlgebra( expression=e %*% t(e), name="E" ),

# Algebra to compute total variances and standard deviations (diagonal only)

mxAlgebra( expression=A+C+E, name="V" ),

mxMatrix( type="Iden", nrow=nv, ncol=nv, name="I"),

mxAlgebra( expression=solve(sqrt(I*V)), name="iSD"),

# Confidence intervals portion for covariance matrices

mxAlgebra( expression=A/V,name="stndVCA"),

mxAlgebra( expression=C/V,name="stndVCC"),

mxAlgebra( expression=E/V,name="stndVCE"),

mxCI(c("stndVCA","stndVCC","stndVCE")),

# Confidence intervals for Phenotypic correlation pieces

mxAlgebra((solve(sqrt(I*V)) %*% V %*% solve(sqrt(I*V)))[1,2],name="Vcorr"),

mxAlgebra((solve(sqrt(I*A)) %*% A %*% solve(sqrt(I*A)))[1,2],name="Acorr"),

mxAlgebra((solve(sqrt(I*C)) %*% C %*% solve(sqrt(I*C)))[1,2],name="Ccorr"),

mxAlgebra((solve(sqrt(I*E)) %*% E %*% solve(sqrt(I*E)))[1,2],name="Ecorr"),

mxCI(c("Vcorr","Acorr","Ccorr","Ecorr")),

## Note that the rest of the mxModel statements do not change for bi/multivariate case

# Matrix & Algebra for expected means vector

mxMatrix( type="Full", nrow=1, ncol=nv, free=c(TRUE,FALSE), values=meanSVnv, labels=mean, name="Mean" ),

mxAlgebra( expression= cbind(Mean,Mean), name="expMean"),

mxMatrix(type="Full", nrow=2, ncol=nv, free=c(TRUE,TRUE), values=c(1,1.8),lbound=-4, ubound=6, name="expThre", labels=c("th1","th2"),dimnames=list(c('th1','th2'),selVars[c(2,4)])),

# Algebra for expected variance/covariance matrix in MZ

mxAlgebra( expression= rbind ( cbind(A+C+E , A+C),

cbind(A+C , A+C+E)), name="expCovMZ" ),

# Algebra for expected variance/covariance matrix in DZ

mxAlgebra( expression= rbind ( cbind(A+C+E , 0.5%x%A+C),

cbind(0.5%x%A+C , A+C+E)), name="expCovDZ" ),

mxModel("MZ",

mxData( observed=mzData, type="raw" ),

mxFIMLObjective( covariance="ACE.expCovMZ", means="ACE.expMean", dimnames=selVars, thresholds="ACE.expThre", threshnames=selVars[c(2,4)] )

),

mxModel("DZ",

mxData( observed=dzData, type="raw" ),

mxFIMLObjective( covariance="ACE.expCovDZ", means="ACE.expMean", dimnames=selVars, thresholds="ACE.expThre", threshnames=selVars[c(2,4)] )

),

mxAlgebra( expression=MZ.objective + DZ.objective, name="neg2sumll" ),

mxAlgebraObjective("neg2sumll")

)

bivACEFit <- mxRun(bivACEModel,intervals=TRUE)

bivACESumm <- summary(bivACEFit)

bivACESumm

Hi all,

I realise that the original post on this topic was from some time ago but I have an almost identical problem. The thresholds for my dichotomous variable (schizophrenia) are way off, they should be around 2.3 (according to my univariate schizophrenia model) but instead are estimated as 10 in the bivariate ordinal-continuous model which throws off my variance estimates. I'm an Mx user that only recently switched to OpenMx (in order to utilise the joint ordinal-continuous feature) so I have little understanding of the syntax for OpenMx, therefore could somebody please tell me if I am failing to fix the appropriate variances like the previous poster and if so could you also provide me with the specific line of code that I need to give OpenMx? I'd very much appreciate your help on this issue! :) I have attached a copy of my R script to this post.

Thank you,

Emma

One thing that jumps out at me relates to Ryne's comments. How have you set the scale for the ordinal variables? From the code, it looks like the thresholds are free, the means are free, and the variances are free. Some of these must be fixed to set the scale of the ordinal variables.

I've seen the scales set in the past with something like the following:

This would be added to the "ACE" model. It would fix the total variances to 1.0.

Alternatively, you could fix the thresholds in the joint model to the estimated values from the ordinal-only univariate model.

I've changed the free=TRUE to free=FALSE, and changed the values from 2.3 to the more exact estimated value (I made up what would be "more exact" here, but you've run the model so you'd just grab it from that run model.).

Cheers!

One approach to estimating variances with ordinal data (so as to avoid constraining the variance with a non-linear constraint which would kill off standard errors) would be to fix the first two thresholds at zero and one, respectively, as described by Mehta, Neale & Flay (2004). If there is no second threshold, then the variances must be fixed (and either the mean fixed at zero, or the threshold). Estimates may still look a bit weird - simply standardize the estimates by subtracting the mean and dividing by the square root of the estimated variance.

Mehta PD, Neale, M.C., Flay BR: Squeezing interval change from ordinal panel data: latent growth curves with ordinal outcomes. Psychol Methods 2004; 9:301-333 http://www.vipbg.vcu.edu/vipbg/Articles/psychmethods-squeezing-2004.pdf

Absent data or the models you're comparing this too, I don't think I can answer all of your questions. To clarify, you're comparing this bivariate model to univariate models of just the ordinal variable, estimated in OpenMx? Are you making additional comparisons to Classic Mx scripts?

Unless I'm missing something, you're failed to properly scale the ordinal variables. You've fixed the mean of the ordinal variables and are estimating the thresholds, but I don't see where you've fixed the variance of the ordinal variables. You have to fix two parameters relating to ordinal variable variance, mean and thresholds to provide some type of scale. I think the problems you're having come from this. Estimated variances and thresholds are inherently related; pretend that half the sample gives an ordinal value of 0, and a quarter each respond with 1 and 2, respectively (make them 1, 2 and 3 if that's how people responded). If you scale the continuous latent variable assumed to underlie an ordinal variable by giving it unit variance, then you'd expect thresholds of 0 (50th quartile of standard normal) and 0.67 (75th quartile). If you give that continuous latent variable a variance of 10, those thresholds become 0 and 6.7.

Hope this helps. In future submissions, try to attach code in a file (html can add weird formatting that R doesn't like) and some type of data, either real data, simulated data or something out of our fakeData function (available on the wiki). If we can run your code, we can help you more easily!

Thanks, Ryne! It does appear that's missing from my script.

Could this explain my randomly appearing/disappearing SEs?

Yep! Here's why.

Standard errors are based on the Hessian matrix, which is the (symmetric) matrix of second derivatives of the likelihood function with respect to the estimated parameters. This matrix represents the curvature of the likelihood space at the current set of estimates: higher curvature means that the likelihood value changes (goes up) a lot when we move the parameter in either direction, while lower values mean that the likelihood value doesn't change as much when we move the parameters. If a model is converged, we expect the gradient (a vector of first derivatives of the likelihood function) to be near zero and the Hessian to have positive values down the diagonal. We invert the Hessian to get the variance-covariance matrix of the parameters. The standard errors are the square root of the diagonal elements of that matrix. Therefore, high Hessian values mean low standard errors, and visa versa.

Convergence failures generally and failures due to underidentification specifically do weird things to the Hessian. We can get negative diagonal elements to the Hessian, indicating that we're not actually at a minimum of the -2 log likelihood (i.e., fit gets better as we move away from the current value). In the case of underidentification, we get zeroish values for the diagonal elements in the Hessian, indicating that other parameter values are equally good (like my example above, where we can get equivalent models with variances of 1 or 10). In the former case, the variance of some parameters will be negative, making the standard errors defined by the square root of a negative number. In the latter, inverting a matrix with a zero diagonal element yields an invalid or infinitely big variance. In either case, the standard errors are invalid in when convergence or identification are in doubt, and will pop in and out depending on the Hessian for a particular estimation.