NAN's as standard error

8 replies [Last post]
Julia's picture
Offline
Joined: 03/29/2012

Hi.
I am running a bivariate sex-limitation model with an age variable as a moderator on both means and variances/covariances. All models (full and nested) seem to work pretty well, estimates are reasonable and there is no error while converging (not even code GREEN). However, when I look at the standard erros, I can see that some of them turn out to be NAN's. What could be a reason of it and how can I fix it?

Thank you beforehand.
Julia

tbates's picture
Offline
Joined: 07/31/2009
script?

maybe post the script so the problem (and the model :-) ) can be identified?

Julia's picture
Offline
Joined: 03/29/2012
script

Would greatly appreciate if you could spot the problem.

AttachmentSize
twinMultHetCovAcePart.R 18.65 KB
neale's picture
Online
Joined: 07/31/2009
Bound C as well?

It is necessary to bound c's as well because they also contribute to covariance across DZOS pairs, so

# Matrices declared to store a, c, and e Path Coefficients
pathAm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathAValM, lbound=c(0, NA, 0), labels=amLabs, name="am" ) 
pathCm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathCValM, labels=cmLabs, name="cm" )
pathEm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathEValM, labels=emLabs, name="em" )
 
pathAf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathAValF, lbound=c(0, NA, 0), labels=afLabs, name="af" ) 
pathCf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathCValF, labels=cfLabs, name="cf" )
pathEf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathEValF, labels=efLabs, name="ef" )

should become
# Matrices declared to store a, c, and e Path Coefficients
pathAm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathAValM, lbound=c(0, NA, 0), labels=amLabs, name="am" ) 
pathCm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathCValM, lbound=c(0, NA, 0), labels=cmLabs, name="cm" )
pathEm    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathEValM, labels=emLabs, name="em" )
 
pathAf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathAValF, lbound=c(0, NA, 0), labels=afLabs, name="af" ) 
pathCf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathCValF, lbound=c(0, NA, 0), labels=cfLabs, name="cf" )
pathEf    <- mxMatrix( "Lower", nrow=nv, ncol=nv, free=TRUE, values=pathEValF, labels=efLabs, name="ef" )

However, please see the paper on sex-limitation models because there are issues with Cholesky decomposition in this context. For example, changing the order of the variables may change the fit of the model!

In addition, the moderation of all paths in the Cholesky is perhaps not the right way to go. I did present a paper at the twin meeting some years ago (2007 I think) describing this issue, but mea culpa I have not published it. It seems to me that moderating the variance of the latent variables rather than the paths makes more sense, and generalizes more naturally to the multivariate case.

Possibly, you are suffering from empirical under identification. Suppose that there is no evidence for shared environment in the data. Attempting to moderate a variance component which is essentially zero would quite possibly generate under identification symptoms. You did say that none of the parameters are at their bounds, but the problem may exist with the covariance components. If you are moderating say an off-diagonal element of the A matrix but there is no evidence for additive genetic factors contributing to covariance between the traits, the moderator parameter may be only weakly identified. As a way forward you might consider a slightly less full model, in which some of the moderator effects, and perhaps some of the variance components are fixed at zero.

Lastly, you might even be able to make an educated guess as to which parameters are problematic. If you look at the eigenvectors corresponding to the smallest (and negative) eigenvalues, the largest values in the eigenvector would indicate the parameters most associated with this component of the (co)variation in the parameters.

Sorry to go on at such length, but neither your model nor your problem is very simple. I hope these comments help!

Julia's picture
Offline
Joined: 03/29/2012
Thank you very much for such

Thank you very much for such a detailed answer, Mike! I guess I was really not aware in the beginning of complexity of this model.
I ran the analysis separately for males and females in order to find non-significant parameters and pass these constraints to the full model. You were right: there were a couple of paths that could be eliminated together with their moderation; in addition moderation of a few extra paths did not seem to be significant. All this, of course, will simplify the model from now on. However, analysis of only same-sex pairs indicated that the genetic correlation between the traits is different for males and females, pointing I guess on that I have to try a non-scalar sex-limitation model. Another headache :)

neale's picture
Online
Joined: 07/31/2009
Add a constraint

As discussed in the BG paper, I think the best way to proceed is to constrain the (genetic, shared environmental) covariances between traits to be equal for males and females. If the correlations are unequal, then the Cholesky model is in pretty bad shape because effectively the factors are not the same thing in the two sexes.

I sympathize, data often have a habit of behaving badly...

neale's picture
Online
Joined: 07/31/2009
Hard to say

Without seeing the output in detail, it's pretty difficult to say why this has happened. Normally it might reflect under-identification of the model, but the lack of other diagnostics from optimization suggests that it is not. A few things to check

Are any of the parameters at their bounds?

Are the gradients near zero? Check in the fitted model object (which I'll call fittedModel here) with fittedModel@output$gradient

I suspect that the calculated Hessian (output$calculatedHessian) looks a bit odd as well. You could look at its eigenvalues with

mxEval(eigen(fittedModel@output$calculatedHessian),fittedModel)

If any of the eigenvalues are zero or negative, there is obviously a problem, perhaps stemming from under identification or boundary conditions

One thing you might try is to refit the model from its solution, to see if the gradients get smaller, and the NaN's clear up.

fittedModel2 <- mxRun(fittedModel)

Julia's picture
Offline
Joined: 03/29/2012
Hi Mike, Thank you for you

Hi Mike,
Thank you for you reply. None of the parameters to be estimated have boundary values. But there indeed seems to be a problem with eigenvalues. Two of them were negative (although all diagonal elements of Hessian were positive and gradients were small):

> Age2EqualHeightFit@output$gradient
         am11          am21          am22          cm11          cm21          cm22          em11          em21          em22       amMod11       amMod21       amMod22       cmMod11 
-1.122870e-03 -1.237641e-03  1.707861e-03 -6.748615e-04  1.345188e-04 -6.742563e-04 -6.320216e-04 -8.992094e-04  1.245466e-03  1.434835e-03 -8.626526e-04 -6.477156e-05 -8.695995e-04 
      cmMod21       cmMod22       emMod11       emMod21       emMod22   heightmeanM   weightmeanM    betaModMP1    betaModMP2    betaMod2P1   betaMod2MP2          af11          af21 
 6.148197e-05 -3.626750e-05 -8.803194e-04 -4.461401e-04  1.387117e-04 -1.161187e-03  2.858647e-05 -1.060539e-03 -1.806863e-04  4.771032e-04 -5.274610e-04 -1.980755e-03  1.789363e-03 
         af22          cf11          cf21          cf22          ef11          ef21          ef22       afMod11       afMod21       afMod22       cfMod11       cfMod21       cfMod22 
-1.705792e-03 -1.292334e-04 -3.007552e-04 -7.127426e-04 -3.551370e-04  5.873707e-04 -2.404748e-04  7.272490e-04 -4.315048e-04 -7.615971e-04  1.430941e-03 -4.446400e-04 -3.206717e-04 
      efMod11       efMod21       efMod22   heightmeanF   weightmeanF    betaModFP1    betaModFP2   betaMod2FP2 
-1.649077e-03  0.000000e+00 -1.469744e-03  1.684205e-04  5.093047e-04  1.184870e-03  2.028324e-04  1.245997e-03 

> diag(Age2EqualHeightFit@output$calculatedHessian)
       am11        am21        am22        cm11        cm21        cm22        em11        em21        em22     amMod11     amMod21     amMod22     cmMod11     cmMod21     cmMod22     emMod11     emMod21     emMod22 heightmeanM 
 2385.43855   446.25449   618.70357   121.57020    42.85306    74.38829  3094.20981   260.97436   757.93768  1999.29675   388.88075   521.36922   225.61561    73.83541    83.97514  4191.12740   328.71957   692.43717  1543.42476 
weightmeanM  betaModMP1  betaModMP2  betaMod2P1 betaMod2MP2        af11        af21        af22        cf11        cf21        cf22        ef11        ef21        ef22     afMod11     afMod21     afMod22     cfMod11     cfMod21 
  494.87906  1396.98944   453.57463  7029.29868  1158.60257  3125.31498   521.32441   734.14229   224.85520    68.99024    86.22692  4937.08868   346.52302   969.64112  2954.36567   502.06249   716.57757   383.38246   110.76349 
    cfMod22     efMod11     efMod21     efMod22 heightmeanF weightmeanF  betaModFP1  betaModFP2 betaMod2FP2 
   75.22758  6920.85078   490.98650   961.82776  1908.58993   576.90319  1842.77287   584.21827  1484.84240

> mxEval(eigen(Age2EqualHeightFit@output$calculatedHessian), Age2EqualHeightFit)
$values
 [1] 8619.195716 8027.120433 5011.943965 4828.560306 3652.827066 2965.539319 2439.374194 2040.002692 1909.280972 1887.465043 1568.978182 1429.407210 1414.147226 1392.439492 1321.954465 1209.877815 1063.231914 1028.387112
[19]  855.471722  689.470690  667.402007  505.723903  481.788298  469.447431  455.717385  371.851451  329.285816  303.815932  265.032984  248.148131  234.617721  205.030319  181.580511  162.688532  149.369719  127.531434
[37]  121.046768   80.215224   62.553567   41.763661   39.530890   24.836921   20.440747   13.613198    7.167862   -4.881082   -8.160895

When I refitted the model starting from the original model solution, the problem with NaN's was still there (I attach summary outputs for both models):

> Age2EqualHeightFit2@output$gradient
         am11          am21          am22          cm11          cm21          cm22          em11          em21          em22       amMod11       amMod21       amMod22       cmMod11       cmMod21       cmMod22       emMod11 
-4.955332e-03 -2.787743e-03 -2.197187e-04 -3.377597e-03 -4.898177e-04 -1.486063e-03 -5.069344e-03 -2.108801e-03 -1.136539e-03 -9.095739e-04 -2.089097e-03 -1.895921e-03 -2.587429e-03 -8.310529e-04 -1.392584e-03 -5.755695e-03 
      emMod21       emMod22   heightmeanM   weightmeanM    betaModMP1    betaModMP2    betaMod2P1   betaMod2MP2          af11          af21          af22          cf11          cf21          cf22          ef11          ef21 
-2.009866e-03 -1.943104e-03 -4.568592e-03 -1.747756e-03 -4.006359e-03 -2.064840e-03 -4.506982e-03 -2.869849e-03 -6.961889e-03 -7.177681e-05 -4.108148e-03 -2.738653e-03 -8.392767e-04 -1.583976e-03 -6.517562e-03 -9.059191e-04 
         ef22       afMod11       afMod21       afMod22       cfMod11       cfMod21       cfMod22       efMod11       efMod21       efMod22   heightmeanF   weightmeanF    betaModFP1    betaModFP2   betaMod2FP2 
-2.852476e-03 -2.634315e-03 -1.791248e-03 -3.041133e-03 -3.791626e-03 -1.184226e-03 -1.398853e-03 -8.701540e-03 -1.668872e-03 -3.977844e-03 -3.511259e-03 -1.314040e-03 -2.527218e-03 -1.916431e-03 -1.625743e-03 
> diag(Age2EqualHeightFit2@output$calculatedHessian)
       am11        am21        am22        cm11        cm21        cm22        em11        em21        em22     amMod11     amMod21     amMod22     cmMod11     cmMod21     cmMod22     emMod11     emMod21     emMod22 heightmeanM 
 2385.94300   447.17453   618.52388   128.16410    46.74357    78.90563  3095.84240   274.32086   758.24178  2036.36399   394.32259   535.98857   238.05728    84.01778    92.57487  4209.88195   361.54460   690.35582  1543.42482 
weightmeanM  betaModMP1  betaModMP2  betaMod2P1 betaMod2MP2        af11        af21        af22        cf11        cf21        cf22        ef11        ef21        ef22     afMod11     afMod21     afMod22     cfMod11     cfMod21 
  494.87903  1405.61255   463.60206  7045.22101  1166.33280  3126.23691   525.13732   734.70751   223.42529    62.45366    86.58439  4941.50213   357.01748   970.89067  2949.16649   514.24203   691.10318   383.08442   111.21603 
    cfMod22     efMod11     efMod21     efMod22 heightmeanF weightmeanF  betaModFP1  betaModFP2 betaMod2FP2 
   71.89960  6915.00156   496.59099   981.79701  1908.58788   576.91151  1837.66380   586.17848  1513.17637 
> mxEval(eigen(Age2EqualHeightFit2@output$calculatedHessian), Age2EqualHeightFit2)
$values
 [1] 8621.781552 8046.352379 5025.469679 4826.962360 3654.576721 2965.701121 2444.974007 2038.558554 1933.447459 1881.578111 1574.044965 1444.397052 1434.723289 1405.800788 1317.506114 1211.857018 1063.401905 1036.633376
[19]  859.073819  707.254807  663.989849  505.777275  486.935662  473.635529  469.038981  380.132321  331.879040  308.068325  271.714133  269.291484  251.186070  197.245198  182.993207  163.838670  160.495495  146.937096
[37]  123.205381   85.347551   61.500005   47.613701   40.039015   31.089793   24.097518   11.178646    3.314551   -1.618250  -22.407141

However, when I refitted model one more time (Age2EqualHeightFit3 <- mxRun(Age2EgualHeightFit2)) NaN's disappeared without any improvement of -2LL and still one eigenvalue was negative. Is it reasonable to refit the model so many times? Would it still be a problem of having one negative eigenvalue when having all SE's as non-NaN's? Maybe I was originally wrong by setting observed means and variances as starting values?

AttachmentSize
Age2EqualHeightFit.pdf 27.04 KB
neale's picture
Online
Joined: 07/31/2009
Sounds like identification issue

When you say it is a bivariate sex-limitation model, you mean it is of the ACE variety? If so, there is some concern about your statement that none of the parameters have bounds. Even in the univariate case this could be a problem if you have DZ opposite sex pairs. For example, the DZOS covariance would be .5 am * af + cm * cf. Now, suppose that - as often is the case - the DZOS observed covariance is a bit lower than it would be expected to be, based on the covariances of the DZM and DZF pairs. It would be possible for the optimizer to find a negative value of am and a positive value of af, which would leave the expected covariances of DZM and DZF pairs unchanged, yet reduce the DZOS covariance by am * af. This problem would be exacerbated if the genetic correlation between males and females (the .5 here) is also allowed to be less than .5. Note that a model with both rG and rC free would not be identified because opposite sex MZ pairs almost never occur in nature (some pairs discordant for Turner's syndrome being an exception). Anyway, if you give a lower bound of zero to the am and af paths if you have set up a covariance-style sex-limitation model, or to the diagonals *only* of the Cholesky matrices if you are using this approach (output suggests you are, but please see http://www.vipbg.vcu.edu/vipbg/Articles/twinres-2006-sexlimgxe.pdf for other problems with it) the problem may disappear. It is possible that the fit of the model will deteriorate a bit, because it can no longer 'cheat' by setting, e.g., am positive and af negative.