Sat, 04/19/2014 - 18:31

Hi all,

I am trying to run both a phenotypic three-latent-factor Cholesky and then an ACE Cholesky using the same factor structure and data. My second-order F2 and F3 factors are correlated at about .92, and my phenotypic Cholesky model runs fine if I allow the error terms of an indicator of F1 and an indicator of F3 to correlate within twin 1 and within twin 2 (to control for shared method variance). However, if I remove those correlations, the correlation between F3 for twin 1 and F3 for twin 2 exceeds 1.0 and I get a message saying that the covariance matrix for these two factors is not positive definite. Thus, my first question is: am I violating any assumptions of the Cholesky model by allowing the error terms across F1 and F3 to correlate? This method seems to work out well in my case because it somehow removes the multicollinearity issue and the error message goes away. However, I want to make sure that my method is conceptually sound and that it doesn't somehow screw up my ACE estimates.

Another related question: why are the F factors in Cholesky models usually not allowed to correlate (i.e., no double-headed arrows above these factors)? In reality, these factors tend to correlate, in my case very highly, and yet what I have seen so far (mostly in journal articles and Hermine Maes's PowerPoint slides on Multivariate Genetic Analysis) is that these Fs are assumed to be independent.

My third question: does it make a difference whether I use both twins' data to fit the phenotypic Cholesky or just twin 1's (randomly selected) data? I have tried it both ways and obtained very similar estimates.

Thank you in advance for your insights on this,

Anne

A Cholesky factor model is essentially a saturated model for covariance structure. That is, with, say, n variables there are n(n+1)/2 observed variances and covariances, and the same number of paths in the Cholesky parameter matrix. It will fit any positive definite covariance matrix perfectly. This answers question 2: because the model is already saturated, you cannot also correlate the latent Cholesky factors; those correlations would not be identified.
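To make the saturation point concrete, here is a minimal sketch in Python with NumPy. The covariance values are made up for illustration and are not from the data discussed in this thread. It shows that the lower-triangular Cholesky factor has n(n+1)/2 free elements and reproduces a positive definite covariance matrix exactly, so nothing is left over for extra factor correlations to explain.

```python
import numpy as np

# Hypothetical 3x3 positive definite covariance matrix (illustrative values only).
S = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.9],
              [0.4, 0.9, 1.0]])

n = S.shape[0]

# Lower-triangular Cholesky factor, so that S = L @ L.T.
L = np.linalg.cholesky(S)

# Count the free paths (lower triangle including the diagonal)
# and the distinct observed variances and covariances.
n_free_paths = int(np.tril(np.ones((n, n))).sum())
n_observed = n * (n + 1) // 2

print("free Cholesky paths:", n_free_paths)
print("observed (co)variances:", n_observed)
print("exact reproduction:", np.allclose(L @ L.T, S))
```

Because the two counts match and the reproduction is exact, the Cholesky is just a reparameterization of the covariance matrix rather than a testable restriction on it.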

Question 1 reveals that a saturated model for latent factors is not sufficient to explain the correlations between the observed variables. Here it seems reasonable to allow for some cross-construct residual correlations. I would say that you are not violating assumptions (the Cholesky is merely saturating the factor correlation matrix - not the correlation matrix of the items).

Q3: yes it does - you get more precision using both twins' data. It is better to model pairwise - the same Cholesky for twin 1 and twin 2, but allow the Cholesky factors to correlate across twins. This model may not fit perfectly:

Sigma = [ LL'    LDL' ]
        [ LD'L'  LL'  ]

With a diagonal D matrix for the cross-twin correlations, the model is no longer saturated: D has only n free elements, whereas there are n(n+1)/2 correlations in the off-diagonal block of twin covariances. But it is a reasonable place to start. If allowing a symmetric D improves fit, then the Cholesky model may be inadequate to explain the cross-twin covariances.
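The pairwise structure described above can be sketched numerically as follows. This is only an illustration in Python with NumPy, not OpenMx code: the function name `twin_covariance` and all values in `L` and `D` are hypothetical. It builds the expected 2n x 2n twin covariance matrix from a shared Cholesky factor and a diagonal D, then compares the number of free elements in D with the number of distinct cross-twin covariances.

```python
import numpy as np

def twin_covariance(L, D):
    """Expected covariance for a twin pair: within-twin blocks L L',
    cross-twin block L D L' (and its transpose in the lower left)."""
    within = L @ L.T
    cross = L @ D @ L.T
    return np.block([[within, cross],
                     [cross.T, within]])

n = 3

# Hypothetical lower-triangular Cholesky paths (illustrative values only).
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.8, 0.0],
              [0.4, 0.6, 0.3]])

# Hypothetical cross-twin correlations of the Cholesky factors (diagonal D).
D = np.diag([0.6, 0.5, 0.4])

sigma = twin_covariance(L, D)

# Distinct covariances in the (symmetric) cross-twin block versus
# free parameters available under diagonal and symmetric D.
n_cross_stats = n * (n + 1) // 2
n_diag_params = n
n_sym_params = n * (n + 1) // 2

print("cross-twin statistics:", n_cross_stats)
print("df left with diagonal D:", n_cross_stats - n_diag_params)
print("df left with symmetric D:", n_cross_stats - n_sym_params)
```

With a diagonal D the cross-twin block is restricted, so misfit is possible; a symmetric D has as many free elements as there are distinct cross-twin covariances, which is why comparing the two is informative.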