Longitudinal CFA with multicollinearity between latent factors across time

dadrivr
Joined: 01/19/2010

I am trying to fit a longitudinal CFA with 3 indicators at each of 4 time points, spaced 1 year apart. The model runs fine with 3 time points but fails when I add the fourth. It appears that the model may fail because of multicollinearity among the latent factors (the correlation between the latent factors at T3 and T4 is .996). I have already specified the within-indicator residual covariances across time, but that does not solve the problem. Here are the correlations among the 4 latent variables, grouped by lag:

lag T=1: .988, .990, .996
lag T=2: .976, .967
lag T=3: .937
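
A quick way to see whether these six correlations can coexist in one valid correlation matrix is to attempt a Cholesky factorization, which succeeds only for a positive definite matrix. The sketch below is a minimal illustration in plain Python, not Mplus or OpenMx code. It assumes the values within each lag are listed in time order (T1-T2, T2-T3, T3-T4 for lag 1; T1-T3, T2-T4 for lag 2), which the post does not state explicitly, and it uses the rounded values as reported. The helper name `first_bad_pivot` is hypothetical.

```python
import math


def first_bad_pivot(matrix):
    """Attempt a Cholesky factorization of a symmetric matrix.

    Returns the 0-based index of the first non-positive pivot, or None
    if every pivot is positive (i.e. the matrix is positive definite).
    """
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    return i
                chol[i][j] = math.sqrt(pivot)
            else:
                chol[i][j] = (matrix[i][j] - partial) / chol[j][j]
    return None


# ASSUMPTION: within each lag, correlations are listed in time order.
latent_corr = [
    [1.000, 0.988, 0.976, 0.937],  # T1
    [0.988, 1.000, 0.990, 0.967],  # T2
    [0.976, 0.990, 1.000, 0.996],  # T3
    [0.937, 0.967, 0.996, 1.000],  # T4
]

# Upper-left 3x3 block: the three-wave model that ran without trouble.
three_waves = [row[:3] for row in latent_corr[:3]]

print("3 waves, first bad pivot:", first_bad_pivot(three_waves))
print("4 waves, first bad pivot:", first_bad_pivot(latent_corr))
```

Under the assumed ordering, the three-wave block factorizes cleanly, while the four-wave matrix fails at the last pivot (index 3, the T4 factor). In other words, the rounded estimates are not jointly admissible once T4 is added, even though no single correlation reaches 1.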

Any ideas for how to specify the longitudinal CFA given the high correlation among the latent variables across time? The 3 indicators represent questionnaires by three raters: mothers, teachers, and fathers.

Thanks in advance!

neale
Joined: 07/31/2009
Fewer factors?

I echo what mhunter said: please define "fails" as explicitly as possible. At a glance, I would be inclined to reduce the number of factors; one or two would seem to be all you need.

mhunter
Joined: 07/31/2009

Could you describe how the model fails specifically? Does it error in the middle of estimation? Or only give a warning at the end? Etc.

Thanks!
Mike

dadrivr
Joined: 01/19/2010

The model estimation terminates normally, but there is a warning at the end, presumably because of a linear dependency among the latent variables. Here's the warning from Mplus:

WARNING: THE LATENT VARIABLE COVARIANCE MATRIX (PSI) IS NOT POSITIVE
DEFINITE. THIS COULD INDICATE A NEGATIVE VARIANCE/RESIDUAL VARIANCE FOR A
LATENT VARIABLE, A CORRELATION GREATER OR EQUAL TO ONE BETWEEN TWO LATENT
VARIABLES, OR A LINEAR DEPENDENCY AMONG MORE THAN TWO LATENT VARIABLES.
CHECK THE TECH4 OUTPUT FOR MORE INFORMATION.
PROBLEM INVOLVING VARIABLE EXT4.

I'd rather not combine across years, because we are interested in the year-to-year change in the outcome. I'm surprised the latent variable correlations are so high. The within-time correlations of the indicators range from .23 to .67, and the cross-time correlations of the indicators range from .52 to .69.

neale
Joined: 07/31/2009
OpenMx?

Was there an error when fitting the model with OpenMx? I'm less interested in offering tech support for Mplus :)

Ryne
Joined: 07/31/2009

He posted to the OpenSEM forums, not specifically to the OpenMx section.

mhunter is right. The correlation matrix between your latent variables (PSI in LISREL/Mplus speak) is not positive definite (effectively non-invertible) because of the very high correlations between the latent variables. As this is an Mplus error, I don't know if this happened at the end of optimization or in the middle. If it's at the end, you're kinda screwed, but if it's in the middle, this is something that OpenMx has some advantages over Mplus in handling.

Were you using OpenMx, you might be able to correct this problem with bounds, creative model respecification, or simply differences between the programs. In the course of optimization, whatever software you use will try different values for every parameter, and ideally will try values on either side of the correct value. If the correct value for the time3-time4 correlation is .996, your optimizer is very likely to try something above it, and there isn't a lot of room between .996 and 1 to sample from. Very likely, Mplus tried a value for this correlation that was above 1 or that otherwise led to the error.

OpenMx doesn't care about the positive definiteness of the PSI matrix when using either user-specified or RAM models: all that has to be positive definite is the final model-implied covariance matrix for the data. Mplus and other LISREL-like programs invert various model submatrices, whereas OpenMx doesn't. Your model might run better in OpenMx. Alternatively, if you run into the same problems in OpenMx for related reasons, you can always add bounds to your parameters. Set the ubound (upper bound) for the time3-time4 correlation to .999 or something like that, and the optimizer won't try the invalid values that lead to your problem.
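
The effect of an upper bound can be sketched outside any SEM package. The toy example below is plain Python, not OpenMx or Mplus code; the step size and the helper names `clamp` and `is_valid_correlation` are hypothetical, chosen only for illustration. It shows how a trial value just above an estimate of .996 leaves the admissible range, and how a bound of .999 keeps every trial value inside it.

```python
def clamp(value, lbound, ubound):
    """Restrict a trial parameter value to the interval [lbound, ubound]."""
    return max(lbound, min(ubound, value))


def is_valid_correlation(r):
    """The 2x2 matrix [[1, r], [r, 1]] is positive definite only if |r| < 1."""
    return abs(r) < 1.0


estimate = 0.996   # T3-T4 latent correlation reported in the thread
step = 0.01        # ASSUMPTION: a hypothetical optimizer trial step

# An optimizer typically probes values on either side of the estimate.
trials = [estimate - step, estimate, estimate + step]
unbounded_ok = [is_valid_correlation(r) for r in trials]

# With an upper bound of .999, the overshooting trial is pulled back.
bounded = [clamp(r, -0.999, 0.999) for r in trials]
bounded_ok = [is_valid_correlation(r) for r in bounded]

print("unbounded trials valid:", unbounded_ok)
print("bounded trials:", bounded)
print("bounded trials valid:", bounded_ok)
```

Note that bounding a single correlation only keeps that one parameter below 1. It does not by itself guarantee that the full latent covariance matrix is positive definite, so the final estimates are still worth checking.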

Finally, you could always try to respecify your model in a different way, but we can get to that if nothing else works.

Good luck!

dadrivr
Joined: 01/19/2010

Thanks Ryne! Your explanation and suggestions are really helpful. I'll try some of them, and we'll see how it goes.