Expected covariance matrix is non-positive-definite.

kspoon
Joined: 06/17/2011

I know there's another similar active thread, but I didn't want to hijack it. Most of the threads I've seen regarding this error message make some reference to an iteration or line; mine does not. What's in the subject is the entire error.

I'm attempting to fit a 'saturated' model for pathway analysis, working with a twin data set. A true saturated model shows all our path diagrams to be a much more horrible fit to the data (p-values < 10^-10), but I believe I'm getting penalized since we have constraints that the pathways are the same for all 4 groups: within (A and B) and across (MZ and DZ). So, my goal was to make a 'saturated' model instead, constraining the necessary parts to be equal both within each covariance matrix and across the covariance matrices.

I've spent a lot of time playing with starting values, choosing some data-driven values (which required some massaging to be positive definite to start) and some rather dumb values (off-diagonal elements set to .5 and diagonal elements set to 1).

Is this truly just a starting value problem? Or are my constraints causing the issue? Any advice would be appreciated. I've attached faked correlation matrices for both groups and my code - which I admit is probably horrible to look at and try to parse through. I feel like my error is likely conceptual and completely based on some horrible misunderstanding on my part.

Thanks!

Attachment (Size)
ExpCovPOSDEF.R (6.6 KB)
DZcor.csv (2.56 KB)
MZcor.csv (2.58 KB)
kspoon
General Structural Equation Question

So, the more I've looked at this problem, the more uncertain I've become in my decisions. This started as a project a year ago and was my first introduction to structural equation modeling - my advisors at school and my coworkers here don't have much experience either. Since then, I've dealt mostly with twin models.

How would you work twinness into a pathway analysis with some family-level and some individual-level variables?

As you can see, my original thought was to account for the MZ and DZ groups separately, to model paths between each variable in the A twin and the corresponding variable in the B twin, and to allow those to differ between the groups (MZ and DZ) while the within-twin paths were constrained to be the same for all individuals. However, we are only interested in within-individual paths, so I shouldn't need to even add those correlations as parameters, right?

I still feel like the two group design is helpful since the correlations across twins differ between MZs and DZs, but is it really necessary? And I'm correct to run everything pairwise because of the family-level variables, right? Otherwise, I could split it up into 4 groups MZ-A, MZ-B, DZ-A, and DZ-B, but I also kill my degrees of freedom doing that...

Thanks!

Ryne
Joined: 07/31/2009

This is a starting values problem, and I think it's caused by the upperTriangle and lowerTriangle functions that aren't defined in your code. When defining multiple paths at once, you have to consider the order. If you check the mxPath help page, you'll see that the definition for 'unique.pairs' is like so:

(a,a), (a,b), (a,c), (b,b), (b,c), (c,c)

That means, if you give a list of A1-A5 with A1-A5, the first five paths put in are A1 with itself, A1 with A2, etc. Then the next four are A2 with itself and then with A3-A5. If you're thinking of the covariance matrix of A1-A5, 'unique.pairs' first does the first column top to bottom, then the second column from the second row to the bottom, and so on. In short, mxPath reads the lower triangular matrix by column (column major).
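To make that ordering concrete, here is a small base-R sketch. It makes no OpenMx calls, and the three variable names are only illustrative. It lists the pairs in the order quoted above and compares them with a column-by-column walk down the lower triangle.

```r
vars <- c("A1", "A2", "A3")   # illustrative variable names
n <- length(vars)

# Pairs in the quoted order: (a,a), (a,b), (a,c), (b,b), (b,c), (c,c)
pairs <- do.call(rbind, lapply(seq_len(n), function(i) {
  data.frame(from = vars[i], to = vars[i:n], stringsAsFactors = FALSE)
}))

# Column-major walk of the lower triangle (diagonal included)
idx <- which(lower.tri(diag(n), diag = TRUE), arr.ind = TRUE)
walk <- data.frame(from = vars[idx[, "col"]],
                   to   = vars[idx[, "row"]],
                   stringsAsFactors = FALSE)

# Both orderings agree element by element
same_order <- identical(pairs$from, walk$from) && identical(pairs$to, walk$to)
print(pairs)
print(same_order)
```

So a vector of starting values supplied alongside such a path list is consumed in exactly the order that `lower.tri` visits the cells.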

The lower.tri and upper.tri functions that lowerTriangle and upperTriangle must call are also column major. However, column major operations on the transposed matrices you put into these functions are effectively row major. If you look at the withinS set of starting values, you'll notice that it starts c(1, xxx, 1...). That means you're assigning the correlation between A1 and A3 to 1, which I presume you meant for the A2 variance. Stop using the t() function and your problems should go away. Your starting values look OK; you just have to put them in the right places.
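A minimal illustration of how the ordering can go wrong, using a made-up 3 x 3 correlation matrix rather than the attached data. It assumes the values were pulled out in row-major order of the lower triangle (which is what a column-major read of the upper triangle gives for a symmetric matrix) and then written back column by column.

```r
# Made-up correlation matrix, not the attached data
S <- matrix(c(1.0, 0.2, 0.3,
              0.2, 1.0, 0.4,
              0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)

# Column-major read of the lower triangle: the order the paths expect
col_major <- S[lower.tri(S, diag = TRUE)]   # 1.0 0.2 0.3 1.0 0.4 1.0

# Column-major read of the UPPER triangle: for a symmetric matrix this is
# the lower triangle in row-major order, and it starts 1, x, 1, ...
row_major <- S[upper.tri(S, diag = TRUE)]   # 1.0 0.2 1.0 0.3 0.4 1.0

# Write the misordered values back column by column
M <- matrix(0, 3, 3)
M[lower.tri(M, diag = TRUE)] <- row_major

M[3, 1]   # 1.0: the variable 1 / variable 3 covariance received a variance
M[2, 2]   # 0.3: the second variance received a covariance
```

Checking the filled-in matrix itself, rather than the vector of values, is the quickest way to see whether each number landed in the intended cell.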

kspoon
Not a transpose issue, I promise. :)

That may not have been the final iteration of my code - I went through quite a lot yesterday, so my apologies if the one I provided isn't filling the starting values properly. I don't believe it's an issue of transposing the matrix. I checked my matrix of starting values before trying to run the code and the 1's were along the diagonal.

I did iterations where the starting values were identical to the original correlation matrix, with the constraint that paths sharing a label had the same starting point. This resulted in a non-positive definite matrix for the starting values, regardless of whether I started with the MZ correlations, the DZ correlations, or an average. That is why I had the line adding a constant along the diagonal, but not for the first element, which was the fixed term (I'm not sure if I still need the fixed term for the saturated model, or if it even matters, since I've told OpenMx that I'm feeding it correlation matrices).
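For reference, the "add a constant along the diagonal" idea can be sketched in base R roughly as follows. The helper name `ridge_to_pd`, the step size, and the example matrix are all hypothetical and are not taken from the attached script; unlike the description above, this version adjusts every diagonal element, including the first.

```r
# Hypothetical helper: inflate the diagonal until the smallest eigenvalue
# is clearly positive, or give up after max_steps attempts.
ridge_to_pd <- function(m, step = 0.05, max_steps = 100, tol = 1e-8) {
  for (i in 0:max_steps) {
    ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
    if (min(ev) > tol) return(m)
    diag(m) <- diag(m) + step
  }
  stop("matrix is still not positive definite after max_steps")
}

# Made-up symmetric matrix with a unit diagonal that is not positive definite
bad <- matrix(c( 1.0, 0.9, -0.9,
                 0.9, 1.0,  0.9,
                -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

fixed <- ridge_to_pd(bad)
min(eigen(fixed, symmetric = TRUE, only.values = TRUE)$values) > 0
```

Note that this changes the diagonal away from 1, so the result is no longer a correlation matrix; it only serves as a place for the optimizer to start.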

Any other ideas to overcome this error code?

Ryne

I looked into this a little more, and this is definitely a starting values problem. If we run your mxRun command with the 'unsafe=TRUE' option, you'll see that you're getting the error on the first iteration. As all of your parameters are in the S matrix (i.e., they're all two-headed arrows with no single-headed arrows/regression parameters), we can just view your S matrix to see if it's invertible. If you run the eigen() function on your set of starting values, then you'll see that the final 4 eigenvalues are negative, which indicates that your matrix of starting values is not positive definite. The first 8 eigenvalues sum to more than 100% of the total variance (the trace) of your matrix, which can only happen when the remaining eigenvalues are negative.
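The eigenvalue check can be written as a small base-R helper. This is only a sketch: the function name `is_pos_def` and the example matrix are made up for illustration, and in the real model the matrix to test would be the S matrix of starting values.

```r
# Hypothetical helper: TRUE if every eigenvalue of a symmetric matrix
# exceeds a small tolerance.
is_pos_def <- function(m, tol = 1e-8) {
  ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

# Made-up symmetric matrix with a unit diagonal that is not positive definite
bad <- matrix(c( 1.0, 0.9, -0.9,
                 0.9, 1.0,  0.9,
                -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

ev <- eigen(bad, symmetric = TRUE, only.values = TRUE)$values
ev                     # sorted in decreasing order; the last one is negative
cumsum(ev) / sum(ev)   # the running share exceeds 1 before the last eigenvalue

is_pos_def(bad)        # FALSE
is_pos_def(diag(12))   # TRUE: the identity is always positive definite
```

Because the eigenvalues sum to the trace, a running share above 1 is the same symptom as negative trailing eigenvalues.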

We can make incredibly simple starting values (variances=1, covariances=0) like so:
ModelSAT$MZ$S@values <- diag(12)
ModelSAT$DZ$S@values <- diag(12)

When I do so, the error goes away and I get a converged solution (-2LL = 5547.74). As the whole thing only took 1.4 seconds with those lousy starting values, I wouldn't worry that much about finding better ones.

tbates
Joined: 07/31/2009
let's check this into the error message

It would be great for OpenMx to return this simple advice in the "not pos def" error message.

kspoon
Uuuuuugh.

Thank you, Ryne. You're completely right. I managed to get my saturated* model to run yesterday by just starting with the 'dumbest' start values possible. I've had so many issues with wonky estimates based on start values that were not nearly as silly, so I didn't even consider just using the identity.

I know that in my 4 hours of struggles, I made sure to check for positive definiteness by doing eigen(ModelSAT$MZ$S@values), but was still getting the error. Who knows. At this point, I'm just immensely happy it's been resolved.