Sat, 03/27/2010 - 04:27

So you run a model and get the message that your covariance matrix is not positive definite.

Why isn't the optimiser algorithm set up to avoid areas of the solution space where the eigenvalues are negative?

The optimizer will do its best to avoid these regions when it can, but it cannot always tell in advance where they are.

Because we allow people to specify arbitrary algebras, we can't really make guesses about the parameter space, and even in standard analysis cases, strange things can happen when the parameters are far from the minimum.

Often the error happens because of a user specification--for example, the error commonly comes up when users start all their free parameters at the same number. We can warn the user about this, but OpenMx tries not to make changes to things like starting values without the user's consent.

I think early detection is unlikely to be a viable solution, but it might be possible to add some recovery routines to handle this case when it happens. For example, the user could request that OpenMx try different starting values if the first ones don't work, or jitter the parameters when a non-positive-definite point is found to see if nearby regions are viable.
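To make the retry idea concrete, here is a minimal sketch in plain Python. It is not OpenMx code: `fit_with_jitter`, the `fit` callable, and the convention that `fit` raises `ValueError` at a non-positive-definite point are all assumptions made up for illustration.

```python
import random

def fit_with_jitter(fit, start, tries=10, scale=0.1, seed=0):
    """Run fit(start); if it fails, perturb the starting values and retry.

    fit   -- hypothetical callable that returns a result, or raises
             ValueError when it hits a non-positive-definite point
    start -- list of starting values supplied by the user
    scale -- standard deviation of the Gaussian jitter (an assumption)
    """
    rng = random.Random(seed)
    current = list(start)
    for _ in range(tries):
        try:
            return fit(current), current
        except ValueError:
            # Jitter around the user's original values, not the failed ones,
            # so repeated failures do not drift arbitrarily far away.
            current = [v + rng.gauss(0.0, scale) for v in start]
    raise RuntimeError("no viable starting values found")
```

The user's own starting values are always tried first and are only replaced after a failure, which is in keeping with not changing starting values without consent.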

At the end of the day, often the best thing to do is to reparameterize the model so that the optimizer cannot try parameter estimates which generate non-positive definite covariance matrices. The Cholesky decomposition is a popular reparameterization for this purpose. The disadvantage is that one doesn't directly get standard errors for the quantities of interest. Likelihood-based or bootstrap CIs can be useful here.
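As an illustration of the Cholesky idea, the sketch below (plain Python, not OpenMx syntax; the function name and the row-by-row parameter ordering are assumptions) builds the covariance matrix as S = L L^T from a lower-triangular L. Whatever values the optimizer tries for the entries of L, S is never negative definite, and it is strictly positive definite whenever every diagonal entry of L is nonzero.

```python
def cov_from_cholesky(params, n):
    """Return the n x n matrix S = L L^T, where the free parameters fill
    the lower triangle of L row by row (ordering is an assumption)."""
    L = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            L[i][j] = params[k]
            k += 1
    # S[i][j] is the dot product of rows i and j of L.
    return [[sum(L[i][m] * L[j][m] for m in range(n)) for j in range(n)]
            for i in range(n)]

# Two variables: L = [[2, 0], [-1, 0.5]]
S = cov_from_cholesky([2.0, -1.0, 0.5], 2)
```

The free parameters are now the entries of L rather than the variances and covariances themselves, which is why standard errors for the latter are not obtained directly.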

Nevertheless, as Tim says, good starting values (try getting the means & variances right, and the covariances in the right ballpark, or near zero) can often keep the optimizer from straying into non-positive definite regions.

I guess what I still don't understand is why the optimiser is not designed so that it first tests whether an estimate is positive definite and only if the answer is yes does it check whether the associated -2LL is lower. As far as I understand it, it only uses likelihood in deciding where to go next. But it knows when things are not positive definite, so why can't it be designed to check this out and avoid those areas?!

Still a bit puzzled as to how optimisation works, I guess.

Unfortunately we have only limited control over the behavior of the optimizer, as we are not using one that we developed ourselves (that may change in the future). In the early 1990s I visited the good people at Stanford to try to address this problem, with partial success. There are essentially two phases to optimization: i) figuring out the first and second derivatives of the likelihood function with respect to each of the parameters; and ii) moving to a new location based on the information gathered in i). Built into the optimizer is a mechanism for phase ii): if the move lands in a region of non-positive definiteness (or zero likelihood for some other reason, such as outlier observations or really bad parameter estimates), the optimizer backs out of that region. It does this by reducing the size of the step it takes towards the new best guess as to where the solution is. Several previously problematic cases ran smoothly after this change was introduced.

However, two factors limit the effectiveness of this strategy. One is that incomputable likelihoods may be encountered during phase i), which really stops the optimizer in its tracks because it no longer has the information needed to decide where to go next. The second is that, perhaps due to the nature of the data or the model, even a very small step taken in phase ii) still ends up in non-positive definite (or otherwise incomputable) land. Usually the solution in such cases is to start optimization from some other location. Sometimes the model needs to be revised, and other times one may have to go and get some more data.

Although this last seems rather drastic, it is perforce the case if one has more variables (columns) than subjects (rows) in a data set. Optimization would then head towards a non-positive definite region in a quite determined fashion. If the model had been specified (e.g. via Cholesky) to generate only positive definite matrices, then optimization would likely end up at the boundary. If it had not been specified to stay positive definite, the optimizer would likely try parameter estimates which generate non-positive definite predicted covariance matrices and, since the log-likelihood is not computable in such instances, fail rather unhappily.
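The step-shrinking behaviour in phase ii) can be sketched roughly as follows. This is a simplified illustration in plain Python, not the actual optimizer's code: the function name, the halving factor, and the convention that `objective` returns `None` when the likelihood is incomputable are all assumptions.

```python
def backtrack_step(params, direction, objective, max_halvings=20):
    """Try a full step along `direction`; halve the step until the
    objective (e.g. -2LL) is computable and lower than at `params`.

    objective -- hypothetical callable returning a float, or None when
                 the likelihood cannot be computed (e.g. non-PD matrix)
    Returns (new_params, step), or None if no acceptable step is found.
    """
    f0 = objective(params)
    if f0 is None:
        raise ValueError("objective is not computable at the current point")
    step = 1.0
    for _ in range(max_halvings):
        trial = [p + step * d for p, d in zip(params, direction)]
        f = objective(trial)
        if f is not None and f < f0:
            return trial, step
        step *= 0.5  # back out towards the current point
    return None  # even very small steps stay in incomputable territory
```

The final `return None` corresponds to the second limiting factor above: if every shrunken step is still incomputable, the routine has nowhere to go, and restarting from other values is usually the remedy. The sketch also assumes the current point itself is computable, so it says nothing about failures during phase i).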