Model Keeps Running Forever (fails to terminate)

8 replies [Last post]
kzhang's picture
Offline
Joined: 02/29/2012

Hello,

First, thanks for providing such a useful software package!

Second, I am having some trouble getting my model to finish running, and since I have to manually stop the script, I don't get any helpful output that would help me debug. Please find attached my script.

A brief description of what I'm trying to do, since some of the code is tricky (since I need many workarounds to deal with my data and model). I am feeding OpenMx a meta-analyzed correlation matrix (O), with each observed correlation being a correlation between psychological constructs at 2 time points. The objective function to be minimized is a least-squares approach. Matrix B contains 2 parameters I'm trying to estimate (a and e). I then want to take a and e, plug them into EFF, and use that to iteratively calculate the parameters of my gamma matrices. The parameters from these gamma matrices are then plugged into my PSI matrix. Both B and PSI go into the objective function to be minimized here.

I'm unsure what is causing my script to malfunction, but I wonder if the following have to do with it:
1) The circularity of my calculations. Using B to calculate PSI, and having both B and PSI go into the optimizer.
2) When I did some troubleshooting, sometimes it seemed as if the XI matrices were not being treated as matrices (but instead as some other type of object or data frame).
3) During troubleshooting, I also sometimes saw the error: Attempted to set improper value (1,3) to (3,2) matrix. What does this mean?

Thank you very much!

Best regards,

Anne

AttachmentSize
MatSpec022912full.R32.86 KB
mspiegel's picture
Offline
Joined: 07/31/2009
Thank you for posting this

Thank you for posting this script to the forums! I've implemented a cache when evaluating algebras in the mxRun() implementation. Now the pre-optimization portion (the "front-end") of mxRun() finishes in under 3 seconds on this test case.

Additionally, this script uncovered a bug in the interaction of square-brackets inside labels and using the matrix transpose operator. Here's a simple test case of this bug:

A <- mxMatrix('Full', 2, 2, values = c(1,2,3,4),
     byrow = TRUE, name = 'A')
B <- mxAlgebra(A + A, name = 'B')
C <- mxMatrix('Full', 1, 2, labels = c('B[2,2]', 'B[2,1]'),
    byrow = TRUE, name = 'C')
D <- mxAlgebra(t(C), name = 'D')
model <- mxModel('model', A, B, C, D)
mxRun(model)

The square-bracket/transpose bug will be corrected in OpenMx 1.2.1 (the fix has been committed to the source code repository). The cache for evaluating algebras will not be released until the OpenMx 1.3 series. We should be creating a binary pre-release of the 1.3 series soon. Another option in the meantime is to build OpenMx from source.

kzhang's picture
Offline
Joined: 02/29/2012
Thank you so much for

Thank you so much for implementing this! And in such prompt fashion as well!

Best regards,

Anne

Steve's picture
Offline
Joined: 07/30/2009
I took a look at your script.

I took a look at your script. You are right, it is hard to read.

I did not see anything obviously wrong with the model, i.e. circularity. However, given the way you have set it up, there are _a_lot_ of computations per step of optimization. Note that gamma1 requires the calculation of gamma0. Calculation of gamma2 requires the calculation of gamma1 which requires the calculation of gamma0.

I tried stepping through using mxEval() and the problem becomes evident:

> system.time(mxEval(gamma0, model=FullModel, compute=TRUE))
   user  system elapsed 
  0.188   0.000   0.188 
> system.time(mxEval(gamma1, model=FullModel, compute=TRUE))
   user  system elapsed 
  0.502   0.001   0.503 
> system.time(mxEval(gamma2, model=FullModel, compute=TRUE))
   user  system elapsed 
  1.152   0.002   1.154 
> system.time(mxEval(gamma3, model=FullModel, compute=TRUE))
   user  system elapsed 
  2.534   0.005   2.539 
> system.time(mxEval(gamma4, model=FullModel, compute=TRUE))
   user  system elapsed 
  5.095   0.010   5.105 
> system.time(mxEval(gamma5, model=FullModel, compute=TRUE))
   user  system elapsed 
 10.415   0.017  10.432 
> system.time(mxEval(gamma6, model=FullModel, compute=TRUE))
   user  system elapsed 
 20.931   0.038  20.969 

Each gamma takes twice as long as the previous. By my calculations, gamma16 will take 0.18 * 2^16 seconds

> 0.18 * 2^16
[1] 11796.48
> (0.18 * 2^16)/360
[1] 32.768

In other words, I estimate it will take about 32 hours before you can populate your PSI matrix at the starting values. And that's just for one step of the minimization. So, if your computer can be reliably on for a few weeks, you could fit the model the way you have it written.

But, perhaps if we had a better understanding of what you are trying to do we could help design a more efficient way of estimating what you want to know.

Could you give us some guidance here?

It looks like maybe you are attempting to fit a longitudinal multiple group model with 16 groups, while constraining parameters across groups. Is that true? If so, there is a much simpler way to do this. If not, a more detailed explanation of what you want from the model might be helpful.

mspiegel's picture
Offline
Joined: 07/31/2009
I think it is a good idea to

I think it is a good idea to continue the discussion as to whether the script specifies the intended model. My apologies, I don't have any intelligent contributions on that topic. But on a related topic, I may be able to speed-up the initial computation in the front-end. The mxEval() implementation doesn't store any intermediate calculations, which is probably what is causing the exponential performance cost. I should be able to add a cache to mxEval(), which will store the result of intermediate MxMatrix or MxAlgebra calculations. And the current script (again I don't know whether it works as intended) is a perfect test case!

kzhang's picture
Offline
Joined: 02/29/2012
Thank you so much for your

Thank you so much for your reply! Your insight regarding the inner workings of OpenMx was very helpful.

The correlations in my observed matrix (O) are correlations between (behavior problems) at different time points. I have 18 time points. I'm trying to model the underlying patterns of change and stability within this data set.

The B matrix is set up with dimensions representing:
C, P1, E1, P2, E2, P3, E3, ... , P18, E18
- C is the latent "constant" factor
- P(i) is the latent "behavior problems" factor at time i
- E(i) is the latent "environment" factor at time i
And the direction of causality is vertical, such that the e at position B[4,1] represents the influence of the constant factor on behavior problems at time 2. The a at position B[4,2] represents the influence of behavior problems at time 1 on behavior problems at time 2. And so on...

The PSI matrix is designed to assess the variances of latent variables C, P1, E1, P2, E2,... and so on. The trick here is that I'd like to calculate these variances iteratively from estimated e and a. And that's what all the gamma(i)s, XI(i)s, rho(i)s are designed to do, is calculate these iteratively. So while I'd like to get free estimates of e and a, I don't want these variances in the PSI matrix to be estimated freely. Is there a simpler way to do this that wouldn't take so long?

Thank you!

Anne

kzhang's picture
Offline
Joined: 02/29/2012
It might also be helpful for

It might also be helpful for me to attach the original analyses (by Fraley & Roberts; please ignore my highlighting) that I'm trying to emulate.

Again, I really appreciate your input and help!

Anne

AttachmentSize
Fraley & Roberts - meta-analysis on longitudinal stuff.pdf 241.61 KB
Steve's picture
Offline
Joined: 07/30/2009
I will ask a couple of very

I will ask a couple of very basic questions, because I still am unclear about exactly what you want from your analysis.

1. Are you attempting to fit a cross-lagged autoregression model as in Figure 2 of the referenced paper (but with 18 occasions of measurements)?

2. Are you familiar with the path model specification methods used in OpenMx? This could make your life a lot easier!

Also, (WARNING: shamelessly self-serving plug), you might want to look at my chapter on dynamical systems models in the new APA Handbook of Methods published this month. Since the early 90's, I've been working on the problems that Fraley and Roberts describe. You might also look up the "latent difference score" approach by McArdle and colleagues. Surprising that Fraley and Roberts weren't aware of these approaches to estimating stability/dynamics versus reliability.

If you draw a path diagram of the model you want to specify, along with some notes, I'd be happy to help you get this written in OpenMx.

Steve's picture
Offline
Joined: 07/30/2009
A least squares model based on the Fraley&Roberts appendix

I assumed that your covariance matrix was organized as P1, E1, P2, E2, .... etc.

Given that, here is an R script that creates a model using path specifications and then modifies it to do least squares filtering out the missing elements in the observed covariance matrix.

Hopefully this is helpful to you.

AttachmentSize
CrosslagAutocorr.R 4.82 KB