the number of observed statistics in the mixture models

camelia's picture
Joined: 12/26/2010

Hi,
I am fitting a mixture model for a genetic association test that includes sibships of variable sizes, with observed and missing genotypes; the model specification is based on the Acemix2.R script.
I would like to know whether it is possible to obtain the correct number of observed statistics without having to specify it manually in the summary() function.
Also, the mixture models for 4 sibs are very slow (the sibship=4 mixture is a 27-component mixture). Could you give me some suggestions on how to optimize the code so that the scripts run faster?
Thanks,
camelia

camelia's picture
Joined: 12/26/2010

Your suggestions were very helpful, thank you!
camelia

neale's picture
Joined: 07/31/2009

Camelia

You make a good point - OpenMx should automatically detect when the same data are being used in the mxData() commands of different mxModel()s, and score the observed statistics only ONCE. Something for Michael S to look at, methinks.

As far as making them run faster, you could try turning off the standard errors and Hessian calculation with mxOption():

model <- mxOption(model, "Standard Errors", "No")
model <- mxOption(model, "Calculate Hessian", "No")

I would have thought that turning off the Hessian alone would be sufficient, as the standard errors need the Hessian...

Of course if you want the errors, this won't help.

mspiegel's picture
Joined: 07/31/2009

Oh, that's not a bug, it's a feature. I didn't notice that part of the script. The correct solution is to add the mxData() statement to the outer "twinACE" model, and delete the mxData() statements in the MZ and DZ submodels. Data trickles down from parent to child when the child model has not specified a data set.
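
A minimal sketch of that layout, for illustration only. The data frame sibData and its columns are made up, and the MZ and DZ submodels are left empty here; in the real script they would hold the matrices, algebras and fit functions. Only the placement of mxData() is the point:

```r
library(OpenMx)

# Hypothetical stand-in for the real sibship data (made-up columns).
set.seed(1)
sibData <- data.frame(x1 = rnorm(20), x2 = rnorm(20))

# Submodels deliberately contain no mxData() statement.
# (The real script's matrices, algebras and fit functions would go here.)
mzModel <- mxModel("MZ")
dzModel <- mxModel("DZ")

# The data set is attached once, to the outer model only.
twinACE <- mxModel("twinACE", mzModel, dzModel,
                   mxData(observed = sibData, type = "raw"))

# Inspect where the data live: the parent has a data object,
# the children do not carry their own copy.
parentHasData <- !is.null(twinACE$data)
childHasData  <- !is.null(twinACE$MZ$data)
```

Whether the children then pick up the parent's data at run time depends on the inheritance behaviour described above, so it is worth checking summary() on a fitted model in your own OpenMx version.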

neale's picture
Joined: 07/31/2009

Wow, cool, I didn't know that! No doubt it will speed things up a bit not to have scads of copies of the same data.

mspiegel's picture
Joined: 07/31/2009

With regard to improving performance on the Acemix2.R script: OpenMx (currently) does not perform common subexpression elimination, which means that if the same subexpression appears in multiple algebras, it is calculated multiple times. You should pull out common subexpressions into separate algebras. Remember that "%*%" binds tighter than "+", so pull out the "%*%" first. Also, if the script is taking a long time to run, make sure to turn on checkpointing. Here are some common subexpressions in that script:

  1. 0.5 %*% A
  2. A + C
  3. [1] + C
  4. [2] + E
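
As a rough sketch of what pulling these out could look like, here is a univariate toy model in which each of the four subexpressions gets its own named mxAlgebra() and the expected covariances refer to those names. The model name, the algebra names (halfA, AC, halfAC, ACE, expCovMZ, expCovDZ), the starting values and the 1x1 dimensions are all assumptions for illustration, not taken from Acemix2.R:

```r
library(OpenMx)

# Toy univariate ACE pieces; names and starting values are illustrative.
aceSketch <- mxModel("aceSketch",
  mxMatrix("Full", 1, 1, free = TRUE, values = 0.6, name = "a"),
  mxMatrix("Full", 1, 1, free = TRUE, values = 0.5, name = "c"),
  mxMatrix("Full", 1, 1, free = TRUE, values = 0.4, name = "e"),
  mxAlgebra(a %*% t(a), name = "A"),
  mxAlgebra(c %*% t(c), name = "C"),
  mxAlgebra(e %*% t(e), name = "E"),

  # Each common subexpression is defined once and reused by name.
  mxAlgebra(0.5 %*% A, name = "halfA"),   # 1. 0.5 %*% A (A is 1x1 here)
  mxAlgebra(A + C,     name = "AC"),      # 2. A + C
  mxAlgebra(halfA + C, name = "halfAC"),  # 3. [1] + C
  mxAlgebra(AC + E,    name = "ACE"),     # 4. [2] + E

  # Expected covariances built from the shared pieces.
  mxAlgebra(rbind(cbind(ACE, AC),     cbind(AC, ACE)),     name = "expCovMZ"),
  mxAlgebra(rbind(cbind(ACE, halfAC), cbind(halfAC, ACE)), name = "expCovDZ")
)

# Evaluate the algebras at the starting values, without fitting anything.
covMZ <- mxEval(expCovMZ, aceSketch, compute = TRUE)
covDZ <- mxEval(expCovDZ, aceSketch, compute = TRUE)
```

In a 27-component mixture the same few pieces recur across many expected covariance matrices, so naming them once keeps each component's algebra short.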