Currently, is there any documentation on running a mixed effect or nested model? I know several people who are interested, but they need some information to get started. An example perhaps? (even a very simple one)

Good idea. Right now, some mixed effects / nested models can be estimated using definition variables, but the definition variable tutorial example doesn't cover that usage. We should pass that suggestion to the documentation team.

yes, here's my +1 for documentation on how to do simple mixed effects / nested models. That would be brilliant!

Do we have an update on this? :)

I have some twin data where I am looking for mean differences on a continuous variable based on a categorical task that individuals perform in one of two ways. Some twin pairs perform the same way, some don't. This seems really straightforward, but I haven't been able to find a good resource on how to account for twinness in a question like this. I know a little bit about OpenMx but usually use SPSS.

By "mean differences on a continuous variable based on a categorical task" are you trying to show that there are mean differences as a function of the categorical variable? That would be very straightforward to set up with OpenMx using pairwise data input and definition variables for each of the twins which predict the means. This is almost identical to the example in the documentation http://openmx.psyc.virginia.edu/docs/OpenMx/latest/DefinitionMeans_Matri... except that data.def needs to be specified as data.def1 and data.def2 for the twins, and the model should specify different predicted means for the two twins.

However, as I am not sure whether these differences are what you seek, I won't elaborate further.

Yup, that's *exactly* what I'm trying to do! Sorry for the vagueness.

Ok, then I modified the script on the website slightly so that it would do what you want: the same regression for twin 1 and twin 2, but with different definition variables as the categorical indicators. It is attached. You may also want to constrain the variances of twin 1 and twin 2 to be equal (something like labels=c("var","cov","var") in the sigma mxMatrix function call should do the trick). Something similar in the M matrix would make the twins' means equal (apart from when they differ on their categorical indicator).
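For readers without the attachment, here is a minimal sketch of what such a script might look like. It is not the attached file. It assumes a current OpenMx 2.x install (mxExpectationNormal and mxFitFunctionML), simulated stand-in data, and the column names x, y, def1 and def2; the parameter labels meanTwin and bDef are made up for illustration.

```r
library(OpenMx)

# Simulated stand-in data, one row per twin pair (all values are invented).
set.seed(42)
nPairs <- 300
def1   <- rbinom(nPairs, 1, 0.5)                 # twin 1's categorical indicator
def2   <- rbinom(nPairs, 1, 0.5)                 # twin 2's categorical indicator
shared <- rnorm(nPairs)                          # family effect that makes twins correlate
x <- 1 + 0.5 * def1 + shared + rnorm(nPairs)     # twin 1's continuous score
y <- 1 + 0.5 * def2 + shared + rnorm(nPairs)     # twin 2's continuous score
twinData <- data.frame(x, y, def1, def2)

selVars <- c("x", "y")                           # the modelled (dependent) variables

twinModel <- mxModel("TwinDefMeans",
    # 2 x 2 covariance matrix; the repeated label "var" equates the twins' variances
    mxMatrix(type = "Symm", nrow = 2, ncol = 2, free = TRUE,
             values = c(2, 0.5, 2), labels = c("var", "cov", "var"), name = "Sigma"),
    # Baseline means, equated across twins by a shared label
    mxMatrix(type = "Full", nrow = 1, ncol = 2, free = TRUE,
             values = 0, labels = c("meanTwin", "meanTwin"), name = "M"),
    # Regression of the score on the indicator, same coefficient for both twins
    mxMatrix(type = "Full", nrow = 1, ncol = 2, free = TRUE,
             values = 0, labels = c("bDef", "bDef"), name = "beta"),
    # Definition variables: each twin gets his or her own indicator, row by row
    mxMatrix(type = "Full", nrow = 1, ncol = 2, free = FALSE,
             labels = c("data.def1", "data.def2"), name = "def"),
    # Expected means: baseline plus coefficient times indicator (elementwise)
    mxAlgebra(M + beta * def, name = "Mu"),
    mxData(observed = twinData, type = "raw"),
    mxExpectationNormal(covariance = "Sigma", means = "Mu", dimnames = selVars),
    mxFitFunctionML()
)

twinFit <- mxRun(twinModel)
summary(twinFit)
```

With the variances equated and a single free covariance, this has the same structure as a random-intercept model for pairs, and bDef is the estimated mean difference between the two groups.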

PS there is a typo in the figure in the v 0.5.0-1353 release of the documentation - the arrow should go from the triangle to the latent variable D, not the other way around.

MANY many thanks for this -- I'm excited to try it out. However, I just realized that I neglected a piece of critical information -- these are both MZ and DZ twins. To what extent is this going to complicate the code?

Not too seriously. I doubt it would make much difference to ignore this fact, but I don't know of any simulation study that shows this to be the case. Assuming that you have zygosity diagnosed accurately, a two-group approach could be used (akin to what is done for genetic modeling of data), equating all the parameters except the covariance between twins. Alternatively, a definition variable for zygosity (e.g., scored 0/1 for DZ/MZ) could be used to add a deviation to the covariance (sigma[2,1]).
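As a rough sketch of the definition-variable option, the twin covariance can be built as an algebra so that MZ pairs get the DZ covariance plus a deviation. Everything here is an assumption for illustration: the data are simulated, the column name zyg and its 0 = DZ / 1 = MZ coding are hypothetical, and the labels covDZ, covDev, meanTwin and bDef are invented. It assumes OpenMx 2.x syntax.

```r
library(OpenMx)

# Simulated stand-in data, one row per pair (all values are invented).
set.seed(7)
nPairs <- 400
zyg    <- rbinom(nPairs, 1, 0.5)                  # hypothetical coding: 0 = DZ, 1 = MZ
def1   <- rbinom(nPairs, 1, 0.5)
def2   <- rbinom(nPairs, 1, 0.5)
shared <- rnorm(nPairs)
w <- ifelse(zyg == 1, sqrt(0.8), sqrt(0.4))       # MZ pairs share more than DZ pairs
x <- 0.5 * def1 + w * shared + sqrt(1 - w^2) * rnorm(nPairs)
y <- 0.5 * def2 + w * shared + sqrt(1 - w^2) * rnorm(nPairs)
twinData <- data.frame(x, y, def1, def2, zyg)

selVars <- c("x", "y")

zygModel <- mxModel("TwinZygCov",
    # Common variance for both twins
    mxMatrix("Full", 1, 1, free = TRUE, values = 1, labels = "var", name = "varT"),
    # Covariance for DZ pairs, and the extra covariance for MZ pairs
    mxMatrix("Full", 1, 1, free = TRUE, values = 0.3, labels = "covDZ", name = "covBase"),
    mxMatrix("Full", 1, 1, free = TRUE, values = 0, labels = "covDev", name = "covDelta"),
    # Zygosity as a definition variable (not part of selVars)
    mxMatrix("Full", 1, 1, free = FALSE, labels = "data.zyg", name = "zygDef"),
    # sigma[2,1] = covDZ + zyg * covDev
    mxAlgebra(covBase + zygDef * covDelta, name = "twinCov"),
    mxAlgebra(rbind(cbind(varT, twinCov), cbind(twinCov, varT)), name = "Sigma"),
    # Means model as before: baseline plus coefficient times each twin's indicator
    mxMatrix("Full", 1, 2, free = TRUE, values = 0,
             labels = c("meanTwin", "meanTwin"), name = "M"),
    mxMatrix("Full", 1, 2, free = TRUE, values = 0,
             labels = c("bDef", "bDef"), name = "beta"),
    mxMatrix("Full", 1, 2, free = FALSE,
             labels = c("data.def1", "data.def2"), name = "def"),
    mxAlgebra(M + beta * def, name = "Mu"),
    mxData(observed = twinData, type = "raw"),
    mxExpectationNormal(covariance = "Sigma", means = "Mu", dimnames = selVars),
    mxFitFunctionML()
)

zygFit <- mxRun(zygModel)
summary(zygFit)
```

Fixing covDev at zero and comparing the two fits would be one way to check whether ignoring zygosity matters for a given data set.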

If zygosity is unknown or inaccurately diagnosed, a mixture distribution could be used (see Neale M.C.: A finite mixture distribution model for data collected from twins. Twin Res 2003; 6:235-239 http://www.vipbg.vcu.edu/vipbg/neale-articles.shtml#NEALE03).

THANK YOU for these resources. I will likely leave the zygosity question alone for the time being and return to it later.

My apologies for the overly simplistic questions -- I'm still learning R. I have traditionally loaded my data into R as a .csv file, making as many changes and adjustments as possible to the data in its original .csv form before loading it. For this analysis, my file currently has one column defining twin1's membership in a group (0,1 -- called def1), another column with twin1's score on a continuous variable (called number), another column defining twin2's membership in the group (0,1 -- called def2), and a final column with twin2's score on the continuous variable (called numberb). For the purposes of using the documentation you provided, how do I need to rearrange my data to make it work? I'm unclear on how to define the selvars -- specifically, how to put the continuous variable all in one column without changing the "one line per family" format and failing to account for the non-independence of observations.

I am having some difficulty in figuring out your question. If the data are arranged in a dataframe, with rows being twin pairs, and columns being the variables that you have described, then you are "good to go". In the example I posted previously, the 4 variables x, y, def1 and def2 were simply 4 "columns" of the dataframe. It does not matter which order these columns are in (unlike classic Mx where it does); they are tagged by their names, being the selvars and the definition variables referred to by labels data.def1 and data.def2.
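To make the layout concrete, here is a small sketch using the column names from the question (def1, number, def2, numberb). The three data rows are invented, and in practice the data frame would come from something like read.csv on your own file.

```r
# Invented example rows standing in for the real .csv file;
# with a real file this would be: twinData <- read.csv("yourfile.csv")
csvText <- "def1,number,def2,numberb
0,10.2,1,11.5
1,12.1,1,12.8
0,9.7,0,10.1"
twinData <- read.csv(text = csvText)

# One row per twin pair: no reshaping into a single score column is needed.
selVars <- c("number", "numberb")   # the twins' scores, i.e. the modelled variables
defVars <- c("def1", "def2")        # indicators, referred to as data.def1 and data.def2

# Column order is irrelevant; only the names have to match.
stopifnot(all(c(selVars, defVars) %in% names(twinData)))

# Worth checking for missing values in the indicators before fitting.
colSums(is.na(twinData[defVars]))
```

Keeping both twins on the same row is what lets the covariance between number and numberb carry the non-independence within pairs.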

Still, I think I am not understanding the question, but without clarification from others (which I'd hoped would appear over the past few days) this is the best I can manage for now.

If you are having difficulty with the documentation, the main thing is to arrange the data in a well-formed data frame with named variables, so that each variable can be referred to by name.

Hi Pascal,

We are currently working on extending OpenMx to the mixed-effects / dependent data-structures case. It will take some time, however. I would like to hear about the type of models you had in mind.

Just curious if there were updates on these types of models. I'm not thinking about anything too complicated, just including a random effect.