Superior syntax for the R interface?

neale

Hi folks, I am just pasting in here some email discussion between Greg Carey and myself concerning the language interface. In the thread we drifted off to talk about models of assortative mating.

Greg Carey's email:

mike,
just looked at openMx and am worried that things might get unnecessarily complicated when dealing with large multivariate problems.

imagine data on twins and families on the WAIS and a factor model. the proposed GUI is like Amos--you will spend so much time drawing paths from 13 variables to X latent factors for father (including 13 specifics), X latent factors for mother (13 more specifics), for son 1, son 2 ... to daughter k that the actual canvas cannot accommodate the number of necessary symbols. in addition, the 13 x 13 covariance matrix between husband's WAIS and wife's WAIS should be predicted by the model of marital assortment. try to express this in jack's RAM notation--it is possible but the only practical way (that i can see) is to define blocks of the path-coefficient matrix and the exogenous-residual covariance matrix in terms of other, more specific matrices.

anyway, what everyone is doing is perfectly fine--for simple models. for complicated models, i suspect that a different interface is required (one that may, in fact, be accommodated by expansion of the python script, mxParser.py). in terms of a positive contribution, i suggest that the old logic of Mx be followed but in the following way:

(1) define the parameter matrices containing free and/or fixed parameters [e.g., what is typically done in Group 1]
e.g.,
begin matrices;
covA sy 3 fr
D di 3 fi
StdE di 3
WhatEver fu 3 6
end matrices;

(2) parse a series of statements that fix, free, equate, etc. elements of the above matrices. these would include the FI, FR, EQ, PA, MA, SP statements of Mx
e.g.,
begin Whatever_You_Want_to_Call_This_Section;
PA StdE
0 1 1
MA StdE
23.6 4 9.3
MA WhatEver
.3 .8 .7 0 0 0
0 .7 .3 .2 0 0
0 0 0 0 .8 .5
FI WhatEver 1 1 2 1 3 2
end Whatever_You_Want_to_Call_This_Section;

(3) give the algebra (as in the first group in usual Mx Code) but in R syntax
begin algebra
Example = StdE %*% WhatEver %*% t(WhatEver)
end algebra

(4) assign the predicted matrices from (3) to the groups in the model
begin PredictGroups
1 CPre1
2 rbind(cbind(Vp,covDZ),cbind(covDZ,Vp))
end PredictGroups

(5) from the above information, it is possible to construct two different numeric vectors:
xfandf = vector of all free and fixed parameters
xf = compact vector of free parameters,
so do the following in the minimization function passed into NAG E04???
(5.a) take the vector xf passed from E04??? and put the appropriate elements into vector xfandf
(5.b) from vector xfandf, construct the parameter matrices specified in BEGIN MATRICES and BEGIN WHATEVER
(5.c) perform the algebra specified in step (3) above
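
Steps (5.a) and (5.b) can be sketched roughly as below. This is only an illustration of the bookkeeping, not code from Mx or OpenMx: the function names expand_free_parameters and diag_matrix are hypothetical, and the values reuse the StdE pattern and starting values from step (2).

```python
def expand_free_parameters(xf, pattern, fixed_values):
    """Step (5.a): scatter the compact free vector xf into the full vector xfandf.

    pattern[i] is True where element i is free; fixed_values supplies the
    value of every fixed element (free slots are overwritten).
    """
    n_free = sum(1 for is_free in pattern if is_free)
    if len(xf) != n_free:
        raise ValueError("xf has %d elements, pattern has %d free slots" % (len(xf), n_free))
    xfandf = list(fixed_values)
    free_values = iter(xf)
    for i, is_free in enumerate(pattern):
        if is_free:
            xfandf[i] = next(free_values)
    return xfandf


def diag_matrix(values):
    """Step (5.b) for a 'di' matrix: place the values on the diagonal."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


# StdE: "PA StdE / 0 1 1" and "MA StdE / 23.6 4 9.3" from step (2)
pattern = [False, True, True]
start = [23.6, 4.0, 9.3]

# Pretend the optimizer handed back new values for the two free elements.
xfandf = expand_free_parameters([5.0, 8.0], pattern, start)
StdE = diag_matrix(xfandf)
print(StdE)
```

The fixed element (23.6) is untouched on every call, so the optimizer only ever sees the two free values.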

----- CHECK THIS CODE OUT ----------------------------------------------
npheno <- 5
begin matrices
Va sy npheno
sigE di npheno
Re st npheno
eta sy npheno
end matrices

begin matrixStuff
PA eta
1 1 1 0 0
1 1 1 0 0
1 1 1 0 0
0 0 0 1 1
0 0 0 1 1
ST 2.5 DIAG(Va)
ST 0.4 OFFDIAG(Va)
ST 3.0 sigE
end matrixStuff

begin algebra
Vp <- Va + sigE %*% Re %*% sigE
covE <- sigE %*% eta %*% sigE
end algebra

begin predictGroups
1 rbind (cbind(Vp, Va + covE), cbind(Va + covE, Vp))
2 rbind (cbind(Vp, .5*Va + covE), cbind(.5*Va + covE, Vp))
end predictGroups
----- END OF CODE --------------------------------------------------------
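
For anyone who wants to see what the algebra and predictGroups sections above evaluate to, here is a rough plain-Python sketch of the same expressions. It assumes two phenotypes instead of five, and the numbers for Re and eta are made up purely for illustration; the helper names (matmul, add, scale, cbind, rbind, predicted_twin_cov) are not part of Mx or OpenMx.

```python
def matmul(a, b):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Multiply every element of a by the scalar c."""
    return [[c * x for x in row] for row in a]

def cbind(a, b):
    """Join two matrices side by side."""
    return [ra + rb for ra, rb in zip(a, b)]

def rbind(a, b):
    """Stack two matrices vertically."""
    return a + b

def predicted_twin_cov(Va, sigE, Re, eta, rA):
    """Predicted covariance for a twin pair sharing a proportion rA of Va."""
    Vp = add(Va, matmul(matmul(sigE, Re), sigE))    # Vp   <- Va + sigE %*% Re %*% sigE
    covE = matmul(matmul(sigE, eta), sigE)          # covE <- sigE %*% eta %*% sigE
    cross = add(scale(rA, Va), covE)                # rA*Va + covE
    return rbind(cbind(Vp, cross), cbind(cross, Vp))

# Illustrative values only (npheno = 2).
Va = [[2.5, 0.4], [0.4, 2.5]]      # ST 2.5 DIAG(Va), ST 0.4 OFFDIAG(Va)
sigE = [[3.0, 0.0], [0.0, 3.0]]    # ST 3.0 sigE
Re = [[1.0, 0.5], [0.5, 1.0]]      # assumed standardized matrix
eta = [[0.25, 0.0], [0.0, 0.25]]   # assumed values

mz = predicted_twin_cov(Va, sigE, Re, eta, 1.0)   # group 1
dz = predicted_twin_cov(Va, sigE, Re, eta, 0.5)   # group 2
for row in mz:
    print(row)
```

The only difference between the two groups is the coefficient on Va in the off-diagonal block, which is why it is passed in as rA here.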

mike,
bottom line = if you implement this in the existing openMx framework, it will take pages of code. a suitable preprocessor in python, perl, or (if really interested in a potential gui with web interfaces) java can have a user enter the code and have it translated into Mx equivalent using the above rules.

anyway, just trying to help out.
greg

Editorial Note from Mike Neale: the Pattern command, which is rarely used in Mx, simply identifies which elements should be free parameters without actually assigning parameter numbers (labels in OpenMx) to them. It is equivalent to free=c(T,F,T) syntax in OpenMx.
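
As a small illustration of that equivalence, a preprocessor could turn one row of a Pattern statement into the corresponding free= argument along these lines. The function name pattern_to_free is hypothetical, and this is only a sketch of the string translation, not part of mxParser.py.

```python
def pattern_to_free(pattern_line):
    """Translate one row of an Mx PA block, e.g. "0 1 1", into R free= syntax."""
    flags = []
    for token in pattern_line.split():
        if token not in ("0", "1"):
            raise ValueError("Pattern rows may contain only 0 and 1, got %r" % token)
        flags.append("TRUE" if token == "1" else "FALSE")
    return "free=c(" + ",".join(flags) + ")"


# "PA StdE / 0 1 1" from Greg's example
print(pattern_to_free("0 1 1"))  # free=c(FALSE,TRUE,TRUE)
```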

tbates

Indeed: many complex models have lots of paths, and the pathic-interface can't handle some of the kinds of syntax which would express these compactly.

It does have some powerful capabilities already, of course: one line can easily build, for instance 13 paths from a latent to WAIS manifests.

It would be good to have some additional syntax to say "and do this for the same-named variables in each of the groups you find in the supermodel" and some other commands that you would give to a clever automaton with a large drawing board but not much clue about what your scientific question is.

Matrices are a great way of hiding this complexity: nothing like a 5*5 matrix and a couple of 1*1s to make a 100 path monster seem tractable.

Saying that, it would be handy to be able to add and drop paths from matrix models using path syntax, and it seems that this won't be too hard to code (and then share) in R for each of the common matrix metaphors we use.

Solving this for a two group Cholesky might be a nice place to start: it just hides the matrix setting functions (as do the pathic functions), but would make scripts more readable.

neale

Greg

Yes, I have been trying to maintain and retain more compact coding. Note first that OpenMx has retained full matrix algebra specification of models. Second, the path language permits 'drawing' all the paths from one set of variables to another. The default pairs element 1 of list 1 with element 1 of list 2, element 2 of list 1 with element 2 of list 2, and so on (i.e. a diagonal pattern), but this can be changed with all=TRUE for a full Cartesian set. Third, the path language can be wrapped inside regular R loops and so forth in order to set up more complex patterning of paths within a matrix.
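
To make the difference between the two pairing rules concrete, here is a rough sketch of the idea in Python. It is not how OpenMx implements path drawing: the connect function and its all_pairs argument are hypothetical stand-ins, and the sketch only handles lists of equal length in the default case (any recycling of shorter lists is not modelled).

```python
from itertools import product


def connect(from_vars, to_vars, all_pairs=False):
    """Return the (from, to) pairs that would be drawn as paths."""
    if all_pairs:
        # Full Cartesian set: every 'from' variable to every 'to' variable.
        return list(product(from_vars, to_vars))
    if len(from_vars) != len(to_vars):
        raise ValueError("element-wise pairing needs lists of equal length")
    # Default: element 1 with element 1, element 2 with element 2, ...
    return list(zip(from_vars, to_vars))


latents = ["A1", "A2"]
manifests = ["x1", "x2"]

print(connect(latents, manifests))                  # 2 paths, the "diagonal"
print(connect(latents, manifests, all_pairs=True))  # 4 paths, the full set
```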

I suspect, however, that you really want a version of George Vogler's multivariate path analysis. I have been trying to effect this with constraints, such that a subset of matrix elements can be the result of a formula. I'm not sure exactly where we are with that. For a while at least, only scalar elements could be formulated.

The other approach I have been wanting to implement, drawing-wise, was to have an "arrows=0" option which would effectively draw copaths. The subsequent algebra is not that bad: Pearson-Aitken as usual. However, the situation gets more problematic if there are multiple copaths and their order is significant. Spouses of twins are an example: with a single application of the formula I think the spouses remain uncorrelated, whereas if one applies first one copath and then the second, the spouses end up correlated. This latter approach seems more reasonable because, with high correlations between spouses and highly correlated twins, a covariance matrix with no spouse-spouse correlation would not be positive definite.

Personally, I would rather have an R function than a Python script (see thread http://openmx.psyc.virginia.edu/thread/214) to convert Mx formulae to OpenMx formulae. But then beggars can't be choosers!

Jeff

> Personally, I would rather have an R function than a Python script
> (see thread http://openmx.psyc.virginia.edu/thread/214) to convert
> Mx formulae to OpenMx formulae. But then beggars can't be choosers!

I agree. You'll have to ask Michael, but I think doing it using R (or with a C back-end) would be difficult at best. If someone wanted, however, they could wrap up the python with:

http://www.omegahat.org/RSPython/