Thu, 01/21/2010 - 11:38

Suggestion: Add an option to mxPath to accept an MxMatrix object.

E.g.,

VApsn <- mxMatrix(name="VApsn", type="Symm", nrow=5, ncol=5, free=TRUE)

hVApsn <- mxAlgebra(.5 %x% VApsn, name="hVApsn")

mxPath(from=AMZ1, to=AMZ2, arrows=2, matrix=VApsn)

mxPath(from=ADZ1, to=ADZ2, arrows=2, matrix=hVApsn)
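A note on the operator: %x% is R's Kronecker product, and with a scalar (1x1) left operand it simply rescales the matrix, so .5 %x% VApsn halves every element of VApsn. A quick base-R check:

```r
# Kronecker product with a scalar left operand is plain elementwise rescaling.
VA <- matrix(c(2, 1, 1, 3), nrow = 2)
half1 <- 0.5 %x% VA   # Kronecker with a scalar left operand
half2 <- 0.5 * VA     # ordinary scalar multiplication
all(half1 == half2)   # TRUE
```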

Rationale: Imagine a BG twin model with 5 personality traits and 5 types of psychopathology and a complicated causal path model among the phenotypes. With the current OpenMx, the easiest way to code this is the matrix approach, but working out the algebra and implementing it is error prone. An mxPath approach eliminates the algebraic errors, but it is a real pain in the gazuggie to code, is very susceptible to typing errors, and is virtually impossible to grasp by examining the resulting MxPath objects. Allowing an MxMatrix object (which is easy to inspect) to interface with mxPath greatly reduces the probability of error and makes things much easier for the user.

greg

Thanks, all, for the productive discussion. Mike is entirely correct--my motivation was for multivariate models in genetic epidemiology where the pedigree structure can be complicated. Here, the same matrix block (including functions of that block) gets repeated over and over in the A or S matrix for each pair of relatives in the pedigree. Although it is inefficient from the optimizer's perspective, the best technique from a user's perspective is to use a RAM approach and insert the appropriate matrix into the block of, say, A that corresponds to the paths between relative i's variables and relative j's variables. Perhaps, we genetic epidemiologists should follow Tim's suggestion and develop a version of OpenMx with functions that make it easier to use for our problems. As I said before--and as Tim's hack verifies--the coding is not difficult.

There is, however, one last, nagging question. Take matrix VA (additive genetic covariance matrix)

VA <- mxMatrix(name="VA", type="Symm", …)

In the pedigree approach, we need to insert VA in some blocks of S, .5*VA in others (covariance between the As of first degree relatives), .25*VA in others (second degree relatives) and so on. Can you show me how to do this without having the optimizer choke on constraints?

greg

Three points:

1) I think it's pretty likely that each of the subdisciplines within the OpenMx community will each build up a library of helper functions that are specific to the types of problems that they commonly encounter. That's part of the plan--we're building the core of OpenMx to be as general as possible, and letting people implement the helpers for their specific issues on their own. That's also part of why we decided to use R for OpenMx, and why we hope this wiki and forum system will be helpful--it will let people maintain and share those functions.

2) I've been thinking some more about your original request--I think I got distracted from the real point of it by all the matrix/path/matrix conversions. It seems like what you really want is an easier way to construct block MxMatrices and to access them blockwise. Is that right? If so, I think that might be a general enough function to be worth integrating into the core code.

In R, you can access a block of a matrix using selectors, e.g., for a matrix bigMatrix:

bigMatrix[c(1, 2, 3, 5, 21), c("A", "B", "F")]

will return a matrix containing rows 1, 2, 3, 5, and 21, and columns with dimnames A, B, and F. A block can be assigned to the same way:

bigMatrix[c(1, 2, 3), c("A", "B", "F")] <- newMatrix

where newMatrix is a 3x3 matrix.

You can currently assign elements to MxMatrices using @-slot syntax along the lines of:

model@matrices$S@values[1:3, 1:3] <- newValues

But that only populates the values, doesn't constrain equality, and goes around the OpenMx accessor functions, which we'd like to avoid. I'd recommend keeping the []<- sort of syntax, but maybe altering it to make it more appropriate to the multilayer nature of MxMatrix. It's not quite clear to me yet how this should look exactly, but let's continue the discussion and see if we can come up with a clean way to make it work.

Any thoughts from the community on whether this is worthwhile? On how this should look? Any thoughts from the developers on how tricky it would be to implement?

3) On your specific question about constraints: thanks to Michael Spiegel's good work, we can index into MxAlgebras the same way we index into MxMatrices. That means that we can have an algebra calculate the values we want, and insert them the same way we did the raw values.

If you look at my example above, I calculate the .5*VA matrix, and use the same function as I did for matrices.

Behind the scenes, constrainPathsFromMatrix() labels the elements of the S matrix with "hVApsn[i,j]", where i and j are the appropriate row and column of hVApsn. This could be done with any algebra, so you could repeat it with a .25 to get qVApsn and do the same.

In the back-end, VApsn will be computed, then hVApsn will be computed from it, and then the S matrix will be populated with the appropriate values from the newly computed hVApsn. Only VApsn has any free parameters--the rest are computed values calculated from those. This way the number of free parameters is small, the optimizer itself doesn't have to worry about constraints between free parameters, and the constraints can still be maintained.
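The bracket-label convention is easy to generate programmatically. As a sketch (blockLabels is a hypothetical helper, not part of OpenMx), labels of the form "hVApsn[i,j]" for a block could be built like this:

```r
# Build OpenMx-style bracket labels "algebraName[i,j]" for an
# nrow x ncol block, so each labeled element tracks the corresponding
# element of the computed algebra.
blockLabels <- function(algebraName, nrow, ncol) {
  outer(seq_len(nrow), seq_len(ncol),
        function(i, j) paste0(algebraName, "[", i, ",", j, "]"))
}
labs <- blockLabels("hVApsn", 5, 5)
labs[2, 3]  # "hVApsn[2,3]"
```

A vector of these labels could then be assigned as the labels of the corresponding block of S, which is what constrainPathsFromMatrix() does behind the scenes.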

many thanks, guys!

i'll get on to writing some functions that will make OpenMx more user friendly for the genetic epidemiology community.

is there any naming convention that i should adhere to? e.g., should i substitute "gePath" (genetic epidemiology path) for mxPath? haven't gotten to the chapter in the R book on creating packages, but are there naming conventions for packages that use OpenMx?

best,

greg

Sounds like a good idea to prefix the functions you will export with a common string such as "ge". In the OpenMx package all entry-level user functions have the prefix "mx" and advanced-level user functions have the prefix "omx". I have not tested this hypothesis, but I believe once your package uses Namespaces then the R runtime will detect conflicts in the namespaces of two packages. A good technical reference on creating packages in R is the Writing R Extensions manual. But it reads like a stereo manual.

When the OpenMx project is ready to leave the beta stage, we will place the package on CRAN. At that point it will become easy to add new packages to CRAN that have OpenMx listed as a dependency.

I'm not entirely positive, but it looks like you are specifying the 'S' matrix for a RAM style model. Could you accomplish this particular task by writing your own S matrix and using a RAM objective function? I've included an example below. Keep in mind that I believe that submodel1 needs some non-zero values in the S matrix, but I wasn't sure which paths should be turned on.

I will second both mneale's and tbrick's earlier comments. It would be interesting to combine the matrix and path style notations. As Tim pointed out, it is unusual because the matrix "VApsn" should not be included in the eventual model. Otherwise you'd get all the free parameters from "VApsn" and all the free parameters from the "S" matrix. But the algebra "hVApsn" needs to be declared in the model, because we intend to use it in the model.

Interesting, Greg, but I'm not sure I appreciate all of the functionality you're suggesting. The virtue of this approach seems to be the ability to affect free, value and label statements of all of the paths. You can achieve this functionality by referring to the three matrices inside an mxMatrix object, like I've done below:

VApsn <- mxMatrix(...)

path <- mxPath(from=AMZ1, to=AMZ2, arrows=2,
               values=VApsn$values,
               free=VApsn$free,
               label=VApsn$label)

Additionally, mxPath necessitates RAM specification. A model that includes MxPath objects will build the A, S and F matrices, all of which may be viewed like so:

model$A

thanks, ryne

(1) i get errors with the code (e.g., "The 'values' … should be a numeric vector.") i think it should read

path <- mxPath(from=AMZ1, to=AMZ2, arrows=2, all=TRUE,
               values=as.vector(VApsn$values),
               free=as.vector(VApsn$free),
               label=as.vector(VApsn$label),
               lbound=as.vector(VApsn$lbound),
               ubound=as.vector(VApsn$ubound))

(2) mxPath(from=AMZ1, to=AMZ2, arrows=2, matrix=VA) is easier for the user and requires hardly any code changes in mxPath. in general, i think that OpenMx would benefit from taking human factors research seriously and focusing strongly on making things as easy as possible for the user.

(3) there is a problem using this approach with an MxAlgebra object

(4) with 10 phenotypes and, say, MZ twins, there are 20 total phenotypes, 20 additive genotypes, and 20 environmental variables (or 40 if using the C and E model) giving 60 by 60 A and S matrices (or 80 x 80 matrices). i certainly have trouble trying to view and absorb matrices of this size.

greg

Yes, you're right. You do need to turn those matrices into vectors. My apologies.

I suppose my issue is that I don't quite get the functionality you're after. I am all for making the code as parsimonious and user-accessible as possible, but I don't know that adding additional arguments to the core functions is a good way to do that. Your example creates an additional MxMatrix object that isn't included in the covariance algebra, but has to be put in the model. As long as you're doing that, a simpler way to do your model is just to label the paths with the same labels as the matrices.

I'm also not sold that the matrix is necessarily easier. If a matrix works as input to the path statement, then the mxPath statement just recreates the existing matrix. To put it another way, mxPath just creates matrices or parts of matrices. You're proposing a matrix that makes paths that make matrices, and for the matrix->path part to work, the first matrix must be identical to, or a subset of, the final matrix. In your example, you create a symmetric matrix, then use the mxPath statement to create the same symmetric matrix somewhere in the final model. There are easier ways to do that.

On your last point, don't forget that this beta program is still changing. A GUI for drawing paths and a LISREL-type specification are both planned; these will make model specification easier for new users and allow mxPath to make many small matrices, respectively. The model you described will always be big, but there are a variety of ways to make it simpler to look at.

I agree with Greg, this type of syntax would be very helpful to the user. How much it requires in the way of code changes I am not sure. In essence, large chunks of the A and S matrices which are used in a ram specification model would be filled with the contents of other matrices pre-specified by the user.

Since R is pass-by-value, the VApsn etc. matrices in question would not automatically be updated unless they were declared inside the same mxModel. That may be ok.
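A minimal base-R illustration of that pass-by-value point: modifying a matrix inside a function (or through a copy) leaves the caller's object untouched.

```r
# R copies on modification, so the caller's matrix is not updated
# when a function changes its local copy of the argument.
a <- matrix(1:4, nrow = 2)
rescale <- function(m) { m[1, 1] <- 99; m }
b <- rescale(a)
a[1, 1]  # still 1: the original binding is unchanged
b[1, 1]  # 99
```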

Seems like a worthwhile modification in terms of user convenience not to have to equate values, free, label, lbound, ubound individually.

I don't think we should forget about the distinction between MxMatrices and plain old R matrices. The code Greg gave only constrains the matrix of paths to a symmetric structure. At the moment, you can only make those constraints either by giving the parameters the same label or by using the mxConstraint function. Admittedly, these solutions take longer code than Greg's proposal, but they do so in a way that makes the equality constraints explicit, which is very much worth it.

This vector of labels should be 25 labels long, consistent with the 5 by 5 matrix Greg gave. I'm assuming that element 2,1 is equal to element 1,2, and so on. Does anyone see an easier way?

myLabels <- matrix(NA, 5, 5)
diag(myLabels) <- c("c11", "c22", "c33", "c44", "c55")
myLabels[lower.tri(myLabels)] <- c("c12", "c13", "c14", "c15",
                                   "c23", "c24", "c25",
                                   "c34", "c35",
                                   "c45")
myLabels[upper.tri(myLabels)] <- t(myLabels)[upper.tri(myLabels)]
myLabels <- as.vector(myLabels)

mxPath(from=AMZ1, to=AMZ2, arrows=2, all=TRUE,
       free=TRUE,
       labels=myLabels)
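On the "easier way" question: the same symmetric label matrix can be built in one pass with outer and pmin/pmax. A sketch that reproduces the c&lt;row&gt;&lt;col&gt; naming above:

```r
# Symmetric labels "cij" with i <= j, so elements (2,1) and (1,2)
# both get the label "c12", element (i,i) gets "cii", and so on.
idx <- 1:5
labelMat <- outer(idx, idx,
                  function(i, j) paste0("c", pmin(i, j), pmax(i, j)))
myLabels <- as.vector(labelMat)
labelMat[2, 1]  # "c12", same as labelMat[1, 2]
```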

I see two primary issues being raised, which I think are separate.

1) It is difficult to inspect a path model.

This is something that is in process. Suggestions on how we might better be able to display a complex path model so that it's easy to view and absorb are certainly welcome.

If I'm reading this right, the other one is:

2) It should be possible to quickly and easily constrain large chunks of the matrices in the covariance algebra to be equivalent to the corresponding elements of a provided matrix.

I don't know if this is a good fit for the core functionality. There are certainly ways that this could be used, but I'm not sure it's general enough to warrant the additional overhead in the RAM type model. Presumably there would be an equivalent version needed for LISREL models when that model is completed, and it might become cumbersome. Also, the current specification uses an MxMatrix outside of an MxModel, which makes the meanings of things less clear.

I'd suggest directly altering the A and S matrices using R's accessor functions as a "best practices" way of doing this.

That said, one of the nice things about OpenMx is that helper functions are easy to write. So here's a quick hack of one that should work. This version uses the mxPath function directly, so it should continue to work with a LISREL model instead of a RAM model, once LISREL path models are implemented. I should add that it's not clear that this is the best way to do it. (I also use for loops instead of apply functions, which will slow things down somewhat.)

usage example from above:

Greg is essentially promoting (wishing for) an integration of the matrix and path syntaxes. This would be very useful for multivariate genetic epidemiological models. It would make the work of Vogler (1984) http://www3.interscience.wiley.com/journal/110486312/abstract?CRETRY=1&S... easy to implement directly within OpenMx. In turn, this could make such models much more widely used and ease their further development.