Tue, 12/07/2010 - 10:35

In an example, when estimating means or intercepts

mxPath(from="one", to=c("beta0v", "beta1v"), free=TRUE, values=c(20,5), labels=c("MB0v", "MB1v"))

the summary statement produces

free parameters:

  name matrix row    col    Estimate  Std.Error
...
6 MB0v      M   1 beta0v 20.33838913 0.38943505
7 MB1v      M   1 beta1v  2.06270791 0.65587636

However, shouldn't rows and columns be inverted, so that the effect (one-headed arrow) comes from the column (one) and goes into the target variables (beta0v, beta1v) in the rows, rather than vice versa?

I don't know if this is relevant: the reserved word "one" was selected because the means path has a weight of 1. We hope to introduce multiple means paths with different weights in a model. So the weight of 1 is not related to the row number in the M matrix.

I feel like this discussion has not come to an end...

To summarize, we all agree that the M matrix is a vector. For computational reasons it is specified to be 1 x m. The question is, how should M be shown in the summary?

Ryan pointed to the following discussion:

http://openmx.psyc.virginia.edu/thread/115

where Steve said (on 07/30/2009)

"Transpose in the backend or transpose in the user's mind, which is less trouble overall?"

That really sums it up.

Although the 1 x m format was chosen at that time, I would like to be nostalgic and think back to the old Mx, where, if one used the GUI, the expectations for the vector of means and the covariance matrix were, respectively:

Means F * (I - A)~ * M * U;

Covariance F & ((I - A)~ & S);

and that inevitably requires that M be m x 1.
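To make the dimension argument concrete, here is a minimal base-R sketch of the means expectation under both conventions. The two-variable model, its values, and the names `Fm`, `M_col`, and `M_row` are made up for illustration; the trailing U term of the Mx expression is left out because its role is not described in this thread, and none of this is OpenMx's actual backend code.

```r
# Toy RAM model: two manifest variables, no latent variables (illustrative values).
vars <- c("x", "y")
A <- matrix(c(0,   0,
              0.5, 0), nrow = 2, byrow = TRUE,
            dimnames = list(vars, vars))  # path x -> y sits in column x, row y
I <- diag(2)
Fm <- diag(2)                             # filter matrix: every variable is manifest

# Column convention (old Mx): M is m x 1, with "one" as the column.
M_col <- matrix(c(20, 5), ncol = 1, dimnames = list(vars, "one"))
means_col <- Fm %*% solve(I - A) %*% M_col          # m x 1 result

# Row convention (current summary): M is 1 x m, with "one" as the row.
M_row <- t(M_col)
means_row <- M_row %*% t(solve(I - A)) %*% t(Fm)    # 1 x m result

# Same numbers either way; only the orientation differs.
print(means_col)
print(means_row)
all.equal(as.vector(means_row), as.vector(means_col))
```

The row version needs the transposes of the other two matrices to stay conformable, which is the "transpose somewhere" trade-off being discussed.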

Again, this is not to be picky, but to have a coherent heuristic tool when teaching, in which effects always come from the columns and go into the rows (from the triangle called "one", in the column, to the manifest variables, in the rows). The same rule for specifications is used in LISREL in the Parameter Specification portion of the output and in Mplus when using the TECH1 option (not that we want to copy that...).

happy holidays,

To keep us from going in circles, let's review the two things I think we're pretty sure about.

The first is that Paolo is right; the heuristic of paths going from columns to rows doesn't hold for means, and that's a violation of path tracing rules as typically applied. If we go back to the original (McArdle 84) RAM spec, the constant is treated as just another variable, and what we discuss as the means model is the column of the RAM model that corresponds to the constant. There are lots of reasons that we went with a row matrix rather than a column matrix for means, some of them detailed in the thread listed earlier, but it does break RAM.

I'll add that means have always had different rules than covariance in RAM. If you don't place a variance term on the constant, then you have to develop an 'intercepts' rule and treat the constant as a special variable (that is, you can't use the constant as another variable consistent with GLM). If you do place the variance term there, you have to make an exception to path tracing rules that exempts all paths that involve the constant's variance term. Using path tracing rules for means/intercepts has always required some type of asterisk.

The second is that we can't just change the means vector from a row matrix to a column matrix, because it would break every script our users have. OpenMx is past beta testing and version 1.0, and one feature we hope to maintain throughout the full release is backward compatibility whenever possible. I don't know what other options there are; supporting both row and column versions of the means matrices or vectors would be an error- and conformability-checking nightmare, though I'll let Tim expand on that from a backend perspective if he wishes.

Regarding OpenMx as a teaching tool, I actually think the transposition makes OpenMx a better resource. Steve's question was "Transpose in the backend or transpose in the user's mind, which is less trouble overall?" We could transpose in the backend and not tell the user, or we could be upfront about it and bring users closer to their model and its optimization. While the RAM point is very valid, this spec stays true to a central OpenMx ethos: OpenMx does nothing that you don't tell it to do, not even seemingly harmless transpositions.

Here are some comments for the discussion:

Quick aside: having multiple means sources is really just a computational and visualization efficiency. We'll eventually get multiple rows for two reasons:

- Having dozens or hundreds of arrows coming from one source and going to every variable in the model makes messy diagrams.

- Having multiple constants would mean that paths that are the simple (unweighted) sum of free parameters and definition variables could be done without an algebra, which could be easier in some circumstances.

There's no mathematical reason to have multiple constants; whatever k x n means matrix we create will be premultiplied by a 1 x k unit matrix (a row of ones) to make a 1 x n means matrix.
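As a rough base-R illustration of that collapse, assume a hypothetical model with k = 2 constants and n = 3 variables. The second constant's name ("one2"), the variable names, and all values are invented for the example; multiple constants are not an existing OpenMx feature as far as this thread says.

```r
# Hypothetical k x n means matrix: one row per constant, one column per variable.
vars <- c("x1", "x2", "x3")
M_multi <- matrix(c(20, 5, 0,
                     1, 0, 2), nrow = 2, byrow = TRUE,
                  dimnames = list(c("one", "one2"), vars))

# 1 x k unit matrix: every element is 1.
unit <- matrix(1, nrow = 1, ncol = nrow(M_multi))

# Premultiplying sums the rows, leaving a single 1 x n means matrix.
M <- unit %*% M_multi
print(M)
dim(M)
```

The product is just the column sums of the k x n matrix, which is why the extra constants add convenience but no new mathematics.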

This is not a bug in summary--the RAM formulation used by OpenMx generates and uses a row matrix for expected means.

This was a topic of some discussion in developers' meetings, and comes about as a result of two primary decisions.

The first decision was to actively separate the means model from the covariance model in the RAM formulation, rather than include the mean squares as a column in the RAM covariance calculation. That is, one-headed paths from "one" go into a separate matrix M rather than into the A matrix. We did this for consistency with the ML and FIML objective functions, which allow separate (and in the case of ML, optional) matrices for means and covariance, and to make clear the distinction between the means model and the covariance model.

Once that decision was made, we had a separate expected means matrix, which could be treated as either a column matrix or a row matrix. I believe we chose to use a row matrix because a row of means seemed more intuitive. For example, data are generally displayed as a table with one column per variable. The means matrix can be thought of as a summary line underneath that table: one row, one column per variable.
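The "summary line under the data table" picture can be sketched in base R as follows. The data frame and its values are made up, and labelling the row "one" is only an illustrative choice, not what OpenMx does internally.

```r
# A small data table: one column per variable (illustrative values).
d <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

# The means as a 1 x m row matrix, one column per variable.
M <- matrix(colMeans(d), nrow = 1, dimnames = list("one", names(d)))

# Stacking the row of means under the data shows why a row feels natural.
print(rbind(as.matrix(d), M))
```

Seen this way, the columns of M line up with the columns of the data, at the cost of the "from columns, into rows" reading discussed in this thread.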

As always, we welcome feedback from the community on these decisions.

I definitely see the advantages of the former decision.

But the consequence of the latter is that the general heuristic "one-headed arrows (structural weights) come from columns and point into rows" holds for the covariance model but not for the means model - despite the specifications (both path and matrix) and the .dot diagrams.

It appears to me that the current printout suggests that Sewell Wright's tracing rules are to be applied differently for means and covariances.

This is an excellent point. Before I chime in, I'll point you towards the n x 1 vs 1 x n means matrix discussion: http://openmx.psyc.virginia.edu/thread/115

The fundamental problem is that the means matrix is really a vector, and thus can be arbitrarily transposed provided conformability for future matrix operations is maintained. Tracing rules are maintained if it is granted that t(M) is used for optimization, and that M is re-transposed into its original form for printing upon output. This assumption is admittedly not as apparent as it could be. Conceptually, we could consider the row of the M matrix to be NA rather than 1 to fully embrace the means vector as a vector rather than a matrix. I don't think that setting the means matrix row name to NA or transposing the means matrix post-optimization are good general solutions, but they probably merit discussion.
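For discussion purposes, here is a hedged base-R sketch of what "re-transposing for printing" could look like on a parameter table. The data frame below is built by hand from the estimates quoted at the top of this thread; it is not produced by OpenMx here, and relabelling the source as "one" is an assumption made for the example.

```r
# Hand-built stand-in for the free-parameter table (values copied from the thread).
params <- data.frame(
  name     = c("MB0v", "MB1v"),
  matrix   = c("M", "M"),
  row      = c("1", "1"),
  col      = c("beta0v", "beta1v"),
  Estimate = c(20.33838913, 2.06270791),
  stringsAsFactors = FALSE
)

# Display-only transposition for the means matrix:
# the target variable moves to the row, and the constant becomes the column.
is_mean <- params$matrix == "M"
shown <- params
shown$row[is_mean] <- params$col[is_mean]
shown$col[is_mean] <- "one"
print(shown)
```

Only the printed labels change; the estimates and the underlying 1 x m matrix are untouched, which is also why this could confuse users who index M directly.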