Wed, 09/02/2009 - 14:43

How desirable is it that we continue to support matrix and algebra substitution. Matrix substitution will take a 1 x 1 matrix with name 'foo', and then any other matrix with the fixed parameter named 'foo' will get the value stored in the 'foo' matrix. Algebra substitution will take a 1 x 1 matrix with name 'foo', and then any other matrix with the fixed parameter named 'foo' will get the value computed by the 'foo' algebra. This idea is completely orthogonal to constant substitution and free parameter substitution. The target of matrix and algebra substitution is a matrix, while the target of constant substitution and free parameter substitution is an algebra.

The problem is that including matrix and algebra substitution prevents us from making the rule that free parameters, fixed parameters, and named entities all must have non-overlapping names. Personally I find matrix and algebra substitution a bit confusing. Take a look at this example that is in our test suite. Is it intuitive what is happening here?

aMatrix <- mxMatrix(values = matrix(1, 1, 1), name = 'a') bMatrix <- mxMatrix(values = matrix(2, 1, 1), name = 'b') cMatrix <- mxMatrix(values = matrix(3, 1, 1), name = 'c') dMatrix <- mxMatrix(values = matrix(4, 1, 1), name = 'd') eMatrix <- mxMatrix(labels = c('a','b','c','d'), nrow = 2, ncol = 2, byrow = TRUE, name = 'e') model <- mxModel(aMatrix, bMatrix, cMatrix, dMatrix, eMatrix) model <- mxRun(model) expected <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2, byrow = TRUE) omxCheckEquals(mxEval(e, model), expected)

Is there a strong use case for matrix and algebra substitution? Because otherwise I am in favor of eliminating this feature.

Circular dependencies in mxAlgebra expressions are now detected in the latest subversion revision. The following are now illegal:

More goodness: thanks Michael!

Hooray! I'm excited that we've worked out the issues for substitution. These problems existed in the earlier implementation for 1 x 1 matrices, we just ignored them. Steve and I are in agreement about everything, so without objections from the development team, we'll move forward implementing the revised proposal.

Just to repeat myself, Steve's example would be illegal, and so would the following example.

This example is illegal because we will state that A relies on B and B relies on A, even though with the configuration described the example would probably be OK. We're going to use a conservative rule (a rule with no false admissions but with false rejects) because it's simpler to implement, and once we implement A[2,] and A[,] notation we will be thankful for this rule.

No, the supermodel 'foo' has nothing declared locally except models 'model1' and 'model2'. Model1 has an algebra 'b' declared, so any fixed parameter 'b' in model1 is equal to the algebra 'b'. Model2 does not have an algebra 'b' declared, so any fixed parameter 'b' in model2 is just a fixed parameter.

While this case is surprising, it is not logically inconsistent. The most surprising part of it is that 'b' only becomes model1.b and model2.b once the two models are part of a supermodel.

However confusing this might be to first-timers, I see a great deal of power here in being able to defer interpretation based on the overall structure. This is really true about SEM in general. The parts, in many cases, are only fully known if one considers the full structure.

Confusing it could be, powerful it IS. --Yoda

Except we've already seen how matrix substitution can be implemented in R instead of in OpenMx, and Mike Neale has admitted that algebra substitution is only useful if we allow algebras that are not 1 x 1 matrices, which we don't. There is great power in the statement: free parameters, fixed parameters, and named entities all have unique names. It will eliminate a lot of bugs in user programs, and a lot of bugs in the implementation (more complexity == more bugs).

Tim Brick proposed a solution that was adopted at today's developer meeting. Matrix and algebra substitution will be invoked by a label of the form "foo[a,b]" where a and b are integer values. The square brackets are an indicator of matrix or algebra substitution, they are not allowed in free or fixed parameter names. Some day in the future, we have the opportunity to implement "foo[ , ]" which signifies substitution of the entire matrix or algebra, instead of a single element.

Not sure I get this logic. But if I do, I'm not sure I like it.

First, I agree that all named entities, free parameters and fixed parameters should have unique names/labels. But we already are allowing more than one free parameter (if by that you mean something allowed to be estimated which has a unique location in a unique matrix) to have the same label as another free parameter (another location in another matrix).

No one except those who throw things over the fence (i.e., the development team) thinks about free parameters as being "free floating", as in not tied to a particular location in a particular matrix. Some of the development team, I suspect, are using the words "free parameter" to mean an index into the parameter vector that is thrown over the fence. This is technically correct, but I suspect that few outside of the core development team would ever come up with this usage on their own. End users are more likely to think in terms like this, "there are _two_ free parameters, a[1,1] and a[2,2] and I have constrained these two parameters to be equal by calling both of them 'beta'". They would not say, "there is _one_ free parameter 'beta'".

With that in mind, from the users' standpoint we already allow non-unique labeling. I assume we are not giving that up.

A problem I see with the foo[a,b] proposal, if I understand the proposal correctly, is that it becomes an assignment operator: thus once you name an algebra or matrix foo[a,b] you are tying its output to one location in one matrix. This is quite limiting, and I suspect is not something that would be welcomed by the advanced users.

If we are going to do assignment, I think it is probably a good idea to have an actual assignment operator or function. Using a name to do assignment does not seem intuitive to me.

Rish would look more like

or perhaps even a function

This logic could then be applied to submatrices assigning to free parameters, for instance.

Or maybe I just don't understand what's proposed here.

BTW, if others are willing, I am willing to give up fixed parameter substitution from matrices. The only time I see a problem with increased probability of user error is when someone has been equating a free parameter (or several parameters) to be equal to the output of a matrix and they decide that they want to temporarily fix this (or one of these) parameter(s). Do they need to rename it? I'd be fine with saying, "yes, gentleuser, you need to temporarily rename that parameter". Then we could just drop fixed parameters entirely from the "can be constrained to values from matrices/algebras" category. After all, the whole notion of "fixed" is something that cannot change during estimation. So there is no need for it to be part of the calculation within the model.

OK let's review the terms. A named entity is either a MxMatrix, MxAlgebra, MxModel, MxConstraint, or MxObjectiveFunction object. A free parameter consists of all those (row,col) entries in a job that have the same label, such that the entry in the free matrix is TRUE. A fixed parameter consists of all those (row,col) entries in a job that have the same label, such that the entry in the free matrix is FALSE.

Yes, if there are two free parameters with the name "beta" then the development team refers to that as a single free parameter. So do the summary statistics for determining degrees of freedom, but I suspect that is the correct thing to do. Yes, if there are two fixed parameters with the name "beta" then the development team refers to that as a single fixed parameter.

Under the new proposal named entities, free parameters, and fixed parameters cannot have overlapping names. But how to we implement algebra and matrix substitution? Inside a MxMatrix, 'foo', we create a (row, col) entry where the free matrix is FALSE, and the label matrix is "A[3,5]". If A is a matrix, then we are performing matrix substitution (which is silly). If A is an algebra, then we are performing algebra substitution. This is no different than the older system, except that in the old system it was required that A was 1 x 1 matrix or algebra, and matrix or algebra substitution only worked on the (1,1) entry.

One source of confusion is that we are not naming a matrix or an algebra "A[3,5]". We are simply assigning this string to one label in the labels matrix of an MxMatrix object.

mspiegel writes: "Yes, if there are two free parameters with the name "beta" then the development team refers to that as a single free parameter. So do the summary statistics for determining degrees of freedom, but I suspect that is the correct thing to do. Yes, if there are two fixed parameters with the name "beta" then the development team refers to that as a single fixed parameter."

This will be a source of confusion for end users. I realize that it is the right way to think about it from the programmers standpoint, but the fact is, this is not the way it is discussed in any SEM textbook. So, we need to make sure we speak the language of our users, even if we have a private vocabulary that maps more directly onto the program internals. I believe this distinction may have caused some of the confusion in this thread.

My own confusion was that I inferred from the proposal that the source (say the algebra) would be named as its target, rather than the target being labeled as its source.

I can understand the intent of the proposal now. But substitution and equating are subtly different things. Recall that if we equate a fixed parameter to a free parameter by naming them the same, something has to give. If we equate a fixed parameter to another fixed parameter, then something has to give if the two have different values. All of this also applies to the A[3,5] notation if A is an mxMatrix. But now the names aren't necessarily the same. Suppose the row 3, column five element in C is labeled A[3,5] which points to a matrix element whose name is B[3,5] which points to a matrix element whose name is C[3,5]. You fix all elements in one of those matrices. Now what? There is going to be a forest of verification every time any matrix is added or changed.

I guess we are already leaving ourselves open to this problem once we allowed a parameter to refer to a 1x1 mxMatrix, since that mxMatrix element could be labeled as being constrained to be equal to something else.

In any event, I think we are in the same logical territory that we were when the "=beta" logic was shot down. We have taken a non directional graph and turned it into a directional graph.

At the very least, there should be something that clues people into the fact that this is not equating, it is assignment (if that is in fact the case). If it is just equating, then I am coming to the conclusion that fixed parameters should not participate in the equating joy, but should be named uniquely.

Matrix substitution and algebra substitution are a mechanism for assignment, not for equating. And I agree that chains are assignment are annoying, and that circular assignment is bad. What if we revise our revised proposed to require that 'A' always refer to an MxAlgebra, if we have a label to 'A[3,5]'. That would get rid of chaining and circular dependencies. And we can tell the users "you should be using fixed parameters instead of matrix substitution, anyway, so we are doing you a favor."

Well, it doesn't get rid of circular dependencies. Because A could be an algebra that relies on matrix M, but the (2,3) entry of M could refer to the (1,1) entry of A.

Agreed, restricting to algebras does not remove the problem, only make people create an algebra that contains a matrix if they want to assign from a matrix.

If we are going to allow assignment (I was actually in favor of this way back when) then I'd recommend we modify the original assignment proposal to include subscripts: =A or =A[1,1] or =beta as assignment. A or A[1,1] or beta for equality constraint.

This does raise all of the problems that we talked about earlier. Assignment is a bear.

But at least we are distinguishing between assignment and equality constraints.

There is not a need for the "=" notation. What we should be doing is admitting that we are performing equality constraints and then proceed. The alternative is to either not allow chaining (which is limiting), or admit that there is an unspecified order to the assignment operations (which is unsatisfactory).

Assume that A is either a matrix or an algebra, and beta is either a free parameter or fixed parameter. The proposal under discussion refers to the expressions found in the labels matrix. The A[1,1] notation will be referred to as "cell referencing", for lack of a better term.

Two changes must happen in order to implement this type of equality constraint. First of all, in the back-end we currently treat MxMatrices as the bottom of the dependency tree for evaluation. We must modify this implementation such that MxMatrices that contain cell references must first evaluate their references. Second, in the front-end we need some sort of check for circular references. This needs to be implemented for algebras, too. Try the following example, it will infinite loop on you:

My changing the meaning of cell references to be defined as evaluation, we are forbidding circular references between two entities. This means that if A and B are 3 x 3 matrices, and the (3,3) label of A is "B[3,3]" and the (1,1) label of B is "A[1,1]" then we declare the model illegal.

OK. I agree with Michael's proposal. No assignment. Equality constraints is all we are doing. mxMatrices are the bottom of the dependency tree. No chaining. No self-referencing. "Cell referenced equality constraints" or "Index referenced equality constraints" both sound OK to me.

I read this proposal to mean that the following is illegal:

I'm willing to live with these restrictions. Otherwise we have gaping potholes where people can fall in up to the axle.

> A <- mxAlgebra(A, name = 'A')

Should an mxAlgebra be allowed to have its own name appear in its algebra?

nice idea! and future-ready

It seems like matrix substitution could just be done in R. I guess the question is whether there is a case for algebra substitution? Remember you can use algebra substitution only when the source is a 1 x 1 algebra, so to populate an entire matrix with r x c unique entries, you would need r x c unique algebras.

Matrix substitution in R:

Here's another confusing example:

So the value of the model1.c is a 1 x 1 matrix that is the result of computing C + D. The value of model2.c is a 1 x 1 matrix that contains pi. And yet both of these cells are labelled 'b', but one refers to the algebra 'b' and the other to the fixed parameter 'b'.

hi michael,

Is that more about the generic problem of not know what labels are in use where?

Also, one could imagine the case where the goal might be to equate solve for C+D= pi and this would implement it.

Just as coders develop a set of prefixes to remind them of the type of data (i usually stick "b" and "a" in front of boolean and array names), users could contain this confusion themselves by adopting a nomenclature: say prefixing the name of algebras with "alg" and matrices with "mat" or just leave them as is.

Or am I missing something?

In model1, we have an 1 x 1 algebra named 'b'. Therefore any fixed parameters in model1 with name 'b' do not use their own values, they use the value of the 1 x 1 algebra. This is an instance of algebra substitution. In model2, we have no algebra named 'b'. Therefore the fixed parameter 'b' uses its own value. Now try explaining all this to a group of first-time OpenMx users.

But won't fixed parameter b in model2 also have to use the value of the algebra in model1? In which case it seems pretty explicable to me: "everything called b is going to have the same value".

I guess that this just highlights the odd effect of fixed if fixed can mean "fixed to the value of something else, which may be free".

Did that get decided? Maybe need three terms for free= (free|dependent|fixed)

caio,

t

"I said I am the decider, not the explainer" George "OpenMx" Bush :-)

No, the fixed parameter in model2 would use the value of the algebra in model1 if-and-only-if the label said "model1.b". And this is precisely why I want to get rid of algebra and matrix substitution, because it's very easy to build a model that's wrong.

ahh.

I was assuming that everything in the supermodel space with the same label would be equated by default.

I note too that mxModel has independent set to NA by default. Does that affect what happens here?

I think it is useful, but would be much more useful if one could populate entire blocks of matrices with algebras that generate greater than 1x1 results.

The large script I sent you, which uses constraints, could I think be much more succinctly coded if the algebras were to go directly into the elements of the A and S matrices. I coded the script the way I did because I was not at all sure whether it would work to have algebras as paths in mxPath commands.

Further to this, the whole script is a big mess because of assortative mating being a different kind of path (a copath which has arrows at neither end). There is a matrix algebra (the Pearson Aitken formula) for dealing with such paths. I am hoping that we could implement arrows=0 for this purpose. It's a pretty general framework for handling the effects of selection. Assortative mating can be thought of as generating a covariance between variables by a partial sort which does not change the variances. However, a partial sort on say variables A and B has effects on the covariances of i) variables caused by A or B; ii) variables causing A or B; and iii) variables correlated with A or B for whatever reason. This is a big enough item to be part of a grant renewal.

no-arrow paths would be great! good for DF people too.

matrix substitution can be done in R, but it is handy, and explicit to be able to say

The target of matrix and algebra substitution is a matrix, while the target of constant substitution and free parameter substitution is an algebra. This is an example of constant substitution.