Mon, 06/27/2011 - 16:56

Motivated by http://openmx.psyc.virginia.edu/thread/996, Tim (Brick) and I have been talking about how mxPath behaves when arrows=2 and all=TRUE. The current behavior creates all possible paths. The exception is when excludeself is set to TRUE, in which case, as best I can figure, the paths from a variable to itself are created and then ignored (I wasn't at, or don't remember, the meeting for this spec).

There are some good reasons for this, specifically that the 'all' argument doesn't behave any differently regardless of what is put in the 'arrows' argument. However, this is somewhat inconsistent with how we treat full and symmetric matrix objects. Consider 3x3 versions of these two matrix types. Full matrices may be supplied with any number of labels, provided that number evenly divides the total number of elements: in our case, 1, 3, or 9. Symmetric matrices may only be provided with (a) a single label, (b) the number of unique elements (6, in our case) or (c) the full set of elements (9), with appropriate equality constraints. This behavior should be doable in a path statement, both for internal consistency across matrix and path versions of a model and to make 'all=TRUE' more usable.
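To make the 9-versus-6 count concrete, here is a minimal base R sketch. It does not call OpenMx; the "p_i_j" label scheme is made up for illustration only.

```r
# Base R only; the "p_i_j" label scheme is hypothetical.
n <- 3

# Full matrix: every cell gets its own label, n^2 in total.
full_labels <- outer(1:n, 1:n, function(i, j) paste0("p_", i, "_", j))

# Symmetric matrix: cells (i, j) and (j, i) share one label,
# so the smaller index is always written first.
sym_labels <- outer(1:n, 1:n, function(i, j) paste0("p_", pmin(i, j), "_", pmax(i, j)))

n_full <- length(unique(as.vector(full_labels)))  # n^2 = 9
n_sym  <- length(unique(as.vector(sym_labels)))   # n * (n + 1) / 2 = 6
```

The symmetric case is the one a two-headed `all=TRUE` path statement would ideally mirror: six distinct labels, with each off-diagonal label used twice.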

There is at least one major problem with having the behavior of the 'all' argument vary with the value of the arrows argument: the arrows argument accepts vectors. The code below includes a vector of values for the arrows argument, and has very puzzling behavior.

alice <- mxModel("DownTheRabbitHole", type="RAM", manifestVars=letters[1:2], mxPath(from=c("a", "b"), arrows=1:2, all=TRUE))

This code creates:

-a one-headed arrow from a to a

-a two-headed arrow from a to b

-a one-headed arrow from b to a

-a two-headed arrow from b to b
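The four paths listed above can be reproduced in base R as a sketch of the recycling involved. This assumes the crossing is enumerated with 'from' as the slow index, which is what the list suggests; it is not taken from the mxPath source.

```r
vars <- c("a", "b")

# expand.grid varies its first column fastest, so 'to' is listed first
# to make 'from' the slow index.
paths <- expand.grid(to = vars, from = vars, stringsAsFactors = FALSE)

# The arrows vector 1:2 is recycled along the four crossed paths.
paths$arrows <- rep_len(1:2, nrow(paths))

paths <- paths[, c("from", "to", "arrows")]
print(paths)
```

Each row pairs a from/to combination with whichever element of the arrows vector recycling happens to land on, which is why the result looks arbitrary.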

The only reason I can see to support multiple arrows values in an mxPath statement is for automatic model generation, such that you make a list of all 'from' variables in your model, a list of all 'to' variables, etc., and write an entire model in a single mxPath call. Even if you wanted to have two mxPath calls, the obvious split is between one- and two-headed arrows, making this feature of the arrows argument unnecessary. Changing this would affect backwards compatibility, so that should be discussed as well.

Thoughts?

Here are some thoughts on this issue.

# Here I define labels for the covariance.

> Labels.for.Exogenous.Covariances <- paste("Cov", rep(IVs, each=3), IVs, sep="_")

> Labels.for.Exogenous.Covariances

[1] "Cov_COREREL_COREREL" "Cov_COREREL_JOBPAY" "Cov_COREREL_hrs1" "Cov_JOBPAY_COREREL" "Cov_JOBPAY_JOBPAY" "Cov_JOBPAY_hrs1" "Cov_hrs1_COREREL" "Cov_hrs1_JOBPAY" "Cov_hrs1_hrs1"

Notice that the covariance names include the names of variances too (e.g., Cov_COREREL_COREREL).

Then, with an mxPath call such as:

mxPath(from=IVs, to=IVs, arrows=2, all=TRUE, free=TRUE, values=0.5, labels= Labels.for.Exogenous.Covariances, excludeself=TRUE)

it generates

mxPath

@from: 'COREREL', 'COREREL', 'COREREL', 'JOBPAY', 'JOBPAY', 'JOBPAY', 'hrs1', 'hrs1', and 'hrs1'

@to: 'COREREL', 'JOBPAY', and 'hrs1'

@arrows: 2

@values: 0.5

@free: TRUE

@labels: Cov_COREREL_COREREL Cov_COREREL_JOBPAY Cov_COREREL_hrs1 Cov_JOBPAY_COREREL Cov_JOBPAY_JOBPAY Cov_JOBPAY_hrs1 Cov_hrs1_COREREL Cov_hrs1_JOBPAY Cov_hrs1_hrs1

@lbound: NA

@ubound: NA

@excludeself: TRUE

It isn't clear to me whether the way the labels line up will be consistent with the way the covariances line up. In part, I think this is because @from has duplicates and the order in which the variables in 'from' are paired with those in 'to' isn't clear. Actually, Ryne helped me put together the code that (correctly) matches the labels and the covariances (Thanks!). I was looking for a way to automate the correct naming and estimation of covariance terms for an arbitrary number of exogenous variables. The above works, but it isn't intuitive to me.
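One way to check the alignment by hand, sketched in base R: this assumes the paths are paired in the order the printed @from suggests (each 'from' repeated, with 'to' recycled underneath it). That ordering is an assumption read off the output above, not something verified against the mxPath internals.

```r
IVs <- c("COREREL", "JOBPAY", "hrs1")
k <- length(IVs)

# Assumed pairing order: 'from' is the slow index, 'to' is recycled.
from <- rep(IVs, each = k)
to   <- rep(IVs, times = k)

# The labels as built in the post; paste() recycles IVs to length 9.
labels <- paste("Cov", rep(IVs, each = k), IVs, sep = "_")

# Labels rebuilt directly from the assumed pairing.
expected <- paste("Cov", from, to, sep = "_")

aligned <- identical(labels, expected)
print(data.frame(from, to, labels))
```

If `aligned` is TRUE, each label names exactly the pair it sits next to under the assumed ordering.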

What if another mxPath option such as "among" (or a better name) were added that in a simple way produced all possible double headed arrows for all i and j where i != j? Then, perhaps no changes are necessary (and thus backwards compatibility isn't an issue) and there would simply be a new way of accomplishing the goal, which I believe is a common one, of correlations among exogenous variables. It could work like:

mxPath(among=IVs, free=TRUE, values=0.5, labels=Labels.for.Exogenous.Covariances)

which would mean that among all variables specified in the vector IVs there would be double headed arrows among all of them. There would be no need to specify from, to, all, arrows, or excludeself parameters (since the "among" parameter option would have a very specific purpose). On the back end, the ordering of the double headed arrows could be specified via combn(object, 2) as:

> IVs

[1] "COREREL" "JOBPAY" "hrs1"

> combn(IVs, 2)

[,1] [,2] [,3]

[1,] "COREREL" "COREREL" "JOBPAY"

[2,] "JOBPAY" "hrs1" "hrs1"

so that the order in which the covariances were fed to mxRun would be easily understood by the user (which would be by column). Then, making labels would be easy, since the ordering of the covariances would be well understood.
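As a sketch of how labels would follow from that ordering, the columns of combn() can be pasted together directly. The `among` argument is only a proposal here, so the code below uses base R alone.

```r
IVs <- c("COREREL", "JOBPAY", "hrs1")

# One column per unordered pair, i != j.
pairs <- combn(IVs, 2)

# Labels follow the column order of combn().
cov_labels <- paste("Cov", pairs[1, ], pairs[2, ], sep = "_")
print(cov_labels)
# "Cov_COREREL_JOBPAY" "Cov_COREREL_hrs1" "Cov_JOBPAY_hrs1"

# Not run here: with OpenMx loaded, pairs[1, ] and pairs[2, ] could be
# passed as from= and to= (with arrows=2 and labels=cov_labels) to get
# the same set of covariances using the existing arguments.
```

No variance labels appear, since combn() never pairs a variable with itself.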

Anyway, just a thought. I appreciate your work on OpenMx, as I know many others do too.

Ken

I haven't heard of any case where users would want all=T to create duplicate paths.

I think we should define the behavior of all=T to discard duplicate paths.

If we define this behavior as working only when "to" is missing, I doubt it will ever confuse anyone, and will help nearly everyone who uses it. Anyone who wants the old behavior can just supply the same list to from and to and request all.

I doubt anyone will use a vector of arrows AND all=T. But if they do, then if any arrow is 2, we could remove duplicate paths.

One solution would be to create all the paths including duplicates as it does now, then post-hoc remove anything with 2 arrows which is not a unique combination.
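A minimal base R sketch of that post-hoc approach, assuming paths are held in a data frame with from, to and arrows columns. That layout is made up for illustration and is not how OpenMx stores paths.

```r
vars <- c("a", "b", "c")

# Step 1: create every crossing, duplicates included.
paths <- expand.grid(to = vars, from = vars, stringsAsFactors = FALSE)
paths$arrows <- 2L

# Step 2: build an order-free key so that a<->b and b<->a match.
first  <- ifelse(paths$from <= paths$to, paths$from, paths$to)
second <- ifelse(paths$from <= paths$to, paths$to, paths$from)
key <- paste(paths$arrows, first, second, sep = "|")

# Step 3: drop two-headed paths whose unordered pair was already seen.
# One-headed paths are never flagged, since a->b and b->a differ.
dup <- paths$arrows == 2L & duplicated(key)
unique_paths <- paths[!dup, c("from", "to", "arrows")]
print(unique_paths)
```

For three variables this keeps six of the nine paths: three variances and three covariances.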

I'm just focusing on the main use case, which is generating all the covariances among a list of variables in one easy call, excluding variances if requested. Making that work will be good.

This behavior from mxPath is the R-ish way to do it; each argument takes a vector or a scalar, and if there aren't enough elements, it repeats whatever it does have to fill in the rest. This is the way I (and likely, other R users) would expect it to work.

I agree that the combination with `all=TRUE` is potentially a bit confusing, but I don't think that warrants its removal. The best way to deal with it is just not to use it if you don't like it. If you're using `all=TRUE`, split the one- and two-headed arrows into different statements and use a scalar. It doesn't seem to confuse users who aren't interested in using it, so I see no real reason to change it there.

I also agree, though, that it makes things very difficult if we want to make a special case for arrows=2. If this use case is extremely rare, I could see sacrificing it for a simpler symmetric-arrow interface, although I would be fairly hesitant to do so.

The clobbering behavior is discussed in http://openmx.psyc.virginia.edu/thread/850.

So, users: Does anyone pass vectors into the `arrows` argument, or is that an unused feature?

Can anyone think of a way to simplify the symmetric arrows case without giving up the ability to pass `arrows` a vector?

The `arrows` question could be resolved by throwing an error when `all = TRUE` and `arrows` is not of length 1.

This seems very reasonable.

We say all=TRUE requires that all paths have the same number of arrows. Otherwise, the user can use rep() statements to do all crossings themselves, and assign the appropriate number of arrows to each one.
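A rough sketch of what that rule and the rep() workaround might look like. `check_arrows()` is a hypothetical helper written for this example; it is not an OpenMx function and the error message is invented.

```r
# Hypothetical helper, not part of OpenMx: reject a vector of arrows
# whenever all = TRUE.
check_arrows <- function(arrows, all = FALSE) {
  if (all && length(arrows) != 1L) {
    stop("when all = TRUE, 'arrows' must be a single value")
  }
  invisible(arrows)
}

# The manual alternative: do the crossing with rep() and choose the
# number of arrows for each path explicitly.
vars   <- c("a", "b")
from   <- rep(vars, each = length(vars))   # a, a, b, b
to     <- rep(vars, times = length(vars))  # a, b, a, b
arrows <- c(2, 1, 1, 2)                    # one value per path, set by the user
```

Calling `check_arrows(1:2, all = TRUE)` stops with an error, while `check_arrows(2, all = TRUE)` passes. The crossed `from`, `to` and `arrows` vectors would then go into a path statement without all=TRUE.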

Sounds good to me.

I think there are examples in the recommended packages in R where parameters are not vectorised, and I am sure there are situations where R will only recycle an integral number of times. So I think it would be perfectly acceptable to say that arrows is a scalar parameter, and that if you supply a vector it gives a warning and says it is only using the first value (similar to the behaviour of if(...)).