Mon, 09/28/2009 - 19:30

Just wanted to get some discussion going.

The best method I have seen for diagramming multilevel SEM appears in Mehta and Neale (2005). Their method is very compact, and maps completely onto the matrix expression for two-level models. Random slopes are represented as latent variables with definition variables as loadings. The method has the further benefit of being "collapsible" in that variables treated in a redundant manner can be collapsed into single variables without becoming inconsistent with the matrix expression.
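To make the "definition variables as loadings" idea concrete, here is a minimal sketch of the model-implied covariance for one cluster with a random intercept and a random slope. It is my own illustration, not code from Mehta and Neale. The names `implied_cov`, `psi`, and `theta` are hypothetical, and the sketch assumes the usual expression Sigma_j = Lambda_j Psi Lambda_j' + theta I, where row i of Lambda_j is (1, x_ij).

```python
def implied_cov(x, psi, theta):
    """Model-implied covariance of the responses in one cluster.

    x     : predictor values for the units in the cluster; these play the
            role of definition variables placed in the slope loadings
    psi   : 2x2 covariance matrix of (random intercept, random slope)
    theta : residual variance, assumed equal and uncorrelated across units
    """
    # Loading matrix: 1 on the intercept factor, x_i on the slope factor.
    lam = [[1.0, xi] for xi in x]
    n = len(x)
    sigma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            # (Lambda Psi Lambda')[i][k]
            s = sum(lam[i][a] * psi[a][b] * lam[k][b]
                    for a in range(2) for b in range(2))
            # Residual variance only on the diagonal.
            sigma[i][k] = s + (theta if i == k else 0.0)
    return sigma


# Hypothetical values, chosen only for illustration.
sigma = implied_cov(x=[0.0, 1.0, 2.0],
                    psi=[[1.0, 0.2], [0.2, 0.5]],
                    theta=1.0)
```

Because the loadings differ from cluster to cluster, each cluster gets its own implied covariance matrix, which is why the diagram needs a definition variable on the path rather than a single fixed or free value.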

This method works well for 2-level models with manifest predictors, but if we wish to diagram a random slope for a latent predictor, there's nothing concrete to "put" in the loadings, so the method is not as general as one would hope. Other traditions represent multilevel SEM with distinct "within" and "between" submodels, and mark random coefficients in the "within" model with darkened circles, corresponding to latent variables in the "between" model. This notation is compact, and complete to those who know how to interpret the diagrams, but the mapping between matrix expression and diagram is not as 1-to-1 as for simpler models.

Another problem area involves diagramming latent variable interactions. I take it we don't want arrows pointing at other arrows (although arrows pointing at random slopes are okay if the slopes are represented as latent variables), but neither do we want products of latent variables in the diagrams (or do we?).

About a year ago Tim Brick and I made a graphics proposal and passed it around.

We were pushing at the boundaries of how one might think about multilevel and multigroup models. For instance, I believe that there is a difference in complexity of thought associated with models represented in Steve Raudenbush's HLM formulation compared with other matrix formulations. In my opinion, this is one reason that HLM caught on even though the program itself is absurdly difficult to use.

So, with that in mind, we tried to come up with a representation that included grouping variables, equality constraints and a way of thinking about a parameter at one level being treated as a variable at another level. We also wanted to include mxAlgebras and mxConstraints on the diagram.

So, we need a way to represent (at least) the following:

1. Latent variables, manifest variables, constants, regression coefficients, covariances, variances, means (probably using the current RAM heuristic).

2. Definition variables.

3. Fixed versus free paths.

4. Equality constraints between paths and named entities.

5. Higher-order levels (i.e., grouped levels in a multilevel sense).

6. Variables at grouped levels that are paths (parameters) at lower (ungrouped) levels.

7. mxMatrices not in the path diagram.

8. mxAlgebras.

9. mxConstraints.

Given these requirements, Tim and I came up with the following representation, which I have modified:

The N=100 seems a bit problematic to me: what if there are 10 groups and they vary in sample size? Or are there 10 copies of the left-hand figure, hidden behind it? In the latter case, perhaps b1|G=1, b1|G=2, etc. would denote which group is being diagrammed. This might be a bit more explicit and permit some additional flexibility, but it would not generalize well to hundreds of groups.

Some delineation between what is fixed and what is free would help a lot (in Mx1, fixed elements were by default shown in red text and free elements in blue).

Overall, it's very good, though. What about representing a) threshold variables, and b) mixture distributions? The absence of these features in Mx1 stifled the GUI's popularity considerably. In my experience, any time you can't do something with the GUI that you can do with the script, the GUI takes a big popularity hit.

This way of setting up the diagram was designed to represent a multilevel rather than a generalized multigroup model. That is to say, in a multilevel model one does not expect the structure or constraints in the level-one model to change based on the individual. Instead, the definition variable is treated as a grouping variable to allow one or more parameters to vary by individual.

A multigroup version of this would likely be better represented by a hierarchical model as we have been doing in the BG type models. But this thread is about multilevel SEM, so I'm trying to address that topic here. We probably ought to start threads for multigroup, categorical, and mixture distribution diagrams.

While color is useful and I am certainly not opposed to its use, it cannot in general be used in publication.

I somewhat suspect that there is a role for being able to map a latent variable onto a path. Technically, this might represent computing the likelihood by integrating over all values of the moderating latent variable (MLV), conditional on the data and the values of parameters of the model which relate to the MLV.
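As a rough sketch of what "integrating over all values of the MLV" could mean computationally, here is a toy example. Everything in it is an assumption made for illustration: the model y = (b + g*m)*x + e with m ~ N(0, 1) and e ~ N(0, resid_var), the function names, and the use of a simple midpoint rule. The predictor x is manifest here, so a closed form exists to check against; with a latent predictor there would generally be no such shortcut.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def marginal_density(y, x, b, g, resid_var, n_points=2000, limit=8.0):
    """Density of y given x, integrating numerically over the moderator m.

    Assumed toy model: y = (b + g*m)*x + e, m ~ N(0, 1), e ~ N(0, resid_var).
    The integral over m is approximated by a midpoint rule on [-limit, limit].
    """
    step = 2.0 * limit / n_points
    total = 0.0
    for k in range(n_points):
        m = -limit + (k + 0.5) * step  # midpoint of the k-th interval
        # Conditional density of y given m, weighted by the density of m.
        total += normal_pdf(y, (b + g * m) * x, resid_var) * normal_pdf(m, 0.0, 1.0) * step
    return total


def closed_form_density(y, x, b, g, resid_var):
    """Same density in closed form: y | x ~ N(b*x, g^2 * x^2 + resid_var)."""
    return normal_pdf(y, b * x, g * g * x * x + resid_var)


# Hypothetical parameter values, for illustration only.
numeric = marginal_density(y=1.0, x=1.5, b=0.5, g=0.4, resid_var=1.0)
exact = closed_form_density(y=1.0, x=1.5, b=0.5, g=0.4, resid_var=1.0)
```

The log-likelihood for a sample would be the sum of the logs of such densities over observations. A real implementation would likely use a better quadrature rule than this one.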

We also had a pretty long pre-website discussion on the topic of what the graphical interface should look like for truly multilevel models. I personally wasn't 100% happy with the result, but it was heading in useful directions. It would be good to get that thread up here as a thread in and of itself. Or perhaps to reconstruct it below...