## So many AICs, so little time. What's a summary() to do?

20 replies [Last post]
Offline
Joined: 07/30/2009

There has been an ongoing discussion on both the optimizer forum and the developer's mailing list about which AIC to use. This thread is here to gather the discussion in one place.

Since the major SEM packages differ in the way they calculate AIC, I proposed we print something like:

AIC (Mx)     = -123456
AIC (Amos)   =  654321
AIC (MPlus)  =  234567
AIC (LISREL) =  132435

Let the AIC discussions begin!

Offline
Joined: 07/31/2009
Documentation for

Documentation for SaturatedLikelihoodExplicitly?

Hi all: just trying to run summary with an explicit likelihood...

# this doesn't work...
oldLL = summary(oldFit)\$Minus2LogLikelihood
summary(fit2, SaturatedLikelihoodExplicitly= oldLL)
# I guess it wants this...
summary(fit2, SaturatedLikelihoodExplicitly= T, ...)

But then where/how to pass in the df and LL of the comparison model?

I thought this would be nice
> summary(bivCorFitSub, SaturatedLikelihoodExplicitly=oldFit)
# but no luck
#Error in likelihood - saturated : non-numeric argument to binary operator

Offline
Joined: 07/31/2009
Sorry there's a rendering bug

Sorry there's a rendering bug in the summary documentation. I'll fix that. The SaturatedLikelihood argument takes a numeric value.

summary(fit2, SaturatedLikelihood = 12345.67)

Offline
Joined: 07/31/2009
Thanks Mike: Docs did look a

Thanks Mike: Docs did look a bit funny, but I hadn't cracked it open to look.

Would it make sense to let the function know the change in df as well so not only saturated models can be compared (inferring the estimated parameters as all) but also poly-unsaturated models :-)

In that case, would it make sense to let the arguments be either

summary(fit2, SatModel = aModel, SaturatedLikelihood = n, Satdf= n)

Where either SaturatedModel or both both likelihood and df are given?

Offline
Joined: 07/31/2009
Didn't Hermine write helper

Didn't Hermine write helper functions to do these things?

Offline
Joined: 07/31/2009
i guess it falls so naturally

i guess it falls so naturally out of the saturated case, that it seems very natural to cope with this slight extension in summary...

Offline
Joined: 07/31/2009
It does depend a bit on the

It does depend a bit on the function used. With Covariance matrix (+/- vector of means) input the saturated model likelihood is very cheap to compute. With raw data input, it can be very expensive. I think we should provide the statistics in summary() when they are cheap. I am supportive of the idea of telling summary() the saturated likelihood and df (or optionally a fitted model object) for the purposes of evaluating overall fit, which is a Good Thing to do, imo.

Helper functions can be great of course, and some of this functionality can be found there. However, a clever summary() may be of greatest benefit to neophytes who will not usually be aware of the capabilities of helper functions. Perhaps it would be worth making a bit of the wiki list available helper functions, which would go some of the way towards this educational goal. Summary itself might even suggest where to go for model comparison functions...

Offline
Joined: 07/31/2009
That is all very good mike,

That is all very good mike, in which case:
1. Summary could default to computing the saturated LL/df when running on cov data, perhaps putting something like "provide LL and df for saturated model" into the currently empty output slots when running on raw data to aid discovery.

2. Summary could take a fitted model (from which the LL and df are to be extracted) or a parameter called df, which defaults to the saturated value when NA

3. Might add something like "extended reporting functions at tinyurl"

Offline
Joined: 07/31/2009
Just regarding the summary,

Just regarding the summary, where labels==NA, would it be good instead of reporting "NA", to use the dimnames?

i.e., given:
mxMatrix("Full", 1,1,T, dimnames=list("ß", c("x","y)", 0, name="beta"),

This:

   name matrix row col   Estimate  Std.Error
6  NA   beta   1   1        0.96     0.04
7  NA   beta   1   2      1.97       0.043

Could be this:
   name matrix row col   Estimate  Std.Error
6  ßx   beta   1   1         0.96        0.04
7  ßy   beta   1   2         1.97        0.043

Offline
Joined: 07/31/2009
The proposal to use dimnames

The proposal to use dimnames instead of labels when the name is NA is misleading to the users. A label of NA literally means "this parameter has no name, and therefore no equality constraints can be assigned to this parameter." To use {\beta}{x} would confuse the user into thinking the name of the parameter was {\beta}{x}.

Offline
Joined: 07/31/2009
Agreed it would be bad to

Agreed it would be bad to mislead people about what is a label and what is just a helpful name (in which case the column should not be called "name" as it is now, which may have prompted my comment.

If summary is just a dump of free cells with their label, matrix and r/c address, then that's what it is now, but there is more information that could usefully be presented in the summary:

One possibility would be another column for names (as opposed to labels)

 name matrix row col label Estimate Std.Error gamma z.A 1 1 allmeans 5.05 0.065 ß x z.A 2 1 m[1,2] 1.98 0.092 ß y z.A 3 1 femE 1.11 0.094 kappa z.A 2 2 NA 5.12 0.024
Offline
Joined: 07/30/2009

How about putting the dimnames in the row and col columns instead of the row and column numbers? Thus in your original example, (correcting syntax and removing non ASCII characters),

mxMatrix("Full", 1,1,T, dimnames=list("b", c("x","y")), 0, name="beta"),

label matrix row col   Estimate  Std.Error
6   NA   beta   b   x        0.96     0.04
7   NA   beta   b   y        1.97     0.043

Offline
Joined: 07/31/2009
Steve's suggested has been

Steve's suggested has been implemented in summary(). Use options('mxShowDimnames'=FALSE) to disable the printing of dimnames.

Offline
Joined: 08/04/2009
AIC is chi_square - 2(df), as

AIC is chi_square - 2(df), as I recall.

David

Offline
Joined: 07/31/2009

I was thinking about how to get the summary to show the significance of changes between models by letting the user include a comparison model that summary() could extract relevant data from and compute the significance of differences or just display the comparison values where these are available as in AIC.

I guess something like

summary(model1, model2)

Offline
Joined: 07/31/2009
summary(object=MxModel,

summary(object=MxModel, comparisonLL = nnn) proposal

AIC usually compares two models, which requires the ability to specify both the -2LL of a second model.

It would make summary a LOT more useful, then, if the user could optionally specify that comparison -2LL and have it taken into account when computing AIC/BIC etc.

i.e.,:

summary(object=MxModel, comparisonLL = nnn)

Also did the idea of computing the NULL model -2LL dropped off the Radar?

I think that would be valuable, and might be free in terms of time given that most user's have more processors than we can use at present when evaluating a given model.

Offline
Joined: 07/31/2009
Running

Running UnivariateTwinAnalysis_PathRaw.R
neither the supermodel nor the submodels report fit

> summary(dzModel)
bmi1            bmi2
Min.   :18.98   Min.   :19.23
...
Observed statistics:
Estimated parameters:
Degrees of freedom:
AIC:
BIC:
RMSEA:

> summary(twinAEModel)
Observed statistics:
Estimated parameters:
Degrees of freedom:
AIC:
BIC:
RMSEA:  

Offline
Joined: 07/31/2009
with, as MikeN suggested, the

with, as MikeN suggested, the formula... perhaps in the brackets

AIC (Mx: χ²- df )   = -123456
AIC (Amos ???)      =  654321
AIC (MPlus ???)     =  234567
AIC (LISREL: χ²+k?) =  132435

Offline
Joined: 02/04/2010
Different AICs

To my knowledge, the AIC is calculated as

OpenMx: chi^2 - 2*df [ref.3]
EQS: chi^2 - 2df [ref.4]
AMOS: chi^2 + 2*df [ref.4]

Onyx: -2LL + 2*k [ref.3]
R (lm) : -2LL + 2*k [ref.1]
Mplus: -2LL + 2*k [ref.2]

with -2LL being the minus two log-likelihood, chi^2 being log-likelihood ratio of the saturated model and the candidate model, k being the number of freely estimate parameters in the candidate model, df being the difference of freely estimated parameters in the saturated model and the candidate model.

Differences in AIC come out the same, either way.

I would really prefer the later calculation because
- it is consistent with R
- the chi^2-based AIC is the same as the -2LL-based AIC with an added constant that is dependent on the saturated model's properties only and drops out anyway, when comparing AIC differences.

[1] R documentation of AIC {stats}
[2] Technical Appendices of Mplus
[3] Personal experience
[4] Randall and Schumacker - A Beginner's Guide …

Offline
Joined: 07/31/2009
I like the k-based AICs

I like the k-based AICs better than the df-based ones too. But I'm not sure how the other developers feel. This way we can return AIC for FIML models without having to run the saturated model.

Offline
Joined: 07/31/2009
No problem with reporting both

There is rationale to using the number of raw observations (statistics). In part this is because the number of parameters may legitimately exceed k(k+1)/2 + k (if means and variances and covariances of k variables) without there being a negative df. Another issue is that with missing data, one might have a dataset in which every person is measured on only one occasion, although 100 occasions have measurements. In this case, all the (100*101/2) covariances are missing, so the k-based statistic would be badly misleading. Some acceptance that the analysis of missing data is not so straightforward seems to be needed. Hence I propose making both available.