Fri, 09/16/2011 - 18:18

At the developers' meeting on 9/16 we discussed the following:

- The group updated the status of the effort to parallelize the FIML objective function. Several issues exist:
  - Rmpi uses point-to-point communication when it should not. As a result, it is less efficient than it should be.
  - MPI is a library, not an extension, so we need to link against it. Doing so requires a 300-line script to determine which implementation of MPI is running on the host machine. Each implementation of MPI provides a compiler wrapper that would allow us to link against the library with two lines of code if we were allowed to change the C compiler used in R. The group has asked the R developers to change this, but they are resistant.

- The group discussed the status of Swift. Tim Armstrong is currently setting up several OpenMx benchmarks to run with Swift on a 500-1,000+ core Cray environment. The benchmarks will be scaled upwards via (1) the number of models, (2) the size of the data set, and (3) the number of free parameters. So far, much of the overhead in the benchmarks has been observed to come from marshaling and unmarshaling arguments. The group agreed that there is room to reduce this overhead.
- Mike Hunter is debugging the LISREL front end and making progress.
- The group discussed adding a weighted least squares objective function to the back end. It is believed that this would be doable, and that it would be made easier by several R examples that make the mathematics involved very clear. Such examples would serve as a reference and provide the group with timing numbers to compare the two approaches.
- Changes are being made to the OpenMx documentation to improve the readability of the examples. The group also decided to add a place on the website for a beta version of the user guide that would describe the functionality of the current beta release of OpenMx.
- Additional information is being added to the warning and error messages for mxMatrices. The new information will specify the exact call to mxMatrix that caused the warning or error. This is particularly useful for OpenMx users who write matrix models in the monolithic style where many different calls are evaluated at once. The new functionality will allow users to see exactly which calls are causing errors or warnings.
- The function omxCheckWarning is being developed and will soon be added to the trunk. It will allow developers to check whether the appropriate warning is produced in OpenMx test cases. It will mimic the behavior of omxCheckError, but for warnings.
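
As a rough illustration of the idea behind a warning check like the one described in the last item, the base R sketch below captures any warnings raised while an expression is evaluated and stops if the expected message is not among them. The helper name checkWarningSketch and its argument names are hypothetical. This is not the OpenMx implementation, only one way such a check could be written.

```r
# Hypothetical helper (not the OpenMx implementation): evaluate `expression`
# and verify that it raises a warning whose message equals `message`.
checkWarningSketch <- function(expression, message) {
  caught <- character(0)
  withCallingHandlers(
    expression,  # the lazily evaluated argument is forced here
    warning = function(w) {
      caught <<- c(caught, conditionMessage(w))  # record the warning text
      invokeRestart("muffleWarning")             # keep it off the console
    }
  )
  if (length(caught) == 0) {
    stop("No warning was observed")
  }
  if (!(message %in% caught)) {
    stop("Expected warning '", message, "' but observed: ",
         paste(caught, collapse = "; "))
  }
  invisible(TRUE)
}

# Usage: passes silently because the expected warning is raised.
checkWarningSketch(warning("matrix is not symmetric"), "matrix is not symmetric")
```

The sketch uses withCallingHandlers rather than tryCatch so that evaluation continues after a warning and every warning from the expression is recorded, not just the first one.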

According to my current level of understanding, there are several kinds of least squares objective functions.

Generalized least squares (GLS; AKA generalized weighted least squares, GWLS) involves F_GWLS = 0.5*tr(G%*%G) = 0.5*sum(G*t(G)), where G = solve(W)%*%(S-C), with S the observed covariance matrix, C the expected (model-implied) covariance matrix, and W a weight matrix.

Unweighted least squares is GLS with W set to the identity matrix, I.

Normal theory generalized least squares is GLS with W set to S.

Normal theory maximum likelihood is asymptotically (i.e., as the number of rows goes to infinity) GLS with W set to C.
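
The GLS variants above differ only in the weight matrix, which a small R sketch can make concrete. The function name glsFitSketch and the two 2 x 2 matrices are made up for illustration. The sketch uses the trace of the matrix square, 0.5*tr(G%*%G), as in Browne (1974); for W = I this equals 0.5*sum(G*G).

```r
# Trace of a square matrix.
tr <- function(M) sum(diag(M))

# GLS discrepancy: 0.5 * tr(G %*% G), with G = W^{-1} (S - C).
# glsFitSketch is a hypothetical name, not an OpenMx function.
glsFitSketch <- function(S, C, W) {
  G <- solve(W, S - C)  # same as solve(W) %*% (S - C), without forming the inverse
  0.5 * tr(G %*% G)
}

# Made-up 2 x 2 matrices, for illustration only.
S <- matrix(c(10, 3, 3, 2), 2, 2)        # "observed" covariance
C <- matrix(c(9, 2.5, 2.5, 2.2), 2, 2)   # "expected" covariance

F_uls   <- glsFitSketch(S, C, diag(2))  # unweighted least squares: W = I
F_ntgls <- glsFitSketch(S, C, S)        # normal theory GLS: W = S
F_wC    <- glsFitSketch(S, C, C)        # W = C, the weight tied to ML above
print(c(ULS = F_uls, NTGLS = F_ntgls, W_equals_C = F_wC))
```

In an optimizer, C would be recomputed from the free parameters at every iteration, while S (and W, for the first two variants) stays fixed.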

Weighted least squares (WLS; AKA asymptotically distribution free, ADF) is t(v)%*%solve(W)%*%v, where v is vech(S-C), i.e., the diagonal and lower triangle of S-C stacked into a vector. At present, I do not know how W is chosen in this case.
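
A minimal R sketch of this quadratic form follows. The names vech and wlsFitSketch are hypothetical helpers, the matrices are made up, and the identity weight matrix is only a placeholder, since the choice of W is exactly what is left open here.

```r
# Stack the diagonal and lower triangle of a matrix, column by column.
vech <- function(M) M[lower.tri(M, diag = TRUE)]

# WLS discrepancy: t(v) %*% solve(W) %*% v, with v built from S - C.
# wlsFitSketch is a hypothetical name, not an OpenMx function.
wlsFitSketch <- function(S, C, W) {
  v <- vech(S - C)
  drop(t(v) %*% solve(W, v))  # returned as a plain number
}

# Made-up 2 x 2 matrices, for illustration only.
S <- matrix(c(10, 3, 3, 2), 2, 2)        # "observed" covariance
C <- matrix(c(9, 2.5, 2.5, 2.2), 2, 2)   # "expected" covariance

# For p variables v has p * (p + 1) / 2 elements, so W is 3 x 3 here.
W <- diag(3)  # placeholder weight matrix (an assumption for this sketch)
F_wls <- wlsFitSketch(S, C, W)
print(F_wls)
```

Unlike the GLS form, each off-diagonal element of S-C enters v only once, so with identity weights the two functions do not give the same value.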

I know Michael W. Browne has several papers on this (Browne, 1974; 1977; 1984).

Browne, M. W. (1974). Generalized Least Squares Estimators in the Analysis of Covariance Structures. South African Statistical Journal, 8, 1–24.

Browne, M. W. (1977). The Analysis of Patterned Correlation Matrices by Generalized Least Squares. British Journal of Mathematical and Statistical Psychology, 30, 113–124.

Browne, M. W. (1984). Asymptotically Distribution-Free Methods for the Analysis of Covariance Structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83.

The fourth-order moment matrix of the variables is usually used as the ADF weight matrix. The sem.additions package by Jarrett Byrnes (https://r-forge.r-project.org/projects/sem-additions/) has a function to calculate it. The package also includes two references on the topic:

Bentler, P. M., and Dudgeon, P. (1996). Covariance structure analysis: Statistical practice, theory, and directions. Annual Review of Psychology, 47, 563–592.

Satorra, A., and Bentler, P. M. (1988). Scaling corrections for chi-square statistics in covariance structure analysis. American Statistical Association, Proceedings of the Business and Economic Statistics Section, 308–313.

As an aside, ADF/WLS usually performs very poorly unless the sample size is huge. It may be useful to include robust statistics for it.

The following is an example:

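library(MASS)           # for mvrnorm
library(sem.additions)  # for adf.wmat

set.seed(1030)

# Simulate 1000 rows from a bivariate normal with covariance Sigma
Sigma <- matrix(c(10,3,3,2),2,2)
x <- data.frame( mvrnorm(n=1000, rep(0, 2), Sigma) )

# ADF weight matrix for the simulated data
adf.wmat(x)
#           X1_X1    X1_X2     X2_X2
# X1_X1 190.69742 59.23515 18.536141
# X1_X2  59.23515 28.95822 11.648579
# X2_X2  18.53614 11.64858  7.218111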
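
For readers without sem.additions installed, the following base R sketch shows one common way to build a fourth-order moment weight matrix, with elements s_ijkl - s_ij*s_kl computed from centered data. The function name adfWeightSketch is hypothetical, and it is an assumption that adf.wmat computes something equivalent; the divisor (n here) and other details may differ. The data are simulated with rnorm and a Cholesky factor instead of mvrnorm, so the numbers will not reproduce the output above exactly.

```r
# Hypothetical helper: fourth-order moment weight matrix in vech order.
adfWeightSketch <- function(x) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  vars <- colnames(x)
  if (is.null(vars)) vars <- paste0("X", seq_len(p))

  z <- sweep(x, 2, colMeans(x))  # center each column

  # Index pairs for the diagonal and lower triangle, column by column.
  idx <- which(lower.tri(diag(p), diag = TRUE), arr.ind = TRUE)

  # One column per pair: row-wise products of the centered variables.
  prods <- z[, idx[, "col"], drop = FALSE] * z[, idx[, "row"], drop = FALSE]

  s <- colMeans(prods)                       # second moments s_ij (divisor n)
  W <- crossprod(prods) / n - tcrossprod(s)  # s_ijkl - s_ij * s_kl
  labels <- paste(vars[idx[, "col"]], vars[idx[, "row"]], sep = "_")
  dimnames(W) <- list(labels, labels)
  W
}

set.seed(1030)
Sigma <- matrix(c(10, 3, 3, 2), 2, 2)
# Rows have covariance Sigma because t(chol(Sigma)) %*% chol(Sigma) == Sigma.
x <- matrix(rnorm(2000), 1000, 2) %*% chol(Sigma)
colnames(x) <- c("X1", "X2")

W_adf <- adfWeightSketch(x)
print(W_adf)
```

For normally distributed data the entries should be close to their population values, for example 2*10^2 = 200 for the X1_X1 diagonal element, which is in line with the 190.7 shown above.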