Thu, 10/04/2012 - 14:44

Our research group has done a wide variety of performance measurements on lab mice (running speed on treadmills, oxygen consumption, spontaneous activity, etc., for each mouse). We also have survival and censoring information for each mouse. I believe that our measurements reflect several underlying biological processes that in turn determine the rate of aging in these animals.

I'm new to SEM, but from my reading so far it sounds like the measurements are manifest variables, the putative aging processes they report on are latent variables, and this would be exactly the sort of problem for which SEM is intended. I assume that as I read more about SEM, I will learn how to formulate candidate models for these latent variables and how to determine which are best supported by the available evidence.

But then I want to use the latent variable estimates as predictor variables in a survival model, and see which ones are most predictive of hazard. In 2009 another user started a thread called "Survival model". My requirements are more flexible than his. I can use a Cox model, and experience shows that the Weibull model is not a horrible fit for mouse longevity data. Not as good a fit as the Gompertz model, but easier to work with and interpret.
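For reference, here are the hazard functions of the three models mentioned above in their textbook forms. The symbols (shape k, scale λ, Gompertz parameters a and b, covariate vector x, coefficients β) are my own notation for illustration, not anything specific to OpenMx or to our data.

```latex
\begin{align*}
\text{Cox:}      \quad & h(t \mid x) = h_0(t)\,\exp(\beta^\top x)
                         && \text{(baseline } h_0 \text{ left unspecified)} \\
\text{Weibull:}  \quad & h(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}
                         && \text{(hazard grows as a power of age)} \\
\text{Gompertz:} \quad & h(t) = a\,e^{b t}
                         && \text{(hazard grows exponentially with age)}
\end{align*}
```

In each case the latent-variable estimates would enter as covariates x that scale the hazard.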

At any rate, I'm not committed to a particular survival model; I am interested in searching for the best possible predictors of hazard and/or longevity given the data we have collected. Would you folks recommend extracting the estimates I obtain from OpenMx and simply using them as predictor variables in, e.g., coxph() or survreg()? Or does OpenMx have the ability to do survival regression within the same model that estimates these latent variables?

I think you're right that SEM would be a great tool for this project. The stuff that your manifest variables have in common could be construed as an aging variable.

I'm sure OpenMx can do survival models. It's just a matter of how much harder they'd be to do in OpenMx versus with pre-built functions like coxph() and survreg(). If I wanted something done relatively quickly, then I would run a factor model in OpenMx, calculate the factor scores, and then use those factor scores as predictors in a pre-built survival model. On the other hand, if I wanted something done awesomely, then I'd implement some survival models in OpenMx, publish something on the process of combining survival models with factor models, and then publish something else on the original data analysis problem. Unless you are a research methodologist, I would recommend the first option.
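Here is a rough sketch of what the quick, two-step route might look like. To keep it self-contained, base R's factanal() stands in for the OpenMx factor model, and the data are simulated; the variable names (treadmill, vo2, activity, grip), the single-factor structure, and all the numbers are made up for illustration and are not from your study.

```r
library(survival)  # recommended package that ships with R

set.seed(1)
n <- 200

# Simulated stand-in data: one latent "aging" factor drives four measurements.
aging <- rnorm(n)
mice <- data.frame(
  treadmill = 0.8 * aging + rnorm(n, sd = 0.6),
  vo2       = 0.7 * aging + rnorm(n, sd = 0.7),
  activity  = 0.6 * aging + rnorm(n, sd = 0.8),
  grip      = 0.5 * aging + rnorm(n, sd = 0.9)
)

# Simulated lifespans: Weibull event times whose scale depends on `aging`,
# plus independent random censoring.
event_time  <- rweibull(n, shape = 3, scale = 900 * exp(-0.3 * aging))
censor_time <- runif(n, 600, 1500)
mice$time   <- pmin(event_time, censor_time)
mice$status <- as.integer(event_time <= censor_time)  # 1 = death observed

# Step 1: fit a one-factor model and extract regression-type factor scores.
fa <- factanal(mice[, c("treadmill", "vo2", "activity", "grip")],
               factors = 1, scores = "regression")
mice$score <- fa$scores[, 1]

# Step 2: use the factor scores as predictors in pre-built survival models.
cox_fit <- coxph(Surv(time, status) ~ score, data = mice)
wei_fit <- survreg(Surv(time, status) ~ score, data = mice, dist = "weibull")

summary(cox_fit)
summary(wei_fit)
```

One thing to keep in mind with this route: the survival model treats the factor scores as if they were observed without error, so the standard errors in step 2 do not reflect the uncertainty from step 1.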

There was some talk about factor scores in OpenMx in this thread: http://openmx.psyc.virginia.edu/thread/1294 . There I gave a function that would take an OpenMx model and give back factor scores. I said it in the thread, and I'll say it again here: IT WOULD BE VERY EASY TO MISUSE THIS FUNCTION. The function assumes you specified a factor model, as opposed to any other kind of SEM. Factor models have some number of latent variables with only two-headed arrows between them; their manifest variables have only two-headed arrows between them or one-headed arrows in from latent variables. There can be no one-headed arrows from a latent variable to another, or from a manifest variable to another, or from a manifest variable to a latent variable. Finally, missing data would not be handled well by the factor score function that I wrote.
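To give a feel for what a factor score calculation involves, here is a minimal sketch of the standard regression method for a pure factor model. This is not the function from that thread; the function name and argument names are hypothetical, and it assumes you have already pulled the loadings, factor covariances, and residual covariances out of your fitted model as plain matrices.

```r
# Regression-method factor scores for a pure factor model (illustrative sketch).
# X:      n x p matrix of manifest variables (no missing values)
# Lambda: p x k matrix of factor loadings
# Phi:    k x k covariance matrix of the latent variables
# Theta:  p x p residual covariance matrix of the manifest variables
# mu:     length-p vector of manifest means (defaults to the sample means)
regression_factor_scores <- function(X, Lambda, Phi, Theta, mu = colMeans(X)) {
  X <- as.matrix(X)
  if (anyNA(X)) stop("missing data are not handled by this sketch")
  Sigma <- Lambda %*% Phi %*% t(Lambda) + Theta  # model-implied covariance
  W <- solve(Sigma, Lambda %*% Phi)              # p x k weight matrix
  sweep(X, 2, mu) %*% W                          # n x k matrix of factor scores
}

# Tiny usage example with made-up numbers: three manifests, one factor.
Lambda <- matrix(c(0.8, 0.7, 0.6), ncol = 1)
Phi    <- matrix(1)
Theta  <- diag(c(0.36, 0.51, 0.64))
X      <- matrix(c( 1.0, 0.5,  0.0,
                    0.0, 0.5,  1.0,
                   -1.0, -1.0, -1.0), nrow = 3, byrow = TRUE)
scores <- regression_factor_scores(X, Lambda, Phi, Theta)
```

The formula only makes sense under the restrictions listed above, and the sketch refuses to run on data with missing values rather than guessing how to handle them.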

In summation, I think you can do what you're suggesting. Just proceed with some caution. Good luck!