mxGenerateData {OpenMx}R Documentation

Generate data based on an MxModel object


This function returns a new (simulated) data set based on the model-implied distribution provided.


mxGenerateData(model, nrows, returnModel=FALSE, use.miss = TRUE, ..., .backend=TRUE)



A data.frame or MxModel object upon which the data are generated.


Numeric. The number of rows of data to generate.


Logical. Return data.frames if FALSE or model populated with new data if TRUE.


Logical. Whether to approximate the missingness pattern of the original data.


Not used; forces remaining arguments to be specified by name.


Whether to use the backend to generate data


When given a data.frame as a model, the model is assumed to be saturated multivariate Gaussian and the expected distribution is obtained using mxDataWLS. In this case, the default number of rows is assumed to be the number of rows in the original data.frame, but any other number of rows can also be requested.

When given an MxModel, the model-implied means and covariance are extracted. It then generates data with the same mean and covariance. Data can be generated based on Normal (mxExpectationNormal), RAM (mxExpectationRAM), LISREL (mxExpectationLISREL), and state space (mxExpectationStateSpace) models.

Please note that this function samples data from the model-implied distribution(s); it does not sample from the data object in the model. That is, this function generates new data rather than pulling data that already exist from the model.

Thresholds and ordinal data are implemented by generating continuous data and then using cut and mxFactor to break the continous data at the thresholds into an ordered factor.

If the model has definition variables, then a data set must be included in the model object and the number of rows requested must match the number of rows in the model data. In this case the means, covariance, and thresholds are reevaluated for each row of data, potentially creating a a different mean, covariance, and threshold structure for every generated row of data.

For state space models (i.e. models with an mxExpectationStateSpace or mxExpectationStateSpaceContinuousTime expectation), the data are generated based on the autoregressive structure of the model. The rows of data in a state space model are not independent replicates of a stationary process. Rather, they are the result of a latent (possibly non-stationary) autoregressive process. For state space models different rows of data often correspond to different times. As alluded to above, data generation works for discrete time state space models and hybrid continuous-discrete time state space models. The latter have a continous process that is measured as discrete times.


A data.frame, list of data.frames, or model populated with the new data (depending on the returnModel parameter). Raw data is always returned even if the original model contained covariance or some other non-raw data.


The OpenMx User's guide can be found at


# Create data based on state space model.
nvar <- 5
varnames <- paste("x", 1:nvar, sep="")
ssModel <- mxModel(model="State Space Manual Example",
    mxMatrix("Full", 1, 1, TRUE, .3, name="A"),
    mxMatrix("Zero", 1, 1, name="B"),
    mxMatrix("Full", nvar, 1, TRUE, .6, name="C", dimnames=list(varnames, "F1")),
    mxMatrix("Zero", nvar, 1, name="D"),
    mxMatrix("Diag", 1, 1, FALSE, 1, name="Q"),
    mxMatrix("Diag", nvar, nvar, TRUE, .2, name="R"),
    mxMatrix("Zero", 1, 1, name="x0"),
    mxMatrix("Diag", 1, 1, FALSE, 1, name="P0"),
    mxMatrix("Zero", 1, 1, name="u"),
    mxExpectationStateSpace("A", "B", "C", "D", "Q", "R", "x0", "P0", "u"),

ssData <- mxGenerateData(ssModel, 200) # 200 time points

# Add simulated data to model
ssModel <- mxModel(ssModel, mxData(ssData, 'raw'))

# Fit model to simulated data
ssRun <- mxRun(ssModel)

# Compare parameters estimated from random data to
#  their true generating values
cbind(Rand = omxGetParameters(ssRun), Gen = omxGetParameters(ssModel))
# Note the parameters should be "close" (up to sampling error)
# to the generating values

manifests <- paste("x", 1:5, sep="")
latents <- c("G")
factorModel <- mxModel("One Factor",
      manifestVars = manifests,
      latentVars = latents,
      mxPath(from=latents, to=manifests, values=.8),
      mxPath(from=manifests, arrows=2, values=.2),
      mxPath(from=latents, arrows=2,
            free=FALSE, values=1.0),
      mxPath(from = 'one', to = manifests))

factorData <- mxGenerateData(factorModel, 100)

factorModel <- mxModel(factorModel,
                       mxData(observed=cov(factorData), type="cov",
                              means = colMeans(factorData)))
factorRun <- mxRun(factorModel)
cbind(Rand=omxGetParameters(factorRun), Gen=omxGetParameters(factorModel))

[Package OpenMx version 2.8.3 Index]