Mon, 02/17/2014 - 15:06

This may be an easy question, but I can't think of the answer. I've built a regression model (predicting Y) with two independent variables (X and Z). I want to compute R^2. Is that built into the model somehow? If not, any ideas on how to compute it? Here's the model I have:

multiRegModel <- mxModel("Multiple Regression, All Variables",

type="RAM",

manifestVars=c("x", "y", "z"),

# variance paths

mxPath(

from=c("x", "y", "z"),

arrows=2,

free=TRUE,

values = c(.5, .5, .5),

labels=c("varx", "residual", "varz")

),

# covariance of x and z

mxPath(

from="x",

to="z",

arrows=2,

free=TRUE,

values=0.2,

labels="covxz"

),

# regression weights

mxPath(

from=c("x","z"),

to="y",

arrows=1,

free=TRUE,

values=.5,

labels=c("betax","betaz")

),

# means and intercepts

mxPath(

from="one",

to=c("x", "y", "z"),

arrows=1,

free=TRUE,

values=c(.5, .5),

labels=c("meanx", "beta0", "meanz")

)

) # close model

I know "residual" is the residual variance of Y, but I would think I'd need an estimate of the total variance of Y to compute it from that. Any help would be appreciated. Thanks!

Thanks for the input Ryne. I'm still having trouble. Below I have a reproducible example. I expected openmx and lm to agree on the value of R squared, but they don't. I've tried doing it with and without the covariance between x and z and they still don't agree. Any help would be appreciated.

#### simulate data

n = 10000

pop.mat = matrix(c(1, .48, .36, .48, 1, .35, .36, .35, 1), nrow=3)

d = data.frame(mvrnorm(n, mu=rep(0, times=3), Sigma=pop.mat))

names(d) = c(letters[c(26, 25, 24)])

### build openMX model

multiRegModel <- mxModel("MR",

type="RAM",

manifestVars=c("x", "y", "z"),

# variance paths

mxPath(

from=c("x", "y", "z"),

arrows=2,

free=TRUE,

values = c(.5, .5, .5),

labels=c("varx", "residual", "varz")

),

# covariance of x and z

mxPath(

from="x",

to="z",

arrows=2,

free=TRUE,

values=0.2,

labels="covxz"

),

# regression weights

mxPath(

from=c("x","z"),

to="y",

arrows=1,

free=TRUE,

values=.5,

labels=c("betax","betaz")

),

# means and intercepts

mxPath(

from="one",

to=c("x", "y", "z"),

arrows=1,

free=TRUE,

values=c(.5, .5),

labels=c("meanx", "beta0", "meanz")

)

) # close model

### estimate the model

m = mxModel(multiRegModel, mxData(observed=d, type="raw"))

mod = mxRun(model=m, silent=TRUE, suppressWarnings=TRUE)

### compute R squared

par = summary(mod)$parameters[,c("name","Estimate")]

res = par[par$name=="residual",2]

betax = par[par$name=="betax",2]

varx = par[par$name=="varx",2]

betaz = par[par$name=="betaz",2]

varz = par[par$name=="varz",2]

covxz = par[par$name=="covxz",2]

SST = (res + betax^2*varx + betaz^2*varz + betax*betaz*covxz)*(n-1)

SSR = res*(n-3)

Rsq = 1-(SSR/SST)

regMod = lm(y~x + z, data=d)

summary(regMod)$r.squared

Rsq

SST = (res + betax^2*varx + betaz^2*varz + betax*betaz*covxz)*(n-1)

SSR = res*(n-3)

Rsq.1 = 1-(SSR/SST)

par = summary(mod2)$parameters[,c("name","Estimate")]

res = par[par$name=="residual",2]

betaz = par[par$name=="betaz",2]

varz = par[par$name=="varz",2]

SST2 = (res + betaz^2*varz)*(n-1)

SSR2 = res*(n-2)

Rsq.2 = 1-(SSR2/SST2)

partial = sqrt((Rsq.1-Rsq.2)/(1-Rsq.2^2))

list(partial = partial, rsq1 = Rsq.1, rsq2=Rsq.2, SST=SST, SSR=SSR)

Adding a few lines to your example gives me numbers that match to my satisfaction (4-6 decimals).

What do you think?

Your output has an expected covariance in the output slot (model@output$algebras$modelname.fitfunction, subbing model and modelname as necessary). 1-resid/total should yield R2

Alternatively, you could figure it out by hand if that's easier. Total variance of Y in this model would be:

Var(Y) = residual + betax^2 varx + betaz^2 varz + betax betaz covxz

There's probably an easy function to write, at least for RAM models.