Tue, 05/04/2010 - 11:06

Currently the svn repository has an implementation of likelihood-based confidence intervals. For the moment, the interface looks like the following:

` model <- mxModel(model, mxNormalInterval(c('A', 'C[,]', 'E[1,1]'), 0.95, 0.95))`

The first argument to mxNormalInterval() is a vector of strings. These could be either matrix/algebra names or free parameter names. The second argument is the lower confidence level, and the third argument is the upper confidence level.

Is this a reasonable interface? Is there a need for asymmetric confidence levels in this interface? Should the confidence levels be expressed in units of 0-100 instead of 0-1?

The full example is present here.

I'm also getting a lot of code reds. The actual model seems to run OK, but the CI's don't. I receive a set of upper and lower bounds, but they don't straddle the estimate anywhere evenly.

Am I doing something wrong with the CI code? I'm wanting to calculate the CI's on 2 variables (right and left eyes) as follows:

> # Calculate CI

> CImodel <- mxModel(multiCholAEFit, mxCI(c('ACE.A[1,1]','ACE.A[2,2]'), 0.95, 'both'))

> CImodelfit<-mxRun(CImodel,intervals=T)

Running multiCholAE

Warning messages:

1: In model 'multiCholAE' while computing the lower bound for 'ACE.A[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)

2: In model 'multiCholAE' while computing the lower bound for 'ACE.A[2,2]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)

3: In model 'multiCholAE' while computing the upper bound for 'ACE.A[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)

4: In model 'multiCholAE' while computing the upper bound for 'ACE.A[2,2]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED)

5: In model 'multiCholAE' NPSOL returned a non-zero status code 1. The final iterate satisfies the optimality conditions to the accuracy requested, but the sequence of iterates has not yet converged. NPSOL was terminated because no further improvement could be made in the merit function (Mx status GREEN).

Another query regarding CI's. Is it possible to calculate the CI around the result of a constraint? e.g. If I constrain A[1,1] and A[2,2], how can I work out the interval around that value?

Thanks a lot.

Paul

It is not uncommon for the confidence interval calculation to return nonzero status codes from the optimizer. I'll let the numerical analysis people on the forums explain the details as to why. (I would give a poor explanation)

In OpenMx 0.3.1, the back-end attempts to retry the CI calculation a fixed number of times whenever a nonzero status code is encountered. That number is currently 5. In the next binary release, you'll be able to control the maximum number of retries using the command:

`model <- mxOption(model, "CI Max Iterations", <i>n</i>)`

where

nis the number of retries.Thanks to all the comments above, but I still haven't figure out how to get CIs in output. Instead, I got an error.

Here is my script and error message I got:

univACEmodel =mxModel(

...

mxAlgebra( expression=a %*% t(a), name="A" ),

....

mxCI(c('A', 'C', 'E'))

)

univACEFit <- mxRun(univACEModel, intervals=T)

Error: Unknown reference to 'A' detected in a confidence interval specification in model 'univACE' in mxRun(univACEModel, intervals = T)

So, what reference should I use in mxCI? And how can I get the CIs in output? Thanks a lot!

Try evaluating

`names(univACEmodel)`

in R and see if 'A' appears as one of the named entities in your model. Perhaps you are using one of our scripts as a starting point, and the script contains three child models. If the A, C, and E matrices are contained in a child model named 'common', then you need to use`mxCI(c('common.A', 'common.C', 'common.E'))`

.Thanks, it is solved! However, I got the same errors as Paul, i.e. NPSOL returned a non-zero status code 6 while computing the lower bound for ... Well, the estimates of CIs look fine.

I tried the mxOption(model, "CI Max Iterations", n). I set n as 10 and then 100, but still got those errors. I am wondering is there any other solution to this?

Probably there isn't any solution other than something manual. If you re-fit the model with all parameters free, but the parameter whose CI you want to check fixed to its CI bound (upper or lower) then the fit of this model should be approximately 3.84 -2lnL units worse than the model in which all parameters are free. The 3.84 applies only if you have requested 95% CI's; other values would be expected for, e.g., 50% CI's.

Maybe we should amend the codes reported by CI calculations. It's very common to get code reds--we even retry the search several times to try to avoid it, and we still see several.

If the easy double-check is the distance from the objective function value to the appropriate critical value, why don't we report something about that distance for code red confidence interval calculations? In theory, it should be very close to exactly right. Computationally, it's not much more expensive to keep that information around, and we could then throw a more informative warning if we're not finding a likely valid interval.

For starters, I'd recommend simply amending the "Code Red" status for Confidence Interval boundaries to check the distance metric, and if it's above some cutoff, report that the interval was likely not calculated correctly.

Thoughts?

It's a good idea to some extent. However, there are cases in which boundary conditions will throw things off. So if a parameter x has an lbound of zero, and if the parameter at this value is say 2.5 units of -2lnL worse than where it started, the lower CI is indeed zero. The optimizer will, however, have strayed from the ML position (only 2.5 units worse) and actually degraded the fit further (by moving the other parameters away from what their MLE's would be if x was fixed at zero). So some flagging of when parameters are at bounds would be useful in conjunction with it.

I do see a lot of folks antsy about code Red. It is a poor diagnostic, much more often false positive than true positive in my experience. So hiding this a bit from the user might be beneficial, but at the same time it should be available for inspection.

Distance of the parameter from its MLE is only helpful in cases in which it got stuck at its MLE (and this value was not its lbound or ubound). We might make use of the asymptotic standard errors as a rough check, although these are provided only for free parameters, not for functions thereof. Some matrix calculus software would help with getting SE's of functions of parameters but that's at least version 2.0 down the road, I reckon. Note also that for computed elements we may not be able to "know" that it's effectively a boundary (x^2 would have lower bound of 9 if the lbound of x was 3, for example).

I agree it is probably better not throw the optimiser warnings (code green or red) by default: just trains people to ignore them if they are mostly false.

Most people will run summary on the fit... Perhaps put a "health" parameter into the summary, which would have the model "color" in it. Then it can be seen, without being dumped into the console, and people can followup on it if necessary?

In the svn repository, this function has been renamed to mxCI(). It now has the arguments mxCI(reference, interval = 0.95, type = c('both', 'lower', 'upper')). The previous example would be specified as either:

`model <- mxModel(model, mxCI(c('A', 'C[,]', 'E[1,1]'), 0.95, 'both'))`

or,

`model <- mxModel(model, mxCI(c('A', 'C[,]', 'E[1,1]')))`

or without showing off array notation,

`model <- mxModel(model, mxCI(c('A', 'C', 'E')))`

First, let me say that I am a novice. I think, however, that I've figured out how to get confidence intervals for most everything except the genetic and environmental correlations -- I'm not sure what to reference exactly. I think the matrix below is what I should reference, but I get errors because of illegal '.' characters and I'm not convinced I'm doing this correctly anyways. Here is the bit of script that I think sets up the correlation matrices:

# Genetic and Environmental Correlations

AEcorMatrices <- c("solve(sqrt(ACE.I*ACE.A)) %*% ACE.A %*% solve(sqrt(ACE.I*ACE.A))",

"solve(sqrt(ACE.I*ACE.E)) %*% ACE.E %*% solve(sqrt(ACE.I*ACE.E))")

AEcorLabels <- c("corComp_A","corComp_E")

Any advice would be greatly appreciated.

Many thanks in advance,

jm

Hi. We need to see more of the script in order to determine where the error appears. Can you attach either the script or a subset of the script to a forum post. If the model uses a larger dataset then either simulate some data or attach some meaningless data that allows the model to run.

Thanks so much for your quick reply. Below is a good chuck of the script. I'm struggling with the last bit that starts with CImodel under the AE model part. When I run this script, everything works fine except the last part and R returns an error:

> CImodel <- mxModel(AEcorMatrices, mxCI(c('corComp_A', 'corComp_E')))

Error: The name 'solve(sqrt(ACE.I*ACE.A)) %*% ACE.A %*% solve(sqrt(ACE.I*ACE.A))' and 'solve(sqrt(ACE.I*ACE.E)) %*% ACE.E %*% solve(sqrt(ACE.I*ACE.E))' is illegal because it contains multiple '.' characters

> CImodelfit<-mxRun(CImodel,intervals=T)

Error in cat("Running", model@name, "\n") : object 'CImodel' not found

> CIAEsumm<- summary(CImodelfit)

Error: object 'CImodelfit' not found

Error in summary(CImodelfit) :

error in evaluating the argument 'object' in selecting a method for function 'summary'

> CIAEsumm

Error: object 'CIAEsumm' not found

Model script starts here...

# Fit Multivariate ACE Model using Cholesky Decomposition

# -----------------------------------------------------------------------

multACEModel <- mxModel("multACE",

mxModel("ACE",

# Matrices a, c, and e to store a, c, and e path coefficients

mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=.3,label=c("a11", "a21", "a22"), name="a" ),

mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=.3,label=c("c11", "c21", "c22"), name="c" ),

mxMatrix( type="Lower", nrow=nv, ncol=nv, free=TRUE, values=.3,label=c("e11", "e21", "e22"), name="e" ),

# Matrices A, C, and E compute variance components

mxAlgebra( expression=a %*% t(a), name="A" ),

mxAlgebra( expression=c %*% t(c), name="C" ),

mxAlgebra( expression=e %*% t(e), name="E" ),

# Algebra to compute total variances and standard deviations (diagonal only)

mxAlgebra( expression=A+C+E, name="V" ),

mxMatrix( type="Iden", nrow=nv, ncol=nv, name="I"),

mxAlgebra( expression=solve(sqrt(I*V)), name="isd"),

## Note that the rest of the mxModel statements do not change for bivariate/multivariate case

# Matrix & Algebra for expected means vector

mxMatrix( type="Full", nrow=1, ncol=nv, free=TRUE, values= 20, name="M" ),

mxAlgebra( expression= cbind(M,M), name="expMean"),

# Algebra for expected variance/covariance matrix in MZ

mxAlgebra( expression= rbind ( cbind(A+C+E , A+C),

cbind(A+C , A+C+E)), name="expCovMZ" ),

# Algebra for expected variance/covariance matrix in DZ, note use of 0.5, converted to 1*1 matrix

mxAlgebra( expression= rbind ( cbind(A+C+E , 0.5%x%A+C),

cbind(0.5%x%A+C , A+C+E)), name="expCovDZ" )

),

mxModel("MZ",

mxData( observed=mzData, type="raw" ),

mxFIMLObjective( covariance="ACE.expCovMZ", means="ACE.expMean", dimnames=selVars )

),

mxModel("DZ",

mxData( observed=dzData, type="raw" ),

mxFIMLObjective( covariance="ACE.expCovDZ", means="ACE.expMean", dimnames=selVars )

),

mxAlgebra( expression=MZ.objective + DZ.objective, name="minus2sumll" ),

mxAlgebraObjective("minus2sumll")

)

multACEFit <- mxRun(multACEModel)

multACESumm <- summary(multACEFit)

multACESumm

# Generate Multivariate ACE Output

# -----------------------------------------------------------------------

parameterSpecifications(multACEFit)

expectedMeansCovariances(multACEFit)

tableFitStatistics(multTwinSatFit, multACEFit)

# Generate List of Parameter Estimates and Derived Quantities using formatOutputMatrices

ACEpathMatrices <- c("ACE.a","ACE.c","ACE.e","ACE.isd","ACE.isd %*% ACE.a","ACE.isd %*% ACE.c","ACE.isd %*% ACE.e")

ACEpathLabels <- c("pathEst_a","pathEst_c","pathEst_e","sd","stParEst_a","stPathEst_c","stPathEst_e")

formatOutputMatrices(multACEFit,ACEpathMatrices,ACEpathLabels, Vars, 4)

ACEcovMatrices <- c("ACE.A","ACE.C","ACE.E","ACE.V","ACE.A/ACE.V","ACE.C/ACE.V","ACE.E/ACE.V")

ACEcovLabels <- c("covComp_A","covComp_C","covComp_E","Var","stCovComp_A","stCovComp_C","stCovComp_E")

formatOutputMatrices(multACEFit,ACEcovMatrices,ACEcovLabels, Vars, 4)

# Genetic and Environmental Correlations

ACEcorMatrices <- c("solve(sqrt(ACE.I*ACE.A)) %*% ACE.A %*% solve(sqrt(ACE.I*ACE.A))",

"solve(sqrt(ACE.I*ACE.C)) %*% ACE.C %*% solve(sqrt(ACE.I*ACE.C))",

"solve(sqrt(ACE.I*ACE.E)) %*% ACE.E %*% solve(sqrt(ACE.I*ACE.E))")

ACEcorLabels <- c("corComp_A","corComp_C","corComp_E")

formatOutputMatrices(multACEFit,ACEcorMatrices,ACEcorLabels, Vars, 4)

# Fit Multivariate AE model

# -----------------------------------------------------------------------

multAEModel <- mxModel(multACEFit, name="multAE",

mxModel(multACEFit$ACE,

mxMatrix( type="Lower", nrow=nv, ncol=nv, free=FALSE, values=0, name="c" ) # drop c at 0

)

)

multAEFit <- mxRun(multAEModel)

multAESumm <- summary(multAEFit)

multAESumm

# Generate Multivariate AE Output

# -----------------------------------------------------------------------

parameterSpecifications(multAEFit)

tableFitStatistics(multACEFit, multAEFit)

formatOutputMatrices(multAEFit,ACEpathMatrices,ACEpathLabels,Vars, 4)

formatOutputMatrices(multAEFit,ACEcovMatrices,ACEcovLabels,Vars, 4)

# Genetic and Environmental Correlations

AEcorMatrices <- c("solve(sqrt(ACE.I*ACE.A)) %*% ACE.A %*% solve(sqrt(ACE.I*ACE.A))",

"solve(sqrt(ACE.I*ACE.E)) %*% ACE.E %*% solve(sqrt(ACE.I*ACE.E))")

AEcorLabels <- c("corComp_A","corComp_E")

formatOutputMatrices(multAEFit,AEcorMatrices,AEcorLabels, Vars, 4)

CImodel <- mxModel(AEcorMatrices, mxCI(c('corComp_A', 'corComp_E')))

CImodelfit<-mxRun(CImodel,intervals=T)

CIAEsumm<- summary(CImodelfit)

CIAEsumm

Hi. The first argument to mxModel should be a model object*. Except when the first argument is a string. I will add error checking to make sure the string is of length one. In your case the first argument should be a model. you are trying to create several algebras and use the mxAlebra() function. The confusion is because format output accepts strings which I don't like. Perhaps it should accept mxAlgebra and use deparse() to stringify them. (sent from my iPhone)

OK. So, if I replace AEcorMatrices with multAEFit, I'm still not sure what to put in mxCI(c('???', '???'))) to get it to run on the genetic and environmental correlations created in

# Genetic and Environmental Correlations

AEcorMatrices <- c("solve(sqrt(ACE.I*ACE.A)) %*% ACE.A %*% solve(sqrt(ACE.I*ACE.A))",

"solve(sqrt(ACE.I*ACE.E)) %*% ACE.E %*% solve(sqrt(ACE.I*ACE.E))")

AEcorLabels <- c("corComp_A","corComp_E")

I used the code you posted last month to generate CIs for A and E. It looks like this

CImodel <- mxModel(multAEFit, mxCI(c('ACE.A', 'ACE.E')))

CImodelfit<-mxRun(CImodel,intervals=T)

CIAEsumm<- summary(CImodelfit)

CIAEsumm

This worked fine. I'm just not sure what to put in place of ACE.A and ACE.E to generate CIs for the correlations because the correlations seem to include a bunch of solve () functions. Thanks.

I think this problem is because of the differences between the way things are specified for the formatOutputMatrices helper function, and the way that mxAlgebras are specified in an mxModel.

In order to calculate confidence intervals for these calculations, you'll want to add them to the model as mxAlgebra statements, and then calculate the CIs for those algebras.

Try something like this:

It's worth noting that there's currently a bug in the estimation of confidence intervals that causes the calculation of some types of algebra CIs to fail. If you get the same value for the upper bound and lower bound, that's a result of the bug.

There's a fix that's currently in testing in the repository version of OpenMx, which should be in the next binary release.

thanks for your input. here's what i tried:

corComp1_A <- mxAlgebra(solve(sqrt(ACE.I*ACE.A))%*%ACE.A%*% solve(sqrt(ACE.I*ACE.A)), name="corComp1_A")

corComp1_E <- mxAlgebra(solve(sqrt(ACE.I*ACE.E)) %*% ACE.E %*% solve(sqrt(ACE.I*ACE.E)), name="corComp1_E")

it returned this error:

Error: unexpected input in "corComp1_A <- mxAlgebra(solve(sqrt(ACE.I*ACE.A))%*%ACE.A%*% solve(sqrt(ACE.I*ACE.A)), name="corComp1_A")‚"

i tried this with the corComp_A and corComp1_A names just in case corComp_A was a repeat variable it didn't like me using again. it returned the same error either way.

Not sure if this is relevant, but the error you copied into your message has a comma at the end of the line. whereas the input you say you tried does not have a comma. The message you are getting looks like an R preprocessing parse error, something that may not have even made it to OpenMx.

I say this because when I run the lines you say you tried, I don't receive any errors.

thanks so much for your note. does anyone have any sense why R would choke on the syntax from above? thanks so much.

you can't have a trailing comma... it implies that another parameter is coming but then doesn't provide it.

This is a common error generated when executing lines out of context, so it is a good idea to make looking for such orphaned commas in your code one of the first checks when a syntax error is flagged and the code looks ok.

Thanks so much. After some re-checking (at your suggestion), it turned out that I was victim to the old 'don't copy directly from a website and try to run code in a programming language'. When I retyped everything into R myself -- instead of copying the code from the last post -- it worked just fine. The only trick now is that I am indeed getting the error indicated in a previous post where I get the same values for the upper and lower bounds. I guess I'll wait to hear about the fix in the next update. Thanks again for everyone's help.

Does anyone have a sense of when the update with the CI fix will be released? Thanks.

Not sure there is a fix for the mirror-CI's. I think that, as with mx, users will have to specify bounds in the model that stop the optimizer treating these solutions as valid.

Agreed. It would be very difficult for software to "perceive" such mirrors, so it would be better to rely on users' intelligent use to avoid obviously symmetric CIs. That being said, we the developers should try to specify models with boundary constraints built in so that symmetry problems would be avoided. Populating the example scripts and documentation accordingly would be a great start.

Thanks so much for your input. I was responding to a post from tbrick made earlier in this stream...

"t's worth noting that there's currently a bug in the estimation of confidence intervals that causes the calculation of some types of algebra CIs to fail. If you get the same value for the upper bound and lower bound, that's a result of the bug.

There's a fix that's currently in testing in the repository version of OpenMx, which should be in the next binary release."

Is this accurate? Thanks for the clarification.

algebra CIs should now be calculated correctly as of the OpenMx binary release version 0.4.1.

I'm still looking for a solution to calculating CIs for genetic and environmental correlations in R. The syntax I have received so far doesn't seem to work. Any help would be greatly appreciated. Many thanks in advance.

I think if you turn the correlation into an algebra, then you can get the CI on that named algebra

So

mxAlgebra(solve(sqrt(ACE.I*ACE.A)) %*% ACE.A %*% solve(sqrt(ACE.I*ACE.A)), name = "corComp_A")

mxAlgebra(solve(sqrt(ACE.I*ACE.E)) %*% ACE.E %*% solve(sqrt(ACE.I*ACE.E)), name = "corComp_E")

mxCI(c('ACE.corComp_A', 'ACE.corComp_E')

thanks for your input. here's what i tried:

mxAlgebra( expression = solve(sqrt(ACE.I*ACE.A)) %*% ACE.A %*% solve(sqrt(ACE.I*ACE.A)), name = "corComp_A")

mxAlgebra( expression = solve(sqrt(ACE.I*ACE.E)) %*% ACE.E %*% solve(sqrt(ACE.I*ACE.E)), name = "corComp_E")

CImodel <- mxModel(multAEFit, mxCI(c('ACE.corComp_A', 'ACE.corComp_E')))

CImodelfit<-mxRun(CImodel,intervals=T)

CIAEsumm<- summary(CImodelfit)

CIAEsumm

I tried it with and without the 'expression = ' part. it returned this error:

Error: unexpected input in "mxAlgebra( expression = solve(sqrt(ACE.I*ACE.A)) %*% ACE.A %*% solve(sqrt(ACE.I*ACE.A)), name = "corComp_A")‚"

> CImodel <- mxModel(multAEFit, mxCI(c('ACE.corComp_A', 'ACE.corComp_E')))

> CImodelfit<-mxRun(CImodel,intervals=T)

Running multAE

Error: Unknown reference to 'ACE.corComp_A' detected in a confidence interval specification in model 'multAE' in mxRun(CImodel, intervals = T)

> CIAEsumm<- summary(CImodelfit)

Error: object 'CImodelfit' not found

Error in summary(CImodelfit) :

error in evaluating the argument 'object' in selecting a method for function 'summary'

> CIAEsumm

Error: object 'CIAEsumm' not found

i'm not sure why it wouldn't like the input?? thanks again.

I can't find the confidence intervals in the output of OpenMx 3.1. There is however this in the output:

$confidenceIntervals

[,1] [,2]

$confidenceIntervalCodes

[,1] [,2]

I don't get any errors or warnings. But clearly, I am doing something wrong. I try to get the confidence intervals for the parameters in matrix S:

LGM <- mxModel(LGM, mA, mS, mF, mM, obj, ldata, mxCI(c('S')))

LGMe<-mxRun(LGM)

LGMe@output

I know it is only recently implemented, but could you give me some more informations on this?

so that models can be run without always computing the (time consuming) ML confidence intervals, Michael has wisely made this an opt-in option for mxRun().

So just change mxRun as follows:

LGMe<-mxRun(LGM, intervals=T)

PS: the stub for confidence intervals is always present in the summary, even when none are computed.

Thanks. It works. Latent Growth Model.S[1,1] 0.168991403 0.211585139

However, I get systematic warnings: In model 'Latent Growth Model' while computing the lower bound for 'Latent Growth Model.S[1,1]' NPSOL returned a non-zero status code 6. The model does not satisfy the first-order optimality conditions to the required accuracy, and no improved point for the merit function could be found during the final linesearch (Mx status RED).

I am running a simulation, and the estimates seem to be ok. The simulation does not concern a stress test, but simulates data of earlier fitted empirical models (with sufficient fit).

What can I do (besides suppressing the warnings) to improve the model, and to avoid these warnings?

Tim's comments are correct. Sorry, I forgot to mention adding the option to mxRun(). Also, the confidence intervals interface is not smart enough to distinguish between fixed parameters and free parameters in a matrix. So if matrix S is specified, then confidence intervals for all of the cells of S are calculated. In the next binary release, we should probably calculate confidence intervals for free cells. Although if S was an algebra, the feature set will not include calculation of which cells are free and which cells are fixed.

"... we should probably [only] calculate confidence intervals for free cells."

This is a high priority item. Trying to estimate confidence intervals for many types fixed parameters will introduce possibilities for interesting and ugly ways to obtain code 6s from the optimizer.

CI's for cells fixed to a constant value are easy. Just bung back the current value of the cell as the estimate, upper and lower CI. No optimization required.

It is important to be able to compute CI's for cells and matrices that are the result of algebras. I remember that being a bit of a headache to code in classic Mx, mainly due to the need, during optimization, to calculate algebras that would normally be evaluated only once at the end of optimization.

Just ran a three variable common path model, requesting CIs on ACE and the latent common factor matrix:

I just asked for mxNormalInterval(c("all.A", "all.C", "all.E", "all.L"), 0.95, 0.95)

where all is a supermodel containing the variance matrices.

This took 1959.748 secs (compared to 12.2 secs without CIs), both working on just one core of the processor.

Is the function smart enough not attempt to move parameters within requested matrices that cannot be altered (for instance where this would involve moving an off-diagonal element in a matrix or a fixed parameter)?

I vote for units in the range 0-1.

It would be nice if both upper and lower CIs defaulted to 0.95 so that

would be legal and do the expected thing (95% upper and lower intervals on the free parameters in the mxMatrix "A").

I also vote for 0-1. I see as .95 and 95% as equal; without units, I don't know what 95 means. Also, if we use 0-100, then .95 is a valid but unclear command. If we use 0-1, then 95 should throw a warning.

As currently written, were I to request lower and upper confidence limits of 95 and 95, I would get the 2.5th and 97.5th percentile, right? If I wanted something asymmetric, would I have to translate my lower and upper limits into the corresponding symmetric intervals? That is, If I care about the 10th to 95th percentiles, would my code be (80, 90)?

Any idea what a survey of the literature would reveal wrt whether CI's are reported in % or in fractional form. My impression is that (heh heh) less than 0.95% of them would publish or refer to intervals in fractional form, but then I don't read all the journals. Maintaining consistency with commonly used metrics can enhance user experience with the software.

At the same time, I see the point about trapping erroneous input. It seems to me extremely unlikely that anyone would ever want a 0.95% confidence interval. Even a 5% CI seems unlikely to be needed. So we could simply warn users that a requested 0.95% CI is actually very small, and perhaps they mean to request 95%? To make it bullet proof to avoid long computations to a useless end, one might even require a second argument (,stupidlySmallCI=TRUE) for any cases less than 10%.

I do like the idea of being able to request multiple CI's (80 & 90) in a list. In my experience, over 99% of users want 95% CIs, but to facilitate others might increase their popularity if they are useful.

I, of course, think percentage form is more common. However, we're not asking users to specify "upper=95%". Sans percent sign, standard notation is (and should be) a proportion. While my more sadistic side likes adding a "stupidlySmallCI" option, warnings that pop up whenever a criterion drops below .5 (0-1) or 50 (0-100) is likely handy. This avoids not only the .5?=50 issue, but keeps one from thinking upper=97.2, lower=2.5 means a 95% CI.

I vote for units in the range of 0-100. Why?

1. because I've never heard of a .95 confidence interval, and only ever heard of CI's in percentage terms, so that is a natural metric intuitive to the user.

2. Classic Mx used percentages as well so it will be natural to that user base

3. It is less typing - and fewer bytes of storage - to work with 95 than with .95

Ok, so 3 isn't a big deal, 2 probably will be to a subset of our target users, and 1 is just an accident of culture, but it seems very widespread usage to me.

Gee, and I've never heard of an alpha of 5. What does that mean?

Anyway, I think if you write 0.95, it is clear what you mean. It is a proportion. If you write 95, it is not clear. You need the percent sign to be clear. The percent sign divides by 100. But percent seems to have some other uses in R.

The only reason I can see for differing upper and lower intervals is to turn off either the lower or upper CI calculation. This might be useful if you are near a bound and the estimator is having trouble with the CI.

Exactly, what does an alpha of 5 mean? This is why, when requesting power calculations in classic Mx, one asks for Option power=.05 because fractional notation is practically universally used for this statistic. It is also why, in OpenMx, one asks for intervals in the % metric, such as 95, because this is the virtually universal metric for CIs. I've never had any complaints about using these metrics for input to Classic Mx. I expect some complaints if we change the historically popular matching to the natural language notation to something less intuitive in OpenMx.

Maybe replace the second interval parameter with an option to specify "both|lower|upper".

With defaults:

mxNormalInterval(reference, interval=0.95, side="both")

Ah, I had misunderstood that the two numbers were intended to get 80% to the left and 95% to the right. I've never heard of anyone wanting or publishing asymmetric regions like this. That is not to say that likelihood-based CI's cannot be asymmetric, it's actually a useful feature, but in that context it is that the distance from the MLE to the lower 95% CI value is not equal to the distance from the MLE to the upper 95% CI.

In the unlikely event that someone wanted to construct such an asymmetric region, it doesn't seem unreasonable for them to have to get both upper and lower CI's on both 80 and 95%, and manually paste the two bits together. A kindness that would preserve them from that would be to follow Tim's suggestion, and allow for side="upper" or side="lower" in addition.

Actually, sometimes estimating CI's can be difficult optimization problems, so that e.g., an upper CI doesn't work with particular optimization parameters (function precision and so on). Being able to request only upper or lower in these instances could be very handy.

Hi Steve: Is there an example people have of wanting different upper and lower CIs?

Of course the confidence estimates themselves itself may well be asymmetrical, but I've only seen the % interval being symmetric.

It sounds reasonable to me. Are the namespace for matrices and for parameter labels exclusive? So if there's a matrix x and a parameter labeled x, how does the function distinguish between them?

We don't allow a matrix 'x' and a parameter 'x'. That would make the following expression ambiguous: mxAlgebra(x + y + z, name = 'sum'). Does that algebra refer to the matrix 'x' or the parameter 'x'?

yes, that caught me the other day - I had a 1*1 matrix called 'a' with a labeled cell called 'a': That throws an mxRun() error.

It's a good idea of course, to prevent the user from doing this.

woot!

Seems nice that arg1 ("reference") can be a mixed list of matrix and algebra names and/or parameter labels.

I wonder if reference is the right name? I can't see a better noun, so fine with me.

Can't imagine anyone wanting asymmetric confidence levels, and can be done in two (parallel) runs if needed, and we'd avoid more errors by having just one parameter for CI.

When I typed this for myself, I used "95" rather than .95. But people will get used to either Maybe make 95 the default.

mxNormalInterval(reference, CI=95)