ML (or bootstrapped) c.i.'s run in parallel

11 replies [Last post]
svrieze's picture
Offline
Joined: 01/03/2010

I just thought I'd suggest again the usefulness of a simple user option to obtain c.i.'s using multiple cores (like through snowfall). Could be bootstrapped, could be ML.

I'm routinely it seems running into problems with many parameters. Ideally I'd want c.i.'s on all of them, but computation proceeds too slowly for this to be feasible.

I'm sure you get this often, but OpenMx is fabulous. Longtime R and S users are most grateful.

-Scott

svrieze's picture
Offline
Joined: 01/03/2010
mxCI issue

Hi Mike,

I'm working in a behavior genetic model and computing confidence intervals on parameters in cholesky decomps. Somewhat routinely the intervals were symmetric about zero, even though the parameter should have been highly significant. For example, the parameter estimate was .55 with 95% c.i. = [-.69, .69]. I'm not sure how to fix a bug like this --- the sign is unimportant for this parameter (as long as the signs on other relevant parameters within the cholesky decomposition accomodate the sign of the parameter of interest) and one would obtain the same likelihood with .55 or -.55. Once confidence intervals are computed the sign becomes crucial, however.

-Scott

mspiegel's picture
Offline
Joined: 07/31/2009
To double-check, you have

To double-check, you have specified lower bounds of 0.0 for those free parameters?

svrieze's picture
Offline
Joined: 01/03/2010
no, I haven't.

no, I haven't.

tbrick's picture
Offline
Joined: 07/31/2009
Try setting the lbound to 0.0

Try setting the lbound to 0.0 for that free parameter.

In cases where the positive and negative versions of the solution are both valid, it's fairly common in the CI calculation to jump the gap around zero and report the bounds as the upper bound of the positive solution and the lower bound of the negative solution. That's a side effect of calculating likelihood-based confidence intervals. The result is that you get symmetric confidence bounds around zero where you'd expect something else.

Since the sign doesn't matter, you can pick one (usually positive) and force it into that solution. You can do that by adding lbound=0.0 to the mxMatrix() or mxPath() functions that create the free parameters. That keeps the optimization from dropping into the area of the negative solution.

In many cases (cholesky models included), it's not necessary to put bounds on all the parameters; you just need enough to force it into either the positive solution or the negative one. And in any case where the relative sign of two parameters matters, you only want/need to constrain one of them--the other will be constrained by the model and the data. I think the general recommendation, though, is to put a lower bound of 0.0 on any parameter whose sign doesn't matter at all.

With a lower bound at 0.0, OpenMx will return the (strictly positive) confidence bounds for the free parameter. In the case that the confidence interval includes zero, you should get zero back as your lower confidence bound.

Steve's picture
Offline
Joined: 07/30/2009
You are going to love version

You are going to love version 1.1.

The functionality is on its way, not only for multiple cores, but also for clusters and for heterogeneous groups of computers. Suppose you work in an area that has 10 Macs and/or Linux machines that aren't being used at night. You will be able to connect them all into one virtual cluster and compute things like bootstraps, CIs, and simulations as one large virtual job, using every core on every machine. This is in testing now and we hope to have a beta out soon.

In the mean time, if you are willing to compile from the svn repository, there is a working multicore parallel ML CI function. Here's a code snip from Michael Spiegel illustrating how to calculate ML CIs for all parameters in a RAM model using the current version in svn:

If the old workflow ends with:

modelOut <- mxRun(model)

then the new workflow ends with:

model <- mxModel(model, mxCI('A'), mxCI('S'))
modelOut <- mxRun(model)
modelCI <- omxParallelCI(modelOut)

All CIs for all free parameters are calculated using all available cores.

The only caveat here is that since this in newly implemented, the syntax may change as we continue to test it. If you have syntax suggestions we certainly welcome them!

tbates's picture
Offline
Joined: 07/31/2009
My suggestion is that

My suggestion is that omxParallelCI() gets called transparently by mxRun when intervals = T and swift or snowfall are initialised.

FYI: this is a nice article on the state of the play in computing: China has pole position for the minute, based on an Intel/NVIDIA Tesla GPU solution.

Our local system has 4.147 TFLOPS worth of Tesla GPU - Be great to bring them into play: matrices melting like pixels :-)

Ars technica article

mspiegel's picture
Offline
Joined: 07/31/2009
Yes omxParallelCI() should be

Yes omxParallelCI() should be merged into mxRun(). But we haven't written very many tests that make use of independent submodels. I fixed some bugs that showed up as I was writing omxParallelCI(). But I would like the parallel CIs to be tested some more before they are moved into the default case.

svrieze's picture
Offline
Joined: 01/03/2010
omxParallelCI()

I take it that omxParallelCI() is still the function to use? mxRun(model, intervals=TRUE) doesn't seem to use more than 1 cpu, whereas omxParallelCI() appears to work well.

Did anything ever happen with the GPU idea?

-Scott

svrieze's picture
Offline
Joined: 01/03/2010
It's me again

I see some references to "swift" software for parallel processing. Are there other resources for using OpenMx on a cluster with PBS batch queuing system? I see that SNOW might work?

-Scott

mspiegel's picture
Offline
Joined: 07/31/2009
Swift will work in a PBS

Swift will work in a PBS cluster environment. See http://www.ci.uchicago.edu/wiki/bin/view/SWFT/SwiftR. Snowfall will also work. Below is a copy/paste of directions I have used in the past to run snowfall in a PBS environment. Note you will need to have the snowfall ports open on the cluster. I don't know which ports are needed, I happen to have access to a cluster with all ports open on the compute nodes.

...After messing around with some alternatives, I found the simplest solution was to use the sockets interface in snowfall. Provided that processes are allowed to open arbitrary ports on cluster nodes, it is easy to setup.

There is one slight catch. The R interpreter has a maximum number of open sockets as a compile-time constant. The (max # of open sockets) == (max degree of parallelism) using the sockets interface. This number has been increased in subsequent R versions, as of R 2.12.0 the number is 128. For me, it was no trouble to grab a copy of the R source, change this number to 1024, and recompile. In the file src/main/connections.c it is #defined as NCONNECTIONS.

Here is all that is needed in R to use the sockets interface when
running a PBS script:

library(OpenMx)
library(snowfall)
hostfile <- Sys.getenv(c("PBS_NODEFILE"))
hostnames <- read.table(hostfile, stringsAsFactors=FALSE)[[1]]
ncpus <- length(hostnames)
sfSetMaxCPUs(ncpus)
sfInit(parallel=TRUE, cpus=ncpus, socketHosts = hostnames)
sfLibrary(OpenMx)

And the PBS directive is:

#PBS -l select=X:mpiprocs=Y:ncpus=Y

In order to use X machines, where each machine allocates Y processes, for a total of X * Y cores.

svrieze's picture
Offline
Joined: 01/03/2010
Worked great!

Thanks Mike! That's really going to speed things up.

FYI, I had to tweak the PBS directive just a bit for my system. I'll paste it here in case others are using a PBS system.

#PBS -l nodes=10:ppn=12,mem=12GB,walltime=5:00:00

Where there are 10 nodes, 12 processors per node, I need 12GB memory, and 5 horus of walltime.