Another excess memory usage problem - sudden spike when using csolnp

CharlesD

CSOLNP is working quite nicely in general, but I have a few circumstances where things go dramatically wrong and crash my machine (a hard reboot is needed) due to excess memory usage if I don't notice in time. This doesn't occur when using NPSOL. With the earlier memory usage issues ( http://openmx.psyc.virginia.edu/thread/2551 ) memory usage increased gradually, but in this case the increase seems much more sudden.

A problem model can be downloaded from:
https://www.dropbox.com/s/kgpsualdlnaowhs/memprobmodel.RData?dl=0

test <- mxRun(memprobmodel, intervals=T)

edit: I don't know specifically what causes the issue, but I'm making extensive use of algebra, exponential functions, and definition variables.

edit 2: the problem *does* still exist even with the latest updates (26-8-2014). So far I only experience it when calculating confidence intervals. With the above model, after a few minutes of fitting with memory usage at a couple of hundred MB, usage suddenly starts climbing very rapidly. The problem occurs on more than one PC.
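As a stopgap while this is being debugged, the whole run can be pointed at NPSOL, which doesn't show the spike here. A sketch, assuming your OpenMx build bundles the NPSOL optimizer:

```r
# Workaround sketch: select NPSOL globally, since only CSOLNP shows the spike.
# Assumes an OpenMx build that includes the NPSOL optimizer.
library(OpenMx)
mxOption(NULL, "Default optimizer", "NPSOL")
test <- mxRun(memprobmodel, intervals = TRUE)
```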

CharlesD

Ok, right, on my machine with more memory the above model also fits, but memory usage still spikes to 6 GB or so, which illustrates what seems (to me) to be the problem, or at least a potential improvement: NPSOL fits with a steady 100 MB or so. Does memory usage not start climbing rapidly after a few minutes for you two? I'm actually surprised it fits on 32-bit Windows; I would have thought it would definitely hit memory problems. I've been trying to generate a more problematic example but can't at the moment; if I get one that spikes faster or higher, I'll post it.

> mxVersion()
OpenMx version: 2.0.0.0
R version: R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32
Default optimiser: CSOLNP

This is with commit 9ce8fba on the master branch, on Windows 7 and Windows 8 PCs.

RobK
Compiler?

Charles, I take it you are building OpenMx from source on your machine, correct? Which compiler are you using? Do you use the Rtools toolchain?

CharlesD

Yes, building from source, using Rtools.

mhunter
Replication on 64-bit

I'm getting the same behavior on a Windows 7 64-bit machine running 64-bit R 3.1 with the OpenMx binary. It looks like when the confidence intervals start, memory usage climbs quickly and linearly to 100% of RAM. Interestingly, the same machine running the same OpenMx on 32-bit R shows no problem.

RobK
Specific to 64-bit Windows?

I should have tried to reproduce the problem on the 64-bit Windows machine in my office last week before I left for the long weekend... Anyhow, I just ran Charles' memprobmodel2, with intervals=T, and R's memory usage began to climb ceaselessly, as he described. So, it appears to be something specific to confidence intervals, with CSOLNP, under 64-bit Windows.

FWIW:

> mxVersion()
OpenMx version: 2.0.0.3751
R version: R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32
Default optimiser: CSOLNP

neale
Strange version number

Charles

I strongly suspect that this is a bug that has already been fixed, and that you are using an outdated version of the beta. Your version number looks odd: it does not include a build number on the end, like 2.0.0.3766.

When you say commit 9ce8fba I am confused (though others on the dev team may not be). Were you building from source? The SVN tree is currently at version 3776.

Cheers
Mike

jpritikin
git version number

Yeah, commit 9ce8fba is a Git commit hash. We still use SVN as the definitive source code repository, so we need an SVN build number.

CharlesD

I was also surprised by the version number thing... I have an RStudio project linked via Git to the Gitorious OpenMx repository (which is where I got the commit reference from), and I build by telling RStudio to build (after specifying an additional 'install' argument to the make command). This has worked fine in the past for getting updates: I can see the recent source code, including a recent change to the default summary output in which the optimizer is reported.

neale
Looks like a github thing

If you could build from the SVN repository version, per http://openmx.psyc.virginia.edu/wiki/howto-build-openmx-source-repository then I think the problem will go away, and you'll get a sensible version number.

Cheers
Mike

CharlesD

No change in behaviour; the model still goes to 6 GB of memory...

OpenMx version: 2.0.0.3777
R version: R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32
Default optimiser: CSOLNP

Does anybody know if/how I can impose a lower memory limit on 64-bit Windows R? memory.limit() doesn't want to let me decrease it. If I could do this, I assume I would avoid the hard reboots, on Windows 7 at least (my Windows 8 machine behaves more nicely in this instance: instead of the machine bogging down to the point that I can't kill the task, it just pops up a message box complaining about memory usage).

jpritikin
memory limit

I'm not sure how to impose a memory limit in Windows, but you'll need to impose the limit on the application's memory as a whole. OpenMx does not use R's memory in many cases, so an R-level limit is not going to have much effect.
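For what it's worth, a process-wide cap of the kind described is straightforward on Unix-like systems. A sketch only: the script name `run_model.R` is hypothetical, and on Windows a Job Object would be needed instead (not shown):

```shell
# Cap total virtual memory for everything launched from this shell, then
# start R. Allocations beyond the cap fail instead of thrashing the machine.
ulimit -v 4194304          # ~4 GB, expressed in kilobytes
R --no-save < run_model.R  # run_model.R is a hypothetical script name
```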

jpritikin
seems fixed?

Thanks for your report.

CharlesD

Sorry for the confusion; I've edited the top post to reflect my current understanding. The problem still persists, I just only notice it when calculating confidence intervals.

neale
Please run this

mxVersion()

and paste the output into a reply. We are having much difficulty reproducing the error you report, and want to make sure that we are using exactly the same version.

neale
Windows specific?

I could not reproduce this fault with OpenMx from SVN 3766 when run on a Mac Pro. There was no sign of excessive RAM usage (the machine has 64 GB but reported 53 GB free throughout). For the record, here's the output I got with CSOLNP:

> summary(memprobRun)
Summary of ctsem 
 
free parameters:
           name            matrix row col      Estimate    Std.Error lbound ubound
1       drift11     discreteDRIFT   1   1  0.9892736976 8.071140e-05             1
2       drift21     discreteDRIFT   2   1  0.0550340030           NA              
3       drift12     discreteDRIFT   1   2 -0.0532963880           NA              
4       drift22     discreteDRIFT   2   2 -0.0029642474 4.789469e-06             1
5   diffusion11 discreteDIFFUSION   1   1  0.1334714775 8.688719e-03      0       
6   diffusion21 discreteDIFFUSION   2   1  0.0063801653 1.020747e-02              
7   diffusion22 discreteDIFFUSION   2   2  0.2673616861 2.224372e-02      0       
8         cint1      discreteCINT   1   1  0.4267176984 1.185061e-02              
9         cint2      discreteCINT   1   2  3.4857018784 2.580032e-02              
10      T1var11         withinphi   1   1  3.8303294374 1.511289e+00      0       
11      T1var21         withinphi   2   1 -0.1442955883 5.484717e-01              
12      T1var22         withinphi   4   1  0.5387750154           NA      0       
13     T1meanV1           T1MEANS   1   1 17.6330892849 2.204301e-01              
14     T1meanV2           T1MEANS   2   1  4.5930297322 7.128458e-02              
15   traitvar11  discreteTRAITVAR   1   1  0.0005614714 1.897635e-03      0       
16   traitvar21  discreteTRAITVAR   2   1  0.0147575449 3.549271e-03              
17   traitvar22  discreteTRAITVAR   2   2  0.0083684409 1.051363e-02      0       
18 T1traitcov11        T1TRAITCOV   1   1  0.0622806450 1.247996e-01              
19 T1traitcov21        T1TRAITCOV   2   1 -0.4751025866 4.229783e-01              
20 T1traitcov12        T1TRAITCOV   1   2 -0.1015906162           NA              
21 T1traitcov22        T1TRAITCOV   2   2 -0.8484358084           NA              
 
confidence intervals:
                       lbound    estimate      ubound note
ctsem.DRIFT[1,1]   0.02070937   0.0289038  0.02084729  !!!
ctsem.DRIFT[2,1]   0.66084469   0.7923929  0.70102875  !!!
ctsem.DRIFT[1,2]  -0.81542200  -0.7673743 -0.64010226     
ctsem.DRIFT[2,2] -14.30917421 -14.2575784 -1.96650516     
 
observed statistics:  1200 
estimated parameters:  21 
degrees of freedom:  1179 
-2 log likelihood:  1912.234 
number of observations:  100 
Information Criteria: 
      |  df Penalty  |  Parameters Penalty  |  Sample-Size Adjusted
AIC:      -445.7662               1954.234                       NA
BIC:     -3517.2619               2008.942                 1942.619
Some of your fit indices are missing.
  To get them, fit saturated and independence models, and include them with
  summary(yourModel, SaturatedLikelihood=..., IndependenceLikelihood=...). 
timestamp: 2014-08-26 13:29:15 
wall clock time: 147.219 secs 
OpenMx version number: 2.0.0.3766 
Need help?  See help(mxSummary) 

And with NPSOL, which finds a lower minimum (an unusual instance of NPSOL outperforming CSOLNP):

> memprobRun <- mxRun(memprobmodel2, intervals=T)
Running ctsem 
> summary(memprobRun)
Summary of ctsem 
 
free parameters:
           name            matrix row col    Estimate   Std.Error lbound ubound
1       drift11     discreteDRIFT   1   1  0.48053361 0.058810931             1
2       drift21     discreteDRIFT   2   1  0.07088314 0.067632864              
3       drift12     discreteDRIFT   1   2  0.02174985 0.040749825              
4       drift22     discreteDRIFT   2   2  0.58355130 0.060506367             1
5   diffusion11 discreteDIFFUSION   1   1  0.10607987 0.007501988      0       
6   diffusion21 discreteDIFFUSION   2   1  0.01775367 0.007384163              
7   diffusion22 discreteDIFFUSION   2   2  0.20119085 0.014252114      0       
8         cint1      discreteCINT   1   1  9.14676984 1.056070080              
9         cint2      discreteCINT   1   2  0.60300918 1.216352717              
10      T1var11         withinphi   1   1  2.84693038 0.402707390      0       
11      T1var21         withinphi   2   1  0.10749578 0.077824140              
12      T1var22         withinphi   4   1  0.20862826 0.029504750      0       
13     T1meanV1           T1MEANS   1   1 17.70547447 0.168728456              
14     T1meanV2           T1MEANS   2   1  4.50301285 0.045675842              
15   traitvar11  discreteTRAITVAR   1   1  0.72703940 0.196624735      0       
16   traitvar21  discreteTRAITVAR   2   1 -0.08178721 0.096569456              
17   traitvar22  discreteTRAITVAR   2   2  0.00000000 0.021245529     0*       
18 T1traitcov11        T1TRAITCOV   1   1  1.96383353 0.435687274              
19 T1traitcov21        T1TRAITCOV   2   1 -0.31360423 0.347418356              
20 T1traitcov12        T1TRAITCOV   1   2  0.07033241 0.057403361              
21 T1traitcov22        T1TRAITCOV   2   2 -0.01570925 0.017056963              
 
confidence intervals:
                      lbound    estimate     ubound note
ctsem.DRIFT[1,1] -1.02186200 -0.73579299 -0.5211748     
ctsem.DRIFT[2,1]  0.05265773  0.13389238  0.3880235     
ctsem.DRIFT[1,2] -0.10635909  0.04108366  0.1966068     
ctsem.DRIFT[2,2] -0.76725750 -0.54120110 -0.3561074     
 
observed statistics:  1200 
estimated parameters:  21 
degrees of freedom:  1179 
-2 log likelihood:  1673.847 
number of observations:  100 
Information Criteria: 
      |  df Penalty  |  Parameters Penalty  |  Sample-Size Adjusted
AIC:      -684.1532               1715.847                       NA
BIC:     -3755.6488               1770.555                 1704.232
Some of your fit indices are missing.
  To get them, fit saturated and independence models, and include them with
  summary(yourModel, SaturatedLikelihood=..., IndependenceLikelihood=...). 
timestamp: 2014-08-26 12:25:25 
wall clock time: 361.053 secs 
OpenMx version number: 2.0.0.3766 
Need help?  See help(mxSummary) 

Re-running CSOLNP improves the solution somewhat, but it gets stuck again at -2 log likelihood 1890.516, and no improvement was obtained from a third run. It was quite happy to stick with the estimated parameters from NPSOL, though, and returned standard errors without NAs:

> params <- omxGetParameters(memprobRunNPSOL)
> params
     drift11      drift21      drift12      drift22  diffusion11  diffusion21  diffusion22 
  0.48053361   0.07088314   0.02174985   0.58355130   0.10607987   0.01775367   0.20119085 
       cint1        cint2      T1var11      T1var21      T1var22     T1meanV1     T1meanV2 
  9.14676984   0.60300918   2.84693038   0.10749578   0.20862826  17.70547447   4.50301285 
  traitvar11   traitvar21   traitvar22 T1traitcov11 T1traitcov21 T1traitcov12 T1traitcov22 
  0.72703940  -0.08178721   0.00000000   1.96383353  -0.31360423   0.07033241  -0.01570925 
> npsolution <- omxSetParameters(memprobmodel2,labels=names(params),values=params)
> mxOption(NULL, "Default optimizer", "CSOLNP")
> memprobRunCSOLNP <- mxRun(npsolution,intervals=T)
 
> summary(memprobRunCSOLNP)
Summary of ctsem 
 
free parameters:
           name            matrix row col      Estimate   Std.Error lbound ubound
1       drift11     discreteDRIFT   1   1  4.804554e-01 0.058832219             1
2       drift21     discreteDRIFT   2   1  7.086569e-02 0.067959088              
3       drift12     discreteDRIFT   1   2  2.174293e-02 0.040773525              
4       drift22     discreteDRIFT   2   2  5.835050e-01 0.060510100             1
5   diffusion11 discreteDIFFUSION   1   1  1.060794e-01 0.007501939      0       
6   diffusion21 discreteDIFFUSION   2   1  1.775283e-02 0.007383953              
7   diffusion22 discreteDIFFUSION   2   2  2.011832e-01 0.014251074      0       
8         cint1      discreteCINT   1   1  9.148188e+00 1.056562609              
9         cint2      discreteCINT   1   2  6.035270e-01 1.222148807              
10      T1var11         withinphi   1   1  2.847043e+00 0.402804371      0       
11      T1var21         withinphi   2   1  1.075018e-01 0.077831659              
12      T1var22         withinphi   4   1  2.086286e-01 0.029505046      0       
13     T1meanV1           T1MEANS   1   1  1.770547e+01 0.168731787              
14     T1meanV2           T1MEANS   2   1  4.503013e+00 0.045675890              
15   traitvar11  discreteTRAITVAR   1   1  7.272923e-01 0.196782445      0       
16   traitvar21  discreteTRAITVAR   2   1 -8.178111e-02 0.097040229              
17   traitvar22  discreteTRAITVAR   2   2  3.552471e-14 0.021339816     0*       
18 T1traitcov11        T1TRAITCOV   1   1  1.964354e+00 0.436023851              
19 T1traitcov21        T1TRAITCOV   2   1 -3.135785e-01 0.349096086              
20 T1traitcov12        T1TRAITCOV   1   2  7.035303e-02 0.057427140              
21 T1traitcov22        T1TRAITCOV   2   2 -1.570743e-02 0.017101853              
 
confidence intervals:
                     lbound    estimate     ubound note
ctsem.DRIFT[1,1] -1.0155040 -0.73595494 -0.6076324     
ctsem.DRIFT[2,1]  0.0526557  0.13387535  0.1379083     
ctsem.DRIFT[1,2] -0.1025269  0.04107548  0.1966052     
ctsem.DRIFT[2,2] -0.7661326 -0.54127946 -0.3580459     
 
observed statistics:  1200 
estimated parameters:  21 
degrees of freedom:  1179 
-2 log likelihood:  1673.847 
number of observations:  100 
Information Criteria: 
      |  df Penalty  |  Parameters Penalty  |  Sample-Size Adjusted
AIC:      -684.1532               1715.847                       NA
BIC:     -3755.6488               1770.555                 1704.232
Some of your fit indices are missing.
  To get them, fit saturated and independence models, and include them with
  summary(yourModel, SaturatedLikelihood=..., IndependenceLikelihood=...). 
timestamp: 2014-08-26 13:50:29 
wall clock time: 109.1414 secs 
OpenMx version number: 2.0.0.3766 
Need help?  See help(mxSummary) 

CharlesD

Yes, I seem to encounter quite a lot of starting-value sensitivity with more complex continuous-time models, which makes me think a Bayesian approach might work better. I'd love to hear any other suggestions or thoughts for dealing with the issue.
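For the starting-value sensitivity specifically, one option is OpenMx's mxTryHard(), which refits a model several times from randomly perturbed starting values and keeps the best solution. A sketch, using the model name from earlier in this thread:

```r
# Sketch: mxTryHard() perturbs starting values and refits repeatedly,
# returning the best fit found. 'memprobmodel2' is the model posted above.
library(OpenMx)
fit <- mxTryHard(memprobmodel2, extraTries = 10, intervals = TRUE)
summary(fit)
```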

RobK
I cannot reproduce it either

I cannot reproduce it with CSOLNP either, on 32-bit Windows.

Edit: With revision 3751.

mhunter

When using 32-bit R 3.1.0 on Windows with the OpenMx beta binary, I don't get any huge memory usage. With R and various background processes running, I'm using 2.24 GB of RAM. Running the example model with intervals=TRUE, usage hangs around 2.25 GB for a while and eventually (probably when computing the intervals) climbs slowly to 2.45 GB. After the model is done, everything goes back down to around 2.25 GB. That corresponds to between 27% and 30% usage. Nothing out of the ordinary to me; it sounds like I'm not replicating this problem.

CharlesD

Ok, I also don't get the issue with 32-bit R; memory usage remains very low. When I switch back to 64-bit, I use all the spare physical memory on my laptop (6 GB) and Windows 'commits' 16 GB of virtual memory to the process (I'm not clear on what that commitment actually means: is it using the memory, or just prepared to use it in some way? This is according to the Windows 8 resource monitor).

But now I'm embarrassed... in the example I posted, the confidence intervals are set on an algebra. When I correctly set them on the 'discreteDRIFT' matrix rather than the 'DRIFT' algebra (the confusion arose because I've been switching between different parameter sets to work out which optimizes best), things work fine. I'll be surprised, though I won't say it's impossible, if this was the problem in the other cases. I'm impressed that confidence intervals can be estimated on an algebra in the first place; is that intended?
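For anyone following along, the distinction is just which name gets passed to mxCI() when building the model. A sketch, with the object names taken from the summaries above:

```r
# Sketch of the two CI requests discussed (names from the model above):
library(OpenMx)
ci_on_matrix  <- mxCI("discreteDRIFT")  # CIs on the matrix of free parameters
ci_on_algebra <- mxCI("DRIFT")          # CIs on an algebra; also supported
# Either mxCI object is then added to the model before mxRun, e.g.:
# memprobmodel <- mxModel(memprobmodel, ci_on_matrix)
```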

neale
Confidence Intervals on Algebra

Yes, that is a fully intended feature which has been present in classic Mx since 1995 and was designed into OpenMx from its earliest days.

I do hope that the memory issues are solved. Running the problem with Valgrind did not reveal any memory leaks. We really appreciate your input - keep the comments coming!

CharlesD

Ok, just confirming that the issue does happen when I set confidence intervals on a free parameter, as I normally would... no example, as I didn't catch it before the PC froze. I'll go back to 32-bit R for the time being.

RobK
I'm working on this

On Friday, I was running your memory-problematic model on a 64-bit Windows machine, under a debugger. When I compile without enabling multithreading, I notice that it doesn't hog memory, but it does appear to hang indefinitely. I'm trying to figure out where it gets stuck.

EDIT: Actually, I can tell from checkpointing that it's not hanging; it's just running much more slowly in debug mode than I expected. I also managed to trigger the memory leak on my 32-bit machine by running Charles' model repeatedly with mxTryHard() (in a build from trunk).