Combine calculations into the E-step (helps two-tier more)
If you want to explore the estimation options, such as comparing
speed with and without caching, this snapshot has everything
working and selectable at runtime. Since two-tier gains so much
by not caching and regular BA81 is no slower without the cache,
the next change set will remove the cache option along with a
bunch of code.