## Poisson Counts

5 replies [Last post]
Offline
Joined: 01/14/2010

I've asked about this before but I don't think I ever received any replies. Can OpenMx handle Poisson counts as observed variables? I have counts that are too small to treat as continuous Normally distributed outcomes and I need a way to model them. Any insights would be greatly appreciated.

Offline
Joined: 01/24/2014
Some things germane to the thread topic

Currently, we plan to implement built-in support for count distributions in a future version of OpenMx, made available via the UI for IFA models.

There is a demo in the OpenMx test suite that shows how to do a generalized estimating equations (GEE) analysis of repeated measures on an overdispersed count variable. It includes a loop to calculate robust standard errors from the "sandwich estimator" of the regression coefficients' repeated-sampling covariance matrix. However, the specification in OpenMx is rather cumbersome, and it will usually be easier to do GEE in software designed specifically for it (e.g., the R package 'gee').

Also, I have a recent paper on the use of multivariate discrete distributions to estimate biometric variance components from overdispersed count data collected in a twin study.

salarim1 (not verified)
poisson gmm

hi
I want to add poisson likelihood function in growth mixture model for dependent variable.I need to model count variable.
Any insights would be greatly appreciated.

Offline
Joined: 04/19/2011
Yes, but it's not built-in

Yes, OpenMx can utilize the Poisson likelihood function, but it is not built-in. You will have to set up your model using an algebra objective or an R objective. See my examples below. Note that if your count variable takes relatively few unique values in your sample, you could consider ordinalizing it and doing the analysis as an ordinal thresholds model instead.

Let's first simulate some data for some simple examples:

```dat <- cbind( rpois(1000,1), 1, rnorm(1000) )
dimnames(dat) <- list(NULL,c("y","x0","x1"))```

If we want to do a simple Poisson regression, setting up the model to use a row-algebra objective would look something like this:

```PoisAlgModel <- mxModel(
"PoissonAlgebra",
mxData(observed=dat,type="raw"),
mxMatrix(type="Full",nrow=2,ncol=1,free=T,values=c(1,0),labels=c("b0","b1"),name="b"),
mxMatrix(type="Full",nrow=1,ncol=2,free=F,labels=c("data.x0","data.x1"),name="xt"),
mxAlgebra( exp(xt%*%b), name="yhat", dimnames=list(NULL,"y")),
mxAlgebra( -2*filteredDataRow*log(yhat) + 2*yhat, name="rowAlgebra" ),
mxAlgebra( sum(rowResults), name="reduceAlgebra" ),
mxRowObjective(rowAlgebra="rowAlgebra",reduceAlgebra="reduceAlgebra",dimnames=c("y"))
)
PoisAlgFit <- mxRun(PoisAlgModel)```

If we want to do the same analysis, but with an R objective, the code would look something like this:

```Poisobj <- function(model,state){
modeldata <- model\$data@observed
b0 <- model\$b@values[1,1]
b1 <- model\$b@values[2,1]
return(-2*sum(apply(modeldata,1,function(row){dpois(x=row[1],lambda=exp(b0*row[2] + b1*row[3]),log=T)})))
}
PoisRModel <- mxModel(
"PoissonAlgebra",
mxData(observed=dat,type="raw"),
mxMatrix(type="Full",nrow=2,ncol=1,free=T,values=c(1,0),labels=c("b0","b1"),name="b"),
mxRObjective(Poisobj)
)
PoisRFit <- mxRun(PoisRModel)```

Note that the two models' likelihood functions are not equal, but merely proportional. So, the parameter estimates should be the same (or very very close), but the value of the -2loglikelihood won't be. It sounds as though you have a longitudinal mixture analysis in mind, so naturally, your objective function will be way more complicated than the ones I demonstrated here, but I hope I've given enough to get you started. I might be able to help you more if you say more about the analysis you want to do.

Finally, I feel I certainly should warn you about the problem of overdispersion, in case you don't know. If your count variable's variance appreciably exceeds its mean, even after conditioning on covariates, you should not trust the standard errors, confidence intervals, and likelihood-ratio tests from models using the classic Poisson likelihood. You would need to adjust those inferential results appropriately, or use a different objective function. The point estimates of your regression coefficients won't be biased by the overdispersion, though.

Offline
Joined: 11/08/2014

I have poisson count data that is zero-inflated. I was wondering if OpenMx can handle that as well?

Thanks!

Offline
Joined: 04/19/2011
Sure.

No reason why not. The zero-inflated Poisson distribution simply has a slightly different PMF. Zero-inflated Poisson is a mixture distribution, but since it's a discrete mixture, it's pretty easy to work with.

Let's first simulate some data for some simple examples:

```u <- runif(1000,0,1)
v <- rpois(1000,1)
y <- ifelse(u<=0.4,0,v)
dat <- cbind( y, 1, rnorm(1000) )
dimnames(dat) <- list(NULL,c("y","x0","x1"))```

In the upcoming OpenMx 2.0, it will be practical to do this analysis with a row algebra fitfunction. I'll assume you're still using 1.3.2/1.4, so setting up the model with an R objective would look like this:

```ZIPoisobj <- function(model,state){
modeldata <- model\$data@observed
b0 <- model\$b@values[1,1]
b1 <- model\$b@values[2,1]
gam0 <- model\$gamma@values[1,1]
gam1 <- model\$gamma@values[2,1]
lambda <- exp(b0*modeldata[,2] + b1*modeldata[,3])
w <- exp(gam0*modeldata[,2] + gam1*modeldata[,3]) / (1+exp(gam0*modeldata[,2] + gam1*modeldata[,3]))
iszero <- modeldata[,1]==0
return(-2*sum(log(iszero*w + (1-w)*dpois(modeldata[,1],lambda))))
}
ZIPoisModel <- mxModel(
"ZIPois",
mxData(observed=dat,type="raw"),
mxMatrix(type="Full",nrow=2,ncol=1,free=T,values=c(1,0),labels=c("b0","b1"),name="b"),
mxMatrix(type="Full",nrow=2,ncol=1,free=T,values=c(1,0),labels=c("gam0","gam1"),name="gamma"),
mxRObjective(ZIPoisobj)
)
ZIPoisRun <- mxRun(ZIPoisModel)```

Note that the distribution has two parameters: the parameter of the Poisson mixture component (lambda) and the zero-inflation proportion (w). Both can be conditioned upon covariates, as I've done here, using a logit link for the zero-inflation proportion. In general, you can use different sets of covariates for the two parameters.

Edit: I might be able to be more helpful if you mentioned a few more details about your dataset and the analysis you're trying to do.