[R-sig-ME] glmmTMB: how to calculate posterior prob. of structural zero?

If you're fitting a zero-inflated Poisson model, it would be
especially easy because the zero probability is simply exp(-lambda), so
I believe the "posterior"(ish) probability that the zero is due to the
structural rather than conditional part of the model would be

P(zero|structural)/P(zero) =
P(zero|structural)/(P(zero_structural)+P(zero_conditional))

or

prob_struc <- function(model) {
strucprob <- predict(model, type="zprob")
condprob <- exp(-predict(model, type="conditional"))
return(strucprob/(strucprob+condprob))
}

For a negative binomial ("nbinom2", or quadratic parameterization) the
zero probability is (k/(k+mu))^k (I think ... e.g.
dnbinom(0,mu=0.5,size=0.5) is sqrt(0.5)). sigma(model) returns the
overdispersion parameter k, so a function as above but with

k <- sigma(model)
condprob <- (k/(k+predict(model,type="conditional")))^k

inserted should do the trick. (Please test these yourself and make
sure they are sensible before proceeding!)

Post by Aaron Mackey
I'm happily using glmmTMB to fit zero-inflated count models with my data,
but I'd like to also know which zeroes in my data are more likely (or not)
to be structural vs. expected from the conditional distribution. I know how
to use Bayes formula to calculate the posterior, and predict(zinb,
type="zprob") gives me the prior probabilites for each data point being
structural or not (respecting the zero inflation part of the model), and
the likelihoods for the structural components are 1 (if the data point is a
zero) or 0 (if the data point is not a zero) -- but is there a way to
extract the likelihood for each zero data point with respect to the
conditional part of the model?
thanks,
-Aaron
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Aaron Mackey

2018-10-05 20:07:15 UTC

Thanks for this, it seems to provide sensible numbers; but just to confirm,
does predict(model, type="conditional") actually use the Y variable
observed count? Or does it generate the expected log count from the
provided X variables (as in the case when newdata is provided)?

-Aaron

Post by Ben Bolker
If you're fitting a zero-inflated Poisson model, it would be
especially easy because the zero probability is simply exp(-lambda), so
I believe the "posterior"(ish) probability that the zero is due to the
structural rather than conditional part of the model would be
P(zero|structural)/P(zero) =
P(zero|structural)/(P(zero_structural)+P(zero_conditional))
or
prob_struc <- function(model) {
strucprob <- predict(model, type="zprob")
condprob <- exp(-predict(model, type="conditional"))
return(strucprob/(strucprob+condprob))
}
For a negative binomial ("nbinom2", or quadratic parameterization) the
zero probability is (k/(k+mu))^k (I think ... e.g.
dnbinom(0,mu=0.5,size=0.5) is sqrt(0.5)). sigma(model) returns the
overdispersion parameter k, so a function as above but with
k <- sigma(model)
condprob <- (k/(k+predict(model,type="conditional")))^k
inserted should do the trick. (Please test these yourself and make
sure they are sensible before proceeding!)

Post by Aaron Mackey
I'm happily using glmmTMB to fit zero-inflated count models with my data,
but I'd like to also know which zeroes in my data are more likely (or

not)

Post by Aaron Mackey
to be structural vs. expected from the conditional distribution. I know

how

Post by Aaron Mackey
to use Bayes formula to calculate the posterior, and predict(zinb,
type="zprob") gives me the prior probabilites for each data point being
structural or not (respecting the zero inflation part of the model), and
the likelihoods for the structural components are 1 (if the data point

is a

Post by Aaron Mackey
zero) or 0 (if the data point is not a zero) -- but is there a way to
extract the likelihood for each zero data point with respect to the
conditional part of the model?
thanks,
-Aaron
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

[[alternative HTML version deleted]]

Ben Bolker

2018-10-05 20:30:15 UTC

Thanks for this, it seems to provide sensible numbers; but just to confirm, does predict(model, type="conditional") actually use the Y variable observed count? Or does it generate the expected log count from the provided X variables (as in the case when newdata is provided)?

It generates the expected *count* (not the log count) *of the
conditional part of the model only* (i.e. not including structural
zeros), based on the estimated parameters and a model matrix (either
the one from the model or a newly generated one, if newdata is
specified)

-Aaron

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Aaron Mackey

2018-10-06 00:29:53 UTC

Right -- so that's not what we need, is it? We need the probability of
having actually seen a zero, given the conditional model (regardless of
what the expected count might be). The probability of seeing a zero in the
other part of the mixture is 1.0, and the prior for both of the two models
is the mixture coefficient (and 1-coef). But we still need the likelihood
of the zero, given the conditional model, not the likelihood of seeing the
expected value.

-Aaron

Post by Aaron Mackey
Thanks for this, it seems to provide sensible numbers; but just to

confirm, does predict(model, type="conditional") actually use the Y
variable observed count? Or does it generate the expected log count from
the provided X variables (as in the case when newdata is provided)?
It generates the expected *count* (not the log count) *of the
conditional part of the model only* (i.e. not including structural
zeros), based on the estimated parameters and a model matrix (either
the one from the model or a newly generated one, if newdata is
specified)

Post by Aaron Mackey
-Aaron

Post by Aaron Mackey
I'm happily using glmmTMB to fit zero-inflated count models with my

data,

Post by Aaron Mackey
but I'd like to also know which zeroes in my data are more likely (or

not)

Post by Aaron Mackey
to be structural vs. expected from the conditional distribution. I

know how

Post by Aaron Mackey
to use Bayes formula to calculate the posterior, and predict(zinb,
type="zprob") gives me the prior probabilites for each data point

being

Post by Aaron Mackey
structural or not (respecting the zero inflation part of the model),

and

Post by Aaron Mackey
the likelihoods for the structural components are 1 (if the data

point is a

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

[[alternative HTML version deleted]]

Aaron Mackey

2018-10-06 00:33:31 UTC

Nevermind, I think I see my mistake -- the conditional expectation gives
the new mean, and with the dispersion you get a P value for having seen
zero, given the mean and dispersion. Sorry to be daft. If I wanted the
probability of some value other than zero, then the formula would contain
that actual value. thanks again! -Aaron

Post by Aaron Mackey
Right -- so that's not what we need, is it? We need the probability of
having actually seen a zero, given the conditional model (regardless of
what the expected count might be). The probability of seeing a zero in the
other part of the mixture is 1.0, and the prior for both of the two models
is the mixture coefficient (and 1-coef). But we still need the likelihood
of the zero, given the conditional model, not the likelihood of seeing the
expected value.
-Aaron

Post by Aaron Mackey
Thanks for this, it seems to provide sensible numbers; but just to

Post by Aaron Mackey
-Aaron

Post by Aaron Mackey
I'm happily using glmmTMB to fit zero-inflated count models with my

data,

Post by Aaron Mackey
but I'd like to also know which zeroes in my data are more likely

(or not)

Post by Aaron Mackey
to be structural vs. expected from the conditional distribution. I

know how

Post by Aaron Mackey
to use Bayes formula to calculate the posterior, and predict(zinb,
type="zprob") gives me the prior probabilites for each data point

being

Post by Aaron Mackey
structural or not (respecting the zero inflation part of the model),

and

Post by Aaron Mackey
the likelihoods for the structural components are 1 (if the data

point is a