Clark Kogan
2018-10-11 21:26:31 UTC
I am trying to get a sense of whether there is a standard, accepted
method for producing estimates of the probability averaged over as-yet
unobserved individuals, along with confidence intervals on that average
probability, for mixed effects logistic regression.
The basic question is this: for a particular set of covariates, what is the
average probability that people will choose the response y = 1, and how
confident are we in this average?
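To make the target quantity concrete, here is one way it could be written. This is only a sketch under an assumption that is not stated above: a random-intercept logistic model with fixed effects \beta and a normally distributed individual intercept b with variance \sigma_b^2. The symbols x, \beta, b, \sigma_b^2 and \phi (the normal density) are introduced here for illustration.

```latex
\bar{p}(x)
  = \mathrm{E}_{b}\!\left[\operatorname{logit}^{-1}\!\left(x^{\top}\beta + b\right)\right]
  = \int \operatorname{logit}^{-1}\!\left(x^{\top}\beta + b\right)\,
         \phi\!\left(b;\, 0,\, \sigma_b^{2}\right)\, db
```

Under that assumption, this average is in general not equal to logit^{-1}(x^T \beta), the probability for an individual with b = 0, because the inverse logit is nonlinear.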
I have been using the following method:
For the estimate of the average probability, I predict the probability of
y=1 for each individual in the data, and then average the probability over
these individuals. For confidence intervals, I use the non-parametric
bootstrap percentile method (bootstrapping individuals and using the
previous method to estimate the average probability). This typically takes
a while to finish, which is OK, though a quicker frequentist method would be
nice (I know Bayesian methods are probably a lot quicker here).
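The procedure described above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the actual analysis code: the function names, the toy data, and in particular the estimator are assumptions. In a real analysis the estimator would refit the mixed effects logistic regression on each resampled data set and average the predicted probabilities; here a hypothetical stand-in (the mean of each individual's observed proportion of y = 1) keeps the example self-contained.

```python
import random
from statistics import mean


def average_probability(clusters):
    """HYPOTHETICAL stand-in estimator.

    `clusters` is a list with one list of 0/1 responses per individual.
    A real analysis would refit the mixed model here and average the
    predicted probabilities; this placeholder averages the observed
    per-individual proportions of y = 1 instead.
    """
    return mean(mean(ys) for ys in clusters)


def cluster_bootstrap_ci(data, estimator, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling whole individuals.

    `data` maps an individual id to that individual's observations.
    """
    rng = random.Random(seed)
    ids = list(data)
    estimates = []
    for _ in range(n_boot):
        # Draw individuals with replacement; keep each one's data together.
        drawn = rng.choices(ids, k=len(ids))
        estimates.append(estimator([data[i] for i in drawn]))
    estimates.sort()
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Toy data (made up for illustration): 30 individuals, 5 responses each.
toy_rng = random.Random(1)
data = {}
for i in range(30):
    p_i = toy_rng.uniform(0.2, 0.8)  # individual-specific probability
    data[i] = [1 if toy_rng.random() < p_i else 0 for _ in range(5)]

point = average_probability(list(data.values()))
lo, hi = cluster_bootstrap_ci(data, average_probability)
print(f"estimate = {point:.3f}, 95% percentile interval = ({lo:.3f}, {hi:.3f})")
```

Resampling individuals rather than single observations keeps each person's responses together, which is the point of bootstrapping at the individual level. The cost is one model refit per bootstrap replicate, which is why the real version is slow.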
My questions are:
1) Is this in line with what people would suggest?
2) Is there literature available that recommends this approach?
I'm thinking of Gelman's Data Analysis Using Regression and
Multilevel/Hierarchical Models (p. 101) for estimation by first predicting
and then averaging. However, he focuses on predictive differences (which is
not what I'm looking at here), though I assume the suggestion would also hold
for estimating average probabilities that do not involve differences.
For the confidence intervals, the issue is touched on in the GLMM FAQ:
https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html
which mentions that none of the suggested approaches for confidence
intervals / prediction intervals take the random effects into account.
Thanks,
Clark