Ramon Diaz-Uriarte
2018-06-15 07:42:20 UTC
Dear all,
(Yet another question about response variables in [0, 1]).
I'd like to fit models that include random effects to response variables
in [0, 1] (i.e., they can take any value between 0 and 1, including 0 and
1). The response variables are averages of values that are themselves
neither proportions nor binary variables[1].
I'd like to avoid *zero-one-inflated (beta) models* (available, e.g., in
brms), because I do not think that the 0s and 1s are governed by a
different process than the values in (0, 1).
Using a *beta model* (e.g., as available in glmmTMB) after transforming
the response via the sometimes recommended (y * (n−1) + 0.5) / n (see, e.g.,
https://cran.r-project.org/web/packages/betareg/vignettes/betareg.pdf)
does not seem ideal, since it is not clear what "n" should be, and about
5% of my values are exactly 0 or 1.
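To illustrate why the choice of "n" matters, here is a minimal Python sketch of that transformation. The function name `squeeze` and the values of n are hypothetical, chosen only for illustration; nothing here says what "n" ought to be for these data.

```python
def squeeze(y, n):
    """Compress y from [0, 1] into (0, 1) via (y * (n - 1) + 0.5) / n.

    `squeeze` is a hypothetical helper name; n is treated as a free
    parameter here, which is exactly the ambiguity discussed above.
    """
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must lie in [0, 1]")
    if n <= 1:
        raise ValueError("n must be greater than 1")
    return (y * (n - 1) + 0.5) / n


# The boundary values move to 0.5 / n and 1 - 0.5 / n,
# while 0.5 is left unchanged for any n.
for n in (10, 100, 1000):
    print(n, squeeze(0.0, n), squeeze(0.5, n), squeeze(1.0, n))
```

Since the exact 0s and 1s end up at 0.5 / n and 1 - 0.5 / n, their distance from the boundary on the logit scale depends entirely on n, so with about 5% boundary values the choice is unlikely to be harmless.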
An alternative would be a *fractional response model*: a binomial
model that accounts for overdispersion. In
https://stats.stackexchange.com/a/233664, using glmer, it is suggested that
overdispersion can be accounted for by adding a random effect for each
observation (i.e., each row of the data); we would also pass a weights vector to
avoid the warning about non-integer values. But an update from January 2018
indicates that this might not be a valid approach (and my own experiments with
my data make me uneasy).
Instead of glmer, I could fit a binomial model using MCMCglmm or INLA,
both of which accommodate observation-level random effects (I guess I could
try this with brms, too). I am not sure this is sensible, though, for this
type of response variable.
Any suggestions?
Thanks,
[1] One of the variables, for example, is the Jensen-Shannon divergence
between two distributions, rescaled from [0, log(2)] to [0, 1]
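
As a minimal sketch of the rescaling in [1], assuming two discrete distributions given as probability vectors over the same categories (the function names `kl` and `jsd_unit` are hypothetical, and the actual distributions in my data may be computed differently):

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) with natural logs.

    Terms with p_i == 0 contribute 0 by convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd_unit(p, q):
    """Jensen-Shannon divergence rescaled from [0, log(2)] to [0, 1]."""
    # Mixture distribution; m_i > 0 whenever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return jsd / math.log(2)


print(jsd_unit([0.5, 0.5], [0.5, 0.5]))  # identical distributions
print(jsd_unit([1.0, 0.0], [0.0, 1.0]))  # disjoint supports
print(jsd_unit([0.7, 0.3], [0.4, 0.6]))  # something in between
```

Identical distributions give exactly 0 and distributions with disjoint support give exactly 1, so boundary values can arise from the same calculation as the interior ones.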
--
Ramon Diaz-Uriarte
Department of Biochemistry, Lab B-25
Facultad de Medicina
Universidad Autónoma de Madrid
Arzobispo Morcillo, 4
28029 Madrid
Spain
Phone: +34-91-497-2412
Email: ***@gmail.com
***@iib.uam.es
http://ligarto.org/rdiaz