Discussion:
[R-sig-ME] fitting beta and zero mixture model containing both nested and crossed random effects
Meng Liu
2018-06-09 19:06:32 UTC
Permalink
To whom it may concern,

I am trying to fit a model for a data among which the response value is
within [0,1). I am thinking about fitting the zeros as a complete separate
category from the non-zero data, i.e. a binomial (Bernoulli) model to "==0
vs >0" and a Beta model to the >0 responses. Also, my data contains both
nested factors and crossed factors, which means I need to add nested random
effects and crossed random effects to both logistic model part and beta
model model. However, I didn't find any R packages can do exactly what I
want (By far I found gamlss, glmmTMB, zoib but they either can only assume
random zero or they can only fit repeated measures/clustered data but not
nested and crossed design). Therefore, I am wondering if any one know if
there is any available package or function can do this.

Thank you very much for your help!

Best regards

Meng

[[alternative HTML version deleted]]
Guillaume Chaumet
2018-06-10 15:04:22 UTC
Permalink
---------- Forwarded message ----------
From: Guillaume Chaumet <***@gmail.com>
Date: 2018-06-10 17:03 GMT+02:00
Subject: Re: [R-sig-ME] fitting beta and zero mixture model containing
both nested and crossed random effects
To: Meng Liu <***@usc.edu>


brms: https://cran.r-project.org/web/packages/brms/index.html
Post by Meng Liu
To whom it may concern,
I am trying to fit a model for a data among which the response value is
within [0,1). I am thinking about fitting the zeros as a complete separate
category from the non-zero data, i.e. a binomial (Bernoulli) model to "==0
vs >0" and a Beta model to the >0 responses. Also, my data contains both
nested factors and crossed factors, which means I need to add nested random
effects and crossed random effects to both logistic model part and beta
model model. However, I didn't find any R packages can do exactly what I
want (By far I found gamlss, glmmTMB, zoib but they either can only assume
random zero or they can only fit repeated measures/clustered data but not
nested and crossed design). Therefore, I am wondering if any one know if
there is any available package or function can do this.
Thank you very much for your help!
Best regards
Meng
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Ben Bolker
2018-06-12 01:55:48 UTC
Permalink
I think Guillaume is right that brms can do this, but I don't see why
glmmTMB *can't* do all of this (fixed and (nested and/or crossed
random effects)) in both the binomial (logistic) and beta components)

On Sun, Jun 10, 2018 at 11:04 AM, Guillaume Chaumet
Post by Guillaume Chaumet
---------- Forwarded message ----------
Date: 2018-06-10 17:03 GMT+02:00
Subject: Re: [R-sig-ME] fitting beta and zero mixture model containing
both nested and crossed random effects
brms: https://cran.r-project.org/web/packages/brms/index.html
Post by Meng Liu
To whom it may concern,
I am trying to fit a model for a data among which the response value is
within [0,1). I am thinking about fitting the zeros as a complete separate
category from the non-zero data, i.e. a binomial (Bernoulli) model to "==0
vs >0" and a Beta model to the >0 responses. Also, my data contains both
nested factors and crossed factors, which means I need to add nested random
effects and crossed random effects to both logistic model part and beta
model model. However, I didn't find any R packages can do exactly what I
want (By far I found gamlss, glmmTMB, zoib but they either can only assume
random zero or they can only fit repeated measures/clustered data but not
nested and crossed design). Therefore, I am wondering if any one know if
there is any available package or function can do this.
Thank you very much for your help!
Best regards
Meng
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Guillaume Chaumet
2018-06-13 08:26:07 UTC
Permalink
My bad, I replied to you the first time without including the list.
Regarding your last question, perhaps the list and/or Ben could
provide a more accurate answer than me.
I'm also curious to know how glmmTMB could do that
Hi Guillaume,
Thank you so much for this! I just have another question: for example if I
have random factor A and B in both logistic model part and beta model part,
then after I fit the whole model and got variance component estimation of
random effect for factor A and B for both logistic model part and beta model
model part, will there be any way to combine variance together? I.e. I can
estimate a total variance from factor A, and a total variance from factor B
(i.e. only differ by factor, not model)? Something like variance
decomposition but I believe here is more complex as this is a mixture model.
Thank you again for all your help
Best regards,
Meng
On Sun, Jun 10, 2018 at 11:03 AM, Guillaume Chaumet
https://urldefense.proofpoint.com/v2/url?u=https-3A__cran.r-2Dproject.org_web_packages_brms_index.html&d=DwIBaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=Ij73g98b5MaGitndhxmoIw&m=Uy-z_keMG1SfZG-g8FxVqzfz-Ghl2OHun7TY7tfexwo&s=Gfi89kd1PSimpIhWBglYPuJRn3_FF_uNBGvzVDvWe4A&e=
Post by Meng Liu
To whom it may concern,
I am trying to fit a model for a data among which the response value is
within [0,1). I am thinking about fitting the zeros as a complete separate
category from the non-zero data, i.e. a binomial (Bernoulli) model to "==0
vs >0" and a Beta model to the >0 responses. Also, my data contains both
nested factors and crossed factors, which means I need to add nested random
effects and crossed random effects to both logistic model part and beta
model model. However, I didn't find any R packages can do exactly what I
want (By far I found gamlss, glmmTMB, zoib but they either can only assume
random zero or they can only fit repeated measures/clustered data but not
nested and crossed design). Therefore, I am wondering if any one know if
there is any available package or function can do this.
Thank you very much for your help!
Best regards
Meng
[[alternative HTML version deleted]]
_______________________________________________
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Dsig-2Dmixed-2Dmodels&d=DwIBaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=Ij73g98b5MaGitndhxmoIw&m=Uy-z_keMG1SfZG-g8FxVqzfz-Ghl2OHun7TY7tfexwo&s=FMNtOORgf7OlXhD5m8VHoGCnuWlt5NLqtXxalxQOhQw&e=
Ben Bolker
2018-06-13 13:38:59 UTC
Permalink
I'm not sure how this (variance decomposition based on a
zero-inflated model) would work.
What is your subject-area/scientific question?

On Wed, Jun 13, 2018 at 4:26 AM, Guillaume Chaumet
Post by Guillaume Chaumet
My bad, I replied to you the first time without including the list.
Regarding your last question, perhaps the list and/or Ben could
provide a more accurate answer than me.
I'm also curious to know how glmmTMB could do that
Hi Guillaume,
Thank you so much for this! I just have another question: for example if I
have random factor A and B in both logistic model part and beta model part,
then after I fit the whole model and got variance component estimation of
random effect for factor A and B for both logistic model part and beta model
model part, will there be any way to combine variance together? I.e. I can
estimate a total variance from factor A, and a total variance from factor B
(i.e. only differ by factor, not model)? Something like variance
decomposition but I believe here is more complex as this is a mixture model.
Thank you again for all your help
Best regards,
Meng
On Sun, Jun 10, 2018 at 11:03 AM, Guillaume Chaumet
https://urldefense.proofpoint.com/v2/url?u=https-3A__cran.r-2Dproject.org_web_packages_brms_index.html&d=DwIBaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=Ij73g98b5MaGitndhxmoIw&m=Uy-z_keMG1SfZG-g8FxVqzfz-Ghl2OHun7TY7tfexwo&s=Gfi89kd1PSimpIhWBglYPuJRn3_FF_uNBGvzVDvWe4A&e=
Post by Meng Liu
To whom it may concern,
I am trying to fit a model for a data among which the response value is
within [0,1). I am thinking about fitting the zeros as a complete separate
category from the non-zero data, i.e. a binomial (Bernoulli) model to "==0
vs >0" and a Beta model to the >0 responses. Also, my data contains both
nested factors and crossed factors, which means I need to add nested random
effects and crossed random effects to both logistic model part and beta
model model. However, I didn't find any R packages can do exactly what I
want (By far I found gamlss, glmmTMB, zoib but they either can only assume
random zero or they can only fit repeated measures/clustered data but not
nested and crossed design). Therefore, I am wondering if any one know if
there is any available package or function can do this.
Thank you very much for your help!
Best regards
Meng
[[alternative HTML version deleted]]
_______________________________________________
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Dsig-2Dmixed-2Dmodels&d=DwIBaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=Ij73g98b5MaGitndhxmoIw&m=Uy-z_keMG1SfZG-g8FxVqzfz-Ghl2OHun7TY7tfexwo&s=FMNtOORgf7OlXhD5m8VHoGCnuWlt5NLqtXxalxQOhQw&e=
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Meng Liu
2018-06-13 14:47:18 UTC
Permalink
Hi Ben and Guilaume,

Thank you for reply. I am working on a precision experiment design, in
which a sample will be tested using different assay, by different operator
at different site. The measurement is allele frequency of DNA, which is a
continuous proportion outcome. I originally plan to run a beta distribution
random effects model, among which assay, operator and site are all random
factors. However, because I found there are some zeros in the response
data, that's why I am trying to run a zero-inflated beta random effects
model, with random factors in both zero part and non zero part. I.e., we
assume there will be variance from each factor in terms of predicting zero,
and variance from each factor in terms of continuous data. However, the
final research question would be evaluating the total variance contributed
from each factor.I can see here it is more complex for just generalized
linear model because of random effects from two different models. I am
wondering if you have any idea on this or do you know anybody who might
have thoughts on this?

Thank you again for all help!

Best regards,

Meng
Post by Ben Bolker
I'm not sure how this (variance decomposition based on a
zero-inflated model) would work.
What is your subject-area/scientific question?
On Wed, Jun 13, 2018 at 4:26 AM, Guillaume Chaumet
Post by Guillaume Chaumet
My bad, I replied to you the first time without including the list.
Regarding your last question, perhaps the list and/or Ben could
provide a more accurate answer than me.
I'm also curious to know how glmmTMB could do that
Hi Guillaume,
Thank you so much for this! I just have another question: for example
if I
Post by Guillaume Chaumet
have random factor A and B in both logistic model part and beta model
part,
Post by Guillaume Chaumet
then after I fit the whole model and got variance component estimation
of
Post by Guillaume Chaumet
random effect for factor A and B for both logistic model part and beta
model
Post by Guillaume Chaumet
model part, will there be any way to combine variance together? I.e. I
can
Post by Guillaume Chaumet
estimate a total variance from factor A, and a total variance from
factor B
Post by Guillaume Chaumet
(i.e. only differ by factor, not model)? Something like variance
decomposition but I believe here is more complex as this is a mixture
model.
Post by Guillaume Chaumet
Thank you again for all your help
Best regards,
Meng
On Sun, Jun 10, 2018 at 11:03 AM, Guillaume Chaumet
https://urldefense.proofpoint.com/v2/url?u=https-3A__cran.r-
2Dproject.org_web_packages_brms_index.html&d=DwIBaQ&c=
clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=
Ij73g98b5MaGitndhxmoIw&m=Uy-z_keMG1SfZG-g8FxVqzfz-Ghl2OHun7TY7tfexwo&s=
Gfi89kd1PSimpIhWBglYPuJRn3_FF_uNBGvzVDvWe4A&e=
Post by Guillaume Chaumet
Post by Meng Liu
To whom it may concern,
I am trying to fit a model for a data among which the response value
is
Post by Guillaume Chaumet
Post by Meng Liu
within [0,1). I am thinking about fitting the zeros as a complete separate
category from the non-zero data, i.e. a binomial (Bernoulli) model to "==0
vs >0" and a Beta model to the >0 responses. Also, my data contains
both
Post by Guillaume Chaumet
Post by Meng Liu
nested factors and crossed factors, which means I need to add nested random
effects and crossed random effects to both logistic model part and
beta
Post by Guillaume Chaumet
Post by Meng Liu
model model. However, I didn't find any R packages can do exactly
what I
Post by Guillaume Chaumet
Post by Meng Liu
want (By far I found gamlss, glmmTMB, zoib but they either can only assume
random zero or they can only fit repeated measures/clustered data but not
nested and crossed design). Therefore, I am wondering if any one
know if
Post by Guillaume Chaumet
Post by Meng Liu
there is any available package or function can do this.
Thank you very much for your help!
Best regards
Meng
[[alternative HTML version deleted]]
_______________________________________________
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.
ethz.ch_mailman_listinfo_r-2Dsig-2Dmixed-2Dmodels&d=DwIBaQ&c=
clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=
Ij73g98b5MaGitndhxmoIw&m=Uy-z_keMG1SfZG-g8FxVqzfz-Ghl2OHun7TY7tfexwo&s=
FMNtOORgf7OlXhD5m8VHoGCnuWlt5NLqtXxalxQOhQw&e=
Post by Guillaume Chaumet
_______________________________________________
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.
ethz.ch_mailman_listinfo_r-2Dsig-2Dmixed-2Dmodels&d=DwIFaQ&c=
clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=Ij73g98b5MaGitndhxmoIw&m=
2BcoGwnjTH2lMtoXsQy3PazIJQlKthmPnzhLeUAe8Iw&s=
ni4tfiCBB6TzaOnxJfYkjSxo5Ztgo3sx1FyYE-Qc2AY&e=
[[alternative HTML version deleted]]
Paul Johnson
2018-06-13 16:35:05 UTC
Permalink
Hi Meng,

Is it possible for you to model the raw data (the counts of alleles within each individual site/assay/etc) rather than the calculated allele frequency (= proportion*)? If you have access to this data it would simplify your model greatly. I guess the reason for having zeroes is that there were no copies of an allele in a subpopulation (site, etc), giving an estimated proportion of zero? If so, you need a model with a binomial distribution for the responses (e.g. cbind(n.minor.allele, n.major.allele)**, assuming two alleles), and you could have a beta distribution for the allele frequencies. Then the model will see zeroes as being drawn from a binomial distribution with a non-zero allele frequency. This is the Balding-Nichols model, which gives a natural way of gauging the strength of random effects via f, which is an analogue of Wright’s Fst (which measures correlation between alleles in a subpopulation relative to the total population):

p.sub ~ Beta(shape1 = (1 - f) / f * p, shape2 = (1 - f) / f * (1 - p))

…for 0 < f < 1. p is the mean allele frequency, p.sub are the subpopulation (site, assay, etc) allele frequencies.

And the distribution of the number of alleles in subpopulation i is:

n.minor.allele[i] ~ Binom(2n[i], p.sub[i])

...n[i] is the number of diploid individuals in subpopulation i.

Alternatively, I guess there’s nothing to stop you having normally distributed allele frequencies on the logit scale, in which case you could fit the model with glmer.

Best wishes,
Paul

*It’s a stats/population genetics language difference — for statisticians frequency means count, while for population geneticists an allele frequency is a proportion (I have a foot in each camp).
**Pooling alleles in this way, i.e. ignoring the individual level, assumes HWE. If you don’t want to do that you can treat a diploid individual as a level nested within site/assay/etc, and estimate an individual level f, which is analogous to Fis, the inbreeding coefficient.
Post by Meng Liu
Hi Ben and Guilaume,
Thank you for reply. I am working on a precision experiment design, in
which a sample will be tested using different assay, by different operator
at different site. The measurement is allele frequency of DNA, which is a
continuous proportion outcome. I originally plan to run a beta distribution
random effects model, among which assay, operator and site are all random
factors. However, because I found there are some zeros in the response
data, that's why I am trying to run a zero-inflated beta random effects
model, with random factors in both zero part and non zero part. I.e., we
assume there will be variance from each factor in terms of predicting zero,
and variance from each factor in terms of continuous data. However, the
final research question would be evaluating the total variance contributed
from each factor.I can see here it is more complex for just generalized
linear model because of random effects from two different models. I am
wondering if you have any idea on this or do you know anybody who might
have thoughts on this?
Thank you again for all help!
Best regards,
Meng
Post by Ben Bolker
I'm not sure how this (variance decomposition based on a
zero-inflated model) would work.
What is your subject-area/scientific question?
On Wed, Jun 13, 2018 at 4:26 AM, Guillaume Chaumet
Post by Guillaume Chaumet
My bad, I replied to you the first time without including the list.
Regarding your last question, perhaps the list and/or Ben could
provide a more accurate answer than me.
I'm also curious to know how glmmTMB could do that
Hi Guillaume,
Thank you so much for this! I just have another question: for example
if I
Post by Guillaume Chaumet
have random factor A and B in both logistic model part and beta model
part,
Post by Guillaume Chaumet
then after I fit the whole model and got variance component estimation
of
Post by Guillaume Chaumet
random effect for factor A and B for both logistic model part and beta
model
Post by Guillaume Chaumet
model part, will there be any way to combine variance together? I.e. I
can
Post by Guillaume Chaumet
estimate a total variance from factor A, and a total variance from
factor B
Post by Guillaume Chaumet
(i.e. only differ by factor, not model)? Something like variance
decomposition but I believe here is more complex as this is a mixture
model.
Post by Guillaume Chaumet
Thank you again for all your help
Best regards,
Meng
On Sun, Jun 10, 2018 at 11:03 AM, Guillaume Chaumet
https://urldefense.proofpoint.com/v2/url?u=https-3A__cran.r-
2Dproject.org_web_packages_brms_index.html&d=DwIBaQ&c=
clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=
Ij73g98b5MaGitndhxmoIw&m=Uy-z_keMG1SfZG-g8FxVqzfz-Ghl2OHun7TY7tfexwo&s=
Gfi89kd1PSimpIhWBglYPuJRn3_FF_uNBGvzVDvWe4A&e=
Post by Guillaume Chaumet
Post by Meng Liu
To whom it may concern,
I am trying to fit a model for a data among which the response value
is
Post by Guillaume Chaumet
Post by Meng Liu
within [0,1). I am thinking about fitting the zeros as a complete separate
category from the non-zero data, i.e. a binomial (Bernoulli) model to "==0
vs >0" and a Beta model to the >0 responses. Also, my data contains
both
Post by Guillaume Chaumet
Post by Meng Liu
nested factors and crossed factors, which means I need to add nested random
effects and crossed random effects to both logistic model part and
beta
Post by Guillaume Chaumet
Post by Meng Liu
model model. However, I didn't find any R packages can do exactly
what I
Post by Guillaume Chaumet
Post by Meng Liu
want (By far I found gamlss, glmmTMB, zoib but they either can only assume
random zero or they can only fit repeated measures/clustered data but not
nested and crossed design). Therefore, I am wondering if any one
know if
Post by Guillaume Chaumet
Post by Meng Liu
there is any available package or function can do this.
Thank you very much for your help!
Best regards
Meng
[[alternative HTML version deleted]]
_______________________________________________
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.
ethz.ch_mailman_listinfo_r-2Dsig-2Dmixed-2Dmodels&d=DwIBaQ&c=
clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=
Ij73g98b5MaGitndhxmoIw&m=Uy-z_keMG1SfZG-g8FxVqzfz-Ghl2OHun7TY7tfexwo&s=
FMNtOORgf7OlXhD5m8VHoGCnuWlt5NLqtXxalxQOhQw&e=
Post by Guillaume Chaumet
_______________________________________________
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.
ethz.ch_mailman_listinfo_r-2Dsig-2Dmixed-2Dmodels&d=DwIFaQ&c=
clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=Ij73g98b5MaGitndhxmoIw&m=
2BcoGwnjTH2lMtoXsQy3PazIJQlKthmPnzhLeUAe8Iw&s=
ni4tfiCBB6TzaOnxJfYkjSxo5Ztgo3sx1FyYE-Qc2AY&e=
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Loading...