Nik Tuzov
2018-10-30 15:45:20 UTC
What is the best package to analyze RM data in R, given that time is on continuous scale and the time pointsare not equally spaced. Ideally, that package should match SAS results.
Regards,Nik
From: "r-sig-mixed-models-***@r-project.org" <r-sig-mixed-models-***@r-project.org>
To: r-sig-mixed-***@r-project.org
Sent: Monday, October 29, 2018 6:58 PM
Subject: R-sig-mixed-models Digest, Vol 142, Issue 28
Send R-sig-mixed-models mailing list submissions to
r-sig-mixed-***@r-project.org
To subscribe or unsubscribe via the World Wide Web, visit
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
or, via email, send a message with subject or body 'help' to
r-sig-mixed-models-***@r-project.org
You can reach the person managing the list at
r-sig-mixed-models-***@r-project.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-sig-mixed-models digest..."
Today's Topics:
1. Re: Simulations in Fisher's Exact test (Alexandre Courtiol)
2. Re: glmmTMB (Ben Bolker)
3. Re: Random intercept model- unbalanced cluster (Yashree Mehta)
4. Re: Random intercept model- unbalanced cluster (Ben Bolker)
5. zero-inflated glmmTMB with poly() - confidence band (John Wilson)
----------------------------------------------------------------------
Message: 1
Date: Mon, 29 Oct 2018 12:11:24 +0100
From: Alexandre Courtiol <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Simulations in Fisher's Exact test
Message-ID:
<CAERMt4dwmh1rLe_gdLBw8_K-***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
to compute the p-value of the test. With simulate.p.value = TRUE (when it
has an effect -> see Ben's comment), it will only sample the space of
possible contingency tables. If the number of possible contingency tables
is not too large, there is not need to use simulate.p.value and if it is
lower than the number of table you simulate, then you should obtain nearly
the same results anyhow. In other words, I don't really understand what you
are trying to achieve but a simple call to fisher.test(ntable) should do
the job.
++
Alex
[[alternative HTML version deleted]]
------------------------------
Message: 2
Date: Mon, 29 Oct 2018 08:48:05 -0400
From: Ben Bolker <***@gmail.com>
To: ***@gmail.com
Cc: R SIG Mixed Models <r-sig-mixed-***@r-project.org>
Subject: Re: [R-sig-ME] glmmTMB
Message-ID:
<CABghstQFdj44Awgfhj66ZTSEAarx1xTGOVZk=***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
glmmTMB implements REML by treating the fixed effect parameters as
'random variables', i.e. turning on the Laplace approximation
machinery; see http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#reml-for-glmms
for starting points in the literature.
https://github.com/glmmTMB/glmmTMB/blob/master/glmmTMB/R/glmmTMB.R#L191
On Mon, Oct 29, 2018 at 6:26 AM Maira Fatoretto
Message: 3
Date: Mon, 29 Oct 2018 22:13:06 +0100
From: Yashree Mehta <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Random intercept model- unbalanced cluster
Message-ID:
<CAOE=hq+3u9pAr5Xp-BMWeQ4JNE-ZVpm6YYS=***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Or is there an alternative method of modeling this subset of households who
only own one plot?
thank you,
Regards,
Yashree
------------------------------
Message: 4
Date: Mon, 29 Oct 2018 17:23:16 -0400
From: Ben Bolker <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Random intercept model- unbalanced cluster
Message-ID: <1a88c90a-8869-3b01-072d-***@gmail.com>
Content-Type: text/plain; charset="utf-8"
In principle lme4 shouldn't have problems with a subset of groups that
have only one observation (although clearly the model will get more
fragile/unreliable the less information is available about within vs
among group variation ...). I'd expect the random effects for groups
with only one observation to be strongly shrunk toward the population
mean ... if in doubt, it can be very useful to simulate a situation
similar to your real data set to see what happens in cases where you
know the real answer ...
Message: 5
Date: Mon, 29 Oct 2018 20:58:09 -0300
From: John Wilson <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: [R-sig-ME] zero-inflated glmmTMB with poly() - confidence
band
Message-ID:
<CABdA5Q0o2nJTRFo+mj6Hcz_UbZ+***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hello,
I've been using the newly documented predict() with group = NA to predict
population-level values, as per the thread here (
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q4/027305.html). A
follow-up question: in the case of a zero-inflated model, how would I go
about to get the 95% CIs for response predictions? Obviously, I can get
them for the count and the zero-component, and without poly(), I would just
follow the example in Brooks et al (2017).
However, I have a poly() predictor; how do I get the 95% CIs if I can't use
model.matrix naively when I have a poly() in the model? The models take a
while to converge, so I don't want to run a full bootstrapping either, if
at all possible.
Thank you!
John
Here's a toy dataset:
library(ggplot2)
library(glmmTMB)
set.seed(0)
x <- 1:20
z <- sample(c("a", "b"), length(x), replace = TRUE)
y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
y[c(2, 3, 5, 11, 13, 19)] <- 0
group <- sample(c("i", "ii"), length(x), replace = TRUE)
df <- data.frame(x = x, y = y, z = z, group = group)
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z))
m <- glmmTMB(y ~ poly(x, 3) * z +
(1 | group),
zi = ~ z,
family = nbinom1,
data = df)
# prediction on a new grid
newdata <- expand.grid(x = 1:20, z = unique(df$z), group = NA)
newdata$Pred <- predict(m, type = "response")
### now how to add CIs?
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z)) +
geom_line(data = newdata, aes(x = x, y = Pred, colour = z, group = z), size
= 2) +
facet_wrap(~ z)
[[alternative HTML version deleted]]
------------------------------
Subject: Digest Footer
_______________________________________________
R-sig-mixed-models mailing list
R-sig-mixed-***@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
------------------------------
End of R-sig-mixed-models Digest, Vol 142, Issue 28
***************************************************
[[alternative HTML version deleted]]
Regards,Nik
From: "r-sig-mixed-models-***@r-project.org" <r-sig-mixed-models-***@r-project.org>
To: r-sig-mixed-***@r-project.org
Sent: Monday, October 29, 2018 6:58 PM
Subject: R-sig-mixed-models Digest, Vol 142, Issue 28
Send R-sig-mixed-models mailing list submissions to
r-sig-mixed-***@r-project.org
To subscribe or unsubscribe via the World Wide Web, visit
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
or, via email, send a message with subject or body 'help' to
r-sig-mixed-models-***@r-project.org
You can reach the person managing the list at
r-sig-mixed-models-***@r-project.org
When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-sig-mixed-models digest..."
Today's Topics:
1. Re: Simulations in Fisher's Exact test (Alexandre Courtiol)
2. Re: glmmTMB (Ben Bolker)
3. Re: Random intercept model- unbalanced cluster (Yashree Mehta)
4. Re: Random intercept model- unbalanced cluster (Ben Bolker)
5. zero-inflated glmmTMB with poly() - confidence band (John Wilson)
----------------------------------------------------------------------
Message: 1
Date: Mon, 29 Oct 2018 12:11:24 +0100
From: Alexandre Courtiol <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Simulations in Fisher's Exact test
Message-ID:
<CAERMt4dwmh1rLe_gdLBw8_K-***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Message: 2
Date: Sun, 28 Oct 2018 12:18:44 -0400
Subject: Re: [R-sig-ME] Simulations in Fisher's Exact test
Content-Type: text/plain; charset="utf-8"
might be more appropriate. I am guessing this is an assignment for a
class? If so, it might be more useful to get clarification from your
instructor or teaching assistant (or a colleague in your class). The
simulate.p.value: a logical indicating whether to compute p-values by
Monte Carlo simulation, **in larger than 2 by 2 tables**.
Emphasis (**) added. Since you're using a 2x2 table, I'm guessing that
simulate.p.value has no effect ... R probably should warn you, but oh
well ..
to create all possible contingency tables (keeping marginal sum constant)Date: Sun, 28 Oct 2018 12:18:44 -0400
Subject: Re: [R-sig-ME] Simulations in Fisher's Exact test
Content-Type: text/plain; charset="utf-8"
might be more appropriate. I am guessing this is an assignment for a
class? If so, it might be more useful to get clarification from your
instructor or teaching assistant (or a colleague in your class). The
simulate.p.value: a logical indicating whether to compute p-values by
Monte Carlo simulation, **in larger than 2 by 2 tables**.
Emphasis (**) added. Since you're using a 2x2 table, I'm guessing that
simulate.p.value has no effect ... R probably should warn you, but oh
well ..
hi all,
Probably I am posted in wrong mailing list. I am getting a problem in
applying Fisher's exact test. I am applying Fisher's exact test as
ntable<- array(data = c(3, 1, 8,11), dim = c(2,2))
fisher.test(ntable)
now, I have to repeat the same 10000 times and have to report p-values.
Using the arguments simulate.p.value in the command is producing the same
results
test<-fisher.test(ntable,workspace=20000,simulate.p.value=T,B=10000)
what changes I have to made in my model.
regards
Adeela
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Adeela, with simulate.p.value = FALSE, the Fisher exact test will attemptProbably I am posted in wrong mailing list. I am getting a problem in
applying Fisher's exact test. I am applying Fisher's exact test as
ntable<- array(data = c(3, 1, 8,11), dim = c(2,2))
fisher.test(ntable)
now, I have to repeat the same 10000 times and have to report p-values.
Using the arguments simulate.p.value in the command is producing the same
results
test<-fisher.test(ntable,workspace=20000,simulate.p.value=T,B=10000)
what changes I have to made in my model.
regards
Adeela
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
to compute the p-value of the test. With simulate.p.value = TRUE (when it
has an effect -> see Ben's comment), it will only sample the space of
possible contingency tables. If the number of possible contingency tables
is not too large, there is not need to use simulate.p.value and if it is
lower than the number of table you simulate, then you should obtain nearly
the same results anyhow. In other words, I don't really understand what you
are trying to achieve but a simple call to fisher.test(ntable) should do
the job.
++
Alex
[[alternative HTML version deleted]]
------------------------------
Message: 2
Date: Mon, 29 Oct 2018 08:48:05 -0400
From: Ben Bolker <***@gmail.com>
To: ***@gmail.com
Cc: R SIG Mixed Models <r-sig-mixed-***@r-project.org>
Subject: Re: [R-sig-ME] glmmTMB
Message-ID:
<CABghstQFdj44Awgfhj66ZTSEAarx1xTGOVZk=***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
glmmTMB implements REML by treating the fixed effect parameters as
'random variables', i.e. turning on the Laplace approximation
machinery; see http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#reml-for-glmms
for starting points in the literature.
https://github.com/glmmTMB/glmmTMB/blob/master/glmmTMB/R/glmmTMB.R#L191
On Mon, Oct 29, 2018 at 6:26 AM Maira Fatoretto
Hello,
I am using glmmTMb with binomial family. I chose REML=TRUE because I have
three random effects and I would like to know which is the restriction used
in this estimation in the package.
The REML = TRUE have better estimation than ML in my case.
Thank you.
Best regards.
--
*Maíra Blumer Fatoretto*
Estatística - Universidade Estadual de Campinas- UNICAMP
Mestra em Ciências(Estatística e Experimentação Agronômica) - ESALQ/USP -
Piracicaba - SP
Doutoranda em Estatística e Experimentação Agronômica
Telefone:(19) 988309481
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
------------------------------I am using glmmTMb with binomial family. I chose REML=TRUE because I have
three random effects and I would like to know which is the restriction used
in this estimation in the package.
The REML = TRUE have better estimation than ML in my case.
Thank you.
Best regards.
--
*Maíra Blumer Fatoretto*
Estatística - Universidade Estadual de Campinas- UNICAMP
Mestra em Ciências(Estatística e Experimentação Agronômica) - ESALQ/USP -
Piracicaba - SP
Doutoranda em Estatística e Experimentação Agronômica
Telefone:(19) 988309481
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Message: 3
Date: Mon, 29 Oct 2018 22:13:06 +0100
From: Yashree Mehta <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Random intercept model- unbalanced cluster
Message-ID:
<CAOE=hq+3u9pAr5Xp-BMWeQ4JNE-ZVpm6YYS=***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Or is there an alternative method of modeling this subset of households who
only own one plot?
thank you,
Regards,
Yashree
Hi,
I am working with a random intercept model on a cluster dataset (Repeated
measurements of plots per household). I have the usual "X" vector
of covariates and one id variable which will make up the random
intercept. For example,
Response variable: Production of maize
Covariates: Size, input quantities, soil fertility dummies etc..
ID variable: Household_ID
However, about 40% of the households own one plot. The number of plots per
household ranges from 1 to 13.
When I estimated the random intercept model using lmer, I can extract a
random intercept for all households, irrespective of their number of plots.
How does lmer treat these households with just 1 plot? Also, is it
theoretically correct to include these observations ?
Thank you,
Regards,
Yashree
[[alternative HTML version deleted]]I am working with a random intercept model on a cluster dataset (Repeated
measurements of plots per household). I have the usual "X" vector
of covariates and one id variable which will make up the random
intercept. For example,
Response variable: Production of maize
Covariates: Size, input quantities, soil fertility dummies etc..
ID variable: Household_ID
However, about 40% of the households own one plot. The number of plots per
household ranges from 1 to 13.
When I estimated the random intercept model using lmer, I can extract a
random intercept for all households, irrespective of their number of plots.
How does lmer treat these households with just 1 plot? Also, is it
theoretically correct to include these observations ?
Thank you,
Regards,
Yashree
------------------------------
Message: 4
Date: Mon, 29 Oct 2018 17:23:16 -0400
From: Ben Bolker <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Random intercept model- unbalanced cluster
Message-ID: <1a88c90a-8869-3b01-072d-***@gmail.com>
Content-Type: text/plain; charset="utf-8"
In principle lme4 shouldn't have problems with a subset of groups that
have only one observation (although clearly the model will get more
fragile/unreliable the less information is available about within vs
among group variation ...). I'd expect the random effects for groups
with only one observation to be strongly shrunk toward the population
mean ... if in doubt, it can be very useful to simulate a situation
similar to your real data set to see what happens in cases where you
know the real answer ...
Or is there an alternative method of modeling this subset of households who
only own one plot?
thank you,
Regards,
Yashree
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
------------------------------only own one plot?
thank you,
Regards,
Yashree
Hi,
I am working with a random intercept model on a cluster dataset (Repeated
measurements of plots per household). I have the usual "X" vector
of covariates and one id variable which will make up the random
intercept. For example,
Response variable: Production of maize
Covariates: Size, input quantities, soil fertility dummies etc..
ID variable: Household_ID
However, about 40% of the households own one plot. The number of plots per
household ranges from 1 to 13.
When I estimated the random intercept model using lmer, I can extract a
random intercept for all households, irrespective of their number of plots.
How does lmer treat these households with just 1 plot? Also, is it
theoretically correct to include these observations ?
Thank you,
Regards,
Yashree
[[alternative HTML version deleted]]I am working with a random intercept model on a cluster dataset (Repeated
measurements of plots per household). I have the usual "X" vector
of covariates and one id variable which will make up the random
intercept. For example,
Response variable: Production of maize
Covariates: Size, input quantities, soil fertility dummies etc..
ID variable: Household_ID
However, about 40% of the households own one plot. The number of plots per
household ranges from 1 to 13.
When I estimated the random intercept model using lmer, I can extract a
random intercept for all households, irrespective of their number of plots.
How does lmer treat these households with just 1 plot? Also, is it
theoretically correct to include these observations ?
Thank you,
Regards,
Yashree
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Message: 5
Date: Mon, 29 Oct 2018 20:58:09 -0300
From: John Wilson <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: [R-sig-ME] zero-inflated glmmTMB with poly() - confidence
band
Message-ID:
<CABdA5Q0o2nJTRFo+mj6Hcz_UbZ+***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hello,
I've been using the newly documented predict() with group = NA to predict
population-level values, as per the thread here (
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q4/027305.html). A
follow-up question: in the case of a zero-inflated model, how would I go
about to get the 95% CIs for response predictions? Obviously, I can get
them for the count and the zero-component, and without poly(), I would just
follow the example in Brooks et al (2017).
However, I have a poly() predictor; how do I get the 95% CIs if I can't use
model.matrix naively when I have a poly() in the model? The models take a
while to converge, so I don't want to run a full bootstrapping either, if
at all possible.
Thank you!
John
Here's a toy dataset:
library(ggplot2)
library(glmmTMB)
set.seed(0)
x <- 1:20
z <- sample(c("a", "b"), length(x), replace = TRUE)
y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
y[c(2, 3, 5, 11, 13, 19)] <- 0
group <- sample(c("i", "ii"), length(x), replace = TRUE)
df <- data.frame(x = x, y = y, z = z, group = group)
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z))
m <- glmmTMB(y ~ poly(x, 3) * z +
(1 | group),
zi = ~ z,
family = nbinom1,
data = df)
# prediction on a new grid
newdata <- expand.grid(x = 1:20, z = unique(df$z), group = NA)
newdata$Pred <- predict(m, type = "response")
### now how to add CIs?
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z)) +
geom_line(data = newdata, aes(x = x, y = Pred, colour = z, group = z), size
= 2) +
facet_wrap(~ z)
[[alternative HTML version deleted]]
------------------------------
Subject: Digest Footer
_______________________________________________
R-sig-mixed-models mailing list
R-sig-mixed-***@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
------------------------------
End of R-sig-mixed-models Digest, Vol 142, Issue 28
***************************************************
[[alternative HTML version deleted]]