Discussion:
[R-sig-ME] Repeated measures - general question
Nik Tuzov
2018-10-30 15:45:20 UTC
Permalink
What is the best package to analyze RM data in R, given that time is on continuous scale and the time pointsare not equally spaced. Ideally, that package should match SAS results.
Regards,Nik

From: "r-sig-mixed-models-***@r-project.org" <r-sig-mixed-models-***@r-project.org>
To: r-sig-mixed-***@r-project.org
Sent: Monday, October 29, 2018 6:58 PM
Subject: R-sig-mixed-models Digest, Vol 142, Issue 28

Send R-sig-mixed-models mailing list submissions to
    r-sig-mixed-***@r-project.org

To subscribe or unsubscribe via the World Wide Web, visit
    https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
or, via email, send a message with subject or body 'help' to
    r-sig-mixed-models-***@r-project.org

You can reach the person managing the list at
    r-sig-mixed-models-***@r-project.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-sig-mixed-models digest..."


Today's Topics:

  1. Re: Simulations in Fisher's Exact test (Alexandre Courtiol)
  2. Re: glmmTMB (Ben Bolker)
  3. Re: Random intercept model- unbalanced cluster (Yashree Mehta)
  4. Re: Random intercept model- unbalanced cluster (Ben Bolker)
  5. zero-inflated glmmTMB with poly() - confidence band (John Wilson)

----------------------------------------------------------------------

Message: 1
Date: Mon, 29 Oct 2018 12:11:24 +0100
From: Alexandre Courtiol <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Simulations in Fisher's Exact test
Message-ID:
    <CAERMt4dwmh1rLe_gdLBw8_K-***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Message: 2
Date: Sun, 28 Oct 2018 12:18:44 -0400
Subject: Re: [R-sig-ME] Simulations in Fisher's Exact test
Content-Type: text/plain; charset="utf-8"
might be more appropriate.  I am guessing this is an assignment for a
class?  If so, it might be more useful to get clarification from your
instructor or teaching assistant (or a colleague in your class). The
simulate.p.value: a logical indicating whether to compute p-values by
          Monte Carlo simulation, **in larger than 2 by 2 tables**.
Emphasis (**) added.  Since you're using a 2x2 table, I'm guessing that
simulate.p.value has no effect ...  R probably should warn you, but oh
well ..
hi all,
Probably I am posted in wrong mailing list. I am getting a problem in
applying Fisher's exact test. I am applying Fisher's exact test as
  ntable<- array(data = c(3, 1, 8,11), dim = c(2,2))
fisher.test(ntable)
now, I have to repeat the same 10000 times and have to report p-values.
Using the arguments simulate.p.value in the command is producing the same
results
test<-fisher.test(ntable,workspace=20000,simulate.p.value=T,B=10000)
what changes I have to made in my model.
regards
Adeela
      [[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Adeela, with simulate.p.value = FALSE, the Fisher exact test will attempt
to create all possible contingency tables (keeping marginal sum constant)
to compute the p-value of the test. With simulate.p.value = TRUE (when it
has an effect -> see Ben's comment), it will only sample the space of
possible contingency tables. If the number of possible contingency tables
is not too large, there is not need to use simulate.p.value and if it is
lower than the number of table you simulate, then you should obtain nearly
the same results anyhow. In other words, I don't really understand what you
are trying to achieve but a simple call to fisher.test(ntable) should do
the job.
++
Alex

    [[alternative HTML version deleted]]




------------------------------

Message: 2
Date: Mon, 29 Oct 2018 08:48:05 -0400
From: Ben Bolker <***@gmail.com>
To: ***@gmail.com
Cc: R SIG Mixed Models <r-sig-mixed-***@r-project.org>
Subject: Re: [R-sig-ME] glmmTMB
Message-ID:
    <CABghstQFdj44Awgfhj66ZTSEAarx1xTGOVZk=***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

  glmmTMB implements REML by treating the fixed effect parameters as
'random variables', i.e. turning on the Laplace approximation
machinery; see http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#reml-for-glmms
for starting points in the literature.

https://github.com/glmmTMB/glmmTMB/blob/master/glmmTMB/R/glmmTMB.R#L191
On Mon, Oct 29, 2018 at 6:26 AM Maira Fatoretto
Hello,
I am using glmmTMb with binomial family. I chose REML=TRUE  because I have
three random effects and I would like to know which is the restriction used
in this  estimation in the package.
The REML = TRUE have better estimation than ML in my case.
Thank you.
Best regards.
--
*Maíra Blumer Fatoretto*
Estatística - Universidade Estadual de Campinas- UNICAMP
Mestra em Ciências(Estatística e Experimentação Agronômica) - ESALQ/USP -
Piracicaba - SP
Doutoranda em Estatística e Experimentação Agronômica
Telefone:(19) 988309481
        [[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
------------------------------

Message: 3
Date: Mon, 29 Oct 2018 22:13:06 +0100
From: Yashree Mehta <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Random intercept model- unbalanced cluster
Message-ID:
    <CAOE=hq+3u9pAr5Xp-BMWeQ4JNE-ZVpm6YYS=***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Or is there an alternative method of modeling this subset of households who
only own one plot?

thank you,

Regards,
Yashree
Hi,
I am working with a random intercept model on a cluster dataset (Repeated
measurements of plots per household). I have the usual "X" vector
of covariates and one id variable which will make up the random
intercept. For example,
Response variable: Production of maize
Covariates: Size, input quantities, soil fertility dummies etc..
ID variable: Household_ID
However, about 40% of the households own one plot. The number of plots per
household ranges from 1 to 13.
When I estimated the random intercept model using lmer, I can extract a
random intercept for all households, irrespective of their number of plots.
How does lmer treat these households with just 1 plot? Also, is it
theoretically correct to include these observations ?
Thank you,
Regards,
Yashree
    [[alternative HTML version deleted]]




------------------------------

Message: 4
Date: Mon, 29 Oct 2018 17:23:16 -0400
From: Ben Bolker <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Random intercept model- unbalanced cluster
Message-ID: <1a88c90a-8869-3b01-072d-***@gmail.com>
Content-Type: text/plain; charset="utf-8"


  In principle lme4 shouldn't have problems with a subset of groups that
have only one observation (although clearly the model will get more
fragile/unreliable the less information is available about within vs
among group variation ...).  I'd expect the random effects for groups
with only one observation to be strongly shrunk toward the population
mean ... if in doubt, it can be very useful to simulate a situation
similar to your real data set to see what happens in cases where you
know the real answer ...
Or is there an alternative method of modeling this subset of households who
only own one plot?
thank you,
Regards,
Yashree
Hi,
I am working with a random intercept model on a cluster dataset (Repeated
measurements of plots per household). I have the usual "X" vector
of covariates and one id variable which will make up the random
intercept. For example,
Response variable: Production of maize
Covariates: Size, input quantities, soil fertility dummies etc..
ID variable: Household_ID
However, about 40% of the households own one plot. The number of plots per
household ranges from 1 to 13.
When I estimated the random intercept model using lmer, I can extract a
random intercept for all households, irrespective of their number of plots.
How does lmer treat these households with just 1 plot? Also, is it
theoretically correct to include these observations ?
Thank you,
Regards,
Yashree
    [[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
------------------------------

Message: 5
Date: Mon, 29 Oct 2018 20:58:09 -0300
From: John Wilson <***@gmail.com>
To: r-sig-mixed-***@r-project.org
Subject: [R-sig-ME] zero-inflated glmmTMB with poly() - confidence
    band
Message-ID:
    <CABdA5Q0o2nJTRFo+mj6Hcz_UbZ+***@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hello,

I've been using the newly documented predict() with group = NA to predict
population-level values, as per the thread here (
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q4/027305.html). A
follow-up question: in the case of a zero-inflated model, how would I go
about to get the 95% CIs for response predictions? Obviously, I can get
them for the count and the zero-component, and without poly(), I would just
follow the example in Brooks et al (2017).

However, I have a poly() predictor; how do I get the 95% CIs if I can't use
model.matrix naively when I have a poly() in the model? The models take a
while to converge, so I don't want to run a full bootstrapping either, if
at all possible.

Thank you!
John

Here's a toy dataset:

library(ggplot2)
library(glmmTMB)

set.seed(0)
x <- 1:20
z <- sample(c("a", "b"), length(x), replace = TRUE)
y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
y[c(2, 3, 5, 11, 13, 19)] <- 0
group <- sample(c("i", "ii"), length(x), replace = TRUE)
df <- data.frame(x = x, y = y, z = z, group = group)

ggplot(df) +
geom_point(aes(x = x, y = y, colour = z))

m <- glmmTMB(y ~ poly(x, 3) * z +
(1 | group),
zi = ~ z,
family = nbinom1,
data = df)
# prediction on a new grid
newdata <- expand.grid(x = 1:20, z = unique(df$z), group = NA)
newdata$Pred <- predict(m, type = "response")
### now how to add CIs?

ggplot(df) +
geom_point(aes(x = x, y = y, colour = z)) +
geom_line(data = newdata, aes(x = x, y = Pred, colour = z, group = z), size
= 2) +
facet_wrap(~ z)

    [[alternative HTML version deleted]]




------------------------------

Subject: Digest Footer

_______________________________________________
R-sig-mixed-models mailing list
R-sig-mixed-***@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


------------------------------

End of R-sig-mixed-models Digest, Vol 142, Issue 28
***************************************************


[[alternative HTML version deleted]]
Ben Bolker
2018-10-30 17:15:22 UTC
Permalink
Don't know about matching SAS, but nlme::lme() with
correlation=corCAR1() or glmmTMB::glmmTMB() with an ou() correlation
structure should both work for accounting for autocorrelation with
unevenly spaced data. Maybe you could let us know how you're analyzing
the data in SAS?
Post by Nik Tuzov
What is the best package to analyze RM data in R, given that time is on continuous scale and the time pointsare not equally spaced. Ideally, that package should match SAS results.
Regards,Nik
Sent: Monday, October 29, 2018 6:58 PM
Subject: R-sig-mixed-models Digest, Vol 142, Issue 28
Send R-sig-mixed-models mailing list submissions to
To subscribe or unsubscribe via the World Wide Web, visit
    https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
or, via email, send a message with subject or body 'help' to
You can reach the person managing the list at
When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-sig-mixed-models digest..."
  1. Re: Simulations in Fisher's Exact test (Alexandre Courtiol)
  2. Re: glmmTMB (Ben Bolker)
  3. Re: Random intercept model- unbalanced cluster (Yashree Mehta)
  4. Re: Random intercept model- unbalanced cluster (Ben Bolker)
  5. zero-inflated glmmTMB with poly() - confidence band (John Wilson)
----------------------------------------------------------------------
Message: 1
Date: Mon, 29 Oct 2018 12:11:24 +0100
Subject: Re: [R-sig-ME] Simulations in Fisher's Exact test
Content-Type: text/plain; charset="utf-8"
Message: 2
Date: Sun, 28 Oct 2018 12:18:44 -0400
Subject: Re: [R-sig-ME] Simulations in Fisher's Exact test
Content-Type: text/plain; charset="utf-8"
might be more appropriate.  I am guessing this is an assignment for a
class?  If so, it might be more useful to get clarification from your
instructor or teaching assistant (or a colleague in your class). The
simulate.p.value: a logical indicating whether to compute p-values by
          Monte Carlo simulation, **in larger than 2 by 2 tables**.
Emphasis (**) added.  Since you're using a 2x2 table, I'm guessing that
simulate.p.value has no effect ...  R probably should warn you, but oh
well ..
hi all,
Probably I am posted in wrong mailing list. I am getting a problem in
applying Fisher's exact test. I am applying Fisher's exact test as
  ntable<- array(data = c(3, 1, 8,11), dim = c(2,2))
fisher.test(ntable)
now, I have to repeat the same 10000 times and have to report p-values.
Using the arguments simulate.p.value in the command is producing the same
results
test<-fisher.test(ntable,workspace=20000,simulate.p.value=T,B=10000)
what changes I have to made in my model.
regards
Adeela
      [[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Adeela, with simulate.p.value = FALSE, the Fisher exact test will attempt
to create all possible contingency tables (keeping marginal sum constant)
to compute the p-value of the test. With simulate.p.value = TRUE (when it
has an effect -> see Ben's comment), it will only sample the space of
possible contingency tables. If the number of possible contingency tables
is not too large, there is not need to use simulate.p.value and if it is
lower than the number of table you simulate, then you should obtain nearly
the same results anyhow. In other words, I don't really understand what you
are trying to achieve but a simple call to fisher.test(ntable) should do
the job.
++
Alex
    [[alternative HTML version deleted]]
------------------------------
Message: 2
Date: Mon, 29 Oct 2018 08:48:05 -0400
Subject: Re: [R-sig-ME] glmmTMB
Content-Type: text/plain; charset="utf-8"
  glmmTMB implements REML by treating the fixed effect parameters as
'random variables', i.e. turning on the Laplace approximation
machinery; see http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#reml-for-glmms
for starting points in the literature.
https://github.com/glmmTMB/glmmTMB/blob/master/glmmTMB/R/glmmTMB.R#L191
On Mon, Oct 29, 2018 at 6:26 AM Maira Fatoretto
Hello,
I am using glmmTMb with binomial family. I chose REML=TRUE  because I have
three random effects and I would like to know which is the restriction used
in this  estimation in the package.
The REML = TRUE have better estimation than ML in my case.
Thank you.
Best regards.
--
*Maíra Blumer Fatoretto*
Estatística - Universidade Estadual de Campinas- UNICAMP
Mestra em Ciências(Estatística e Experimentação Agronômica) - ESALQ/USP -
Piracicaba - SP
Doutoranda em Estatística e Experimentação Agronômica
Telefone:(19) 988309481
        [[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
------------------------------
Message: 3
Date: Mon, 29 Oct 2018 22:13:06 +0100
Subject: Re: [R-sig-ME] Random intercept model- unbalanced cluster
Content-Type: text/plain; charset="utf-8"
Or is there an alternative method of modeling this subset of households who
only own one plot?
thank you,
Regards,
Yashree
Hi,
I am working with a random intercept model on a cluster dataset (Repeated
measurements of plots per household). I have the usual "X" vector
of covariates and one id variable which will make up the random
intercept. For example,
Response variable: Production of maize
Covariates: Size, input quantities, soil fertility dummies etc..
ID variable: Household_ID
However, about 40% of the households own one plot. The number of plots per
household ranges from 1 to 13.
When I estimated the random intercept model using lmer, I can extract a
random intercept for all households, irrespective of their number of plots.
How does lmer treat these households with just 1 plot? Also, is it
theoretically correct to include these observations ?
Thank you,
Regards,
Yashree
    [[alternative HTML version deleted]]
------------------------------
Message: 4
Date: Mon, 29 Oct 2018 17:23:16 -0400
Subject: Re: [R-sig-ME] Random intercept model- unbalanced cluster
Content-Type: text/plain; charset="utf-8"
  In principle lme4 shouldn't have problems with a subset of groups that
have only one observation (although clearly the model will get more
fragile/unreliable the less information is available about within vs
among group variation ...).  I'd expect the random effects for groups
with only one observation to be strongly shrunk toward the population
mean ... if in doubt, it can be very useful to simulate a situation
similar to your real data set to see what happens in cases where you
know the real answer ...
Or is there an alternative method of modeling this subset of households who
only own one plot?
thank you,
Regards,
Yashree
Hi,
I am working with a random intercept model on a cluster dataset (Repeated
measurements of plots per household). I have the usual "X" vector
of covariates and one id variable which will make up the random
intercept. For example,
Response variable: Production of maize
Covariates: Size, input quantities, soil fertility dummies etc..
ID variable: Household_ID
However, about 40% of the households own one plot. The number of plots per
household ranges from 1 to 13.
When I estimated the random intercept model using lmer, I can extract a
random intercept for all households, irrespective of their number of plots.
How does lmer treat these households with just 1 plot? Also, is it
theoretically correct to include these observations ?
Thank you,
Regards,
Yashree
    [[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
------------------------------
Message: 5
Date: Mon, 29 Oct 2018 20:58:09 -0300
Subject: [R-sig-ME] zero-inflated glmmTMB with poly() - confidence
    band
Content-Type: text/plain; charset="utf-8"
Hello,
I've been using the newly documented predict() with group = NA to predict
population-level values, as per the thread here (
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q4/027305.html). A
follow-up question: in the case of a zero-inflated model, how would I go
about to get the 95% CIs for response predictions? Obviously, I can get
them for the count and the zero-component, and without poly(), I would just
follow the example in Brooks et al (2017).
However, I have a poly() predictor; how do I get the 95% CIs if I can't use
model.matrix naively when I have a poly() in the model? The models take a
while to converge, so I don't want to run a full bootstrapping either, if
at all possible.
Thank you!
John
library(ggplot2)
library(glmmTMB)
set.seed(0)
x <- 1:20
z <- sample(c("a", "b"), length(x), replace = TRUE)
y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
y[c(2, 3, 5, 11, 13, 19)] <- 0
group <- sample(c("i", "ii"), length(x), replace = TRUE)
df <- data.frame(x = x, y = y, z = z, group = group)
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z))
m <- glmmTMB(y ~ poly(x, 3) * z +
(1 | group),
zi = ~ z,
family = nbinom1,
data = df)
# prediction on a new grid
newdata <- expand.grid(x = 1:20, z = unique(df$z), group = NA)
newdata$Pred <- predict(m, type = "response")
### now how to add CIs?
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z)) +
geom_line(data = newdata, aes(x = x, y = Pred, colour = z, group = z), size
= 2) +
facet_wrap(~ z)
    [[alternative HTML version deleted]]
------------------------------
Subject: Digest Footer
_______________________________________________
R-sig-mixed-models mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
------------------------------
End of R-sig-mixed-models Digest, Vol 142, Issue 28
***************************************************
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Continue reading on narkive:
Loading...