[R-sig-ME] population-level predict glmmtmb with poly()

Discussion:

John Wilson

2018-10-23 17:46:58 UTC

Hello,

I'm working on a glmmtmb() model with multiple continuous and categorical
predictors. Two of the predictors are orthogonal polynomials (I just saw
that the package was updated yesterday (!) to correctly handle those). One
of the polynomials has an interaction with another covariate.

Since predict(re.form = 0) doesn't work just yet and one has to use
the model.matrix(lme4::nobars(formula(mod1)[-2]), newdata) approach - how
do I get the correct polynomial predictions out? It looks like my results
depend on how I structure the newdata data frame - when I use
expand.grid(), the predictions are wrong, but when I subset the original
data, the predictions are correct.

Thanks so much!
John

Here's an example:

library(ggplot2)
library(glmmTMB)

set.seed(0)
x <- 1:20
z <- sample(c("a", "b"), length(x), replace = TRUE)
y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
group <- sample(c("i", "ii"), length(x), replace = TRUE)
df <- data.frame(x = x, y = y, z = z, group = group)
m <- glmmTMB(y ~ poly(x, 3) * z +
(1 | group),
family = nbinom2,
data = df)
# prediction on a new grid
newdata <- expand.grid(x = 1:20, z = unique(df$z))
X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
beta.cond = fixef(m)$cond
newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
# prediction in original data frame
X.cond = model.matrix(lme4::nobars(formula(m)[-2]))
beta.cond = fixef(m)$cond
df$Pred1 = as.numeric(X.cond %*% beta.cond)

# the newdata preds are obviously off
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z)) +
geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size = 2)
+
geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))

# if the new grid is defined like this, then the predictions are ok
newdata <- unique(select(df, x, z))
X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
beta.cond = fixef(m)$cond
newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z)) +
geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size = 2)
+
geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))

[[alternative HTML version deleted]]

D. Rizopoulos

2018-10-23 19:31:04 UTC

Permalink

You could fit the same model also using the GLMMadaptive package in
which the predict() method can produce predictions for the mean subject
(i.e., the one with random effects values equal to 0), marginal
predictions (i.e., averaged over the subjects), and subject-specific
predictions.

For more info you can have a look at:

https://drizopoulos.github.io/GLMMadaptive/

https://drizopoulos.github.io/GLMMadaptive/articles/Methods_MixMod.html#predictions

and potentially also

https://drizopoulos.github.io/GLMMadaptive/articles/Dynamic_Predictions.html

if you're interested in dynamic predictions.

Best,
Dimitris

On 10/23/2018 7:46 PM, John Wilson wrote:
> Hello,
>
> I'm working on a glmmtmb() model with multiple continuous and categorical
> predictors. Two of the predictors are orthogonal polynomials (I just saw
> that the package was updated yesterday (!) to correctly handle those). One
> of the polynomials has an interaction with another covariate.
>
> Since predict(re.form = 0) doesn't work just yet and one has to use
> the model.matrix(lme4::nobars(formula(mod1)[-2]), newdata) approach - how
> do I get the correct polynomial predictions out? It looks like my results
> depend on how I structure the newdata data frame - when I use
> expand.grid(), the predictions are wrong, but when I subset the original
> data, the predictions are correct.
>
> Thanks so much!
> John
>
> Here's an example:
>
> library(ggplot2)
> library(glmmTMB)
>
> set.seed(0)
> x <- 1:20
> z <- sample(c("a", "b"), length(x), replace = TRUE)
> y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
> group <- sample(c("i", "ii"), length(x), replace = TRUE)
> df <- data.frame(x = x, y = y, z = z, group = group)
> m <- glmmTMB(y ~ poly(x, 3) * z +
> (1 | group),
> family = nbinom2,
> data = df)
> # prediction on a new grid
> newdata <- expand.grid(x = 1:20, z = unique(df$z))
> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> beta.cond = fixef(m)$cond
> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> # prediction in original data frame
> X.cond = model.matrix(lme4::nobars(formula(m)[-2]))
> beta.cond = fixef(m)$cond
> df$Pred1 = as.numeric(X.cond %*% beta.cond)
>
> # the newdata preds are obviously off
> ggplot(df) +
> geom_point(aes(x = x, y = y, colour = z)) +
> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size = 2)
> +
> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
>
> # if the new grid is defined like this, then the predictions are ok
> newdata <- unique(select(df, x, z))
> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> beta.cond = fixef(m)$cond
> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> ggplot(df) +
> geom_point(aes(x = x, y = y, colour = z)) +
> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size = 2)
> +
> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-***@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

--
Dimitris Rizopoulos
Professor of Biostatistics
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web (personal): http://www.drizopoulos.com/
Web (work): http://www.erasmusmc.nl/biostatistiek/
Blog: http://iprogn.blogspot.nl/

John Wilson

2018-10-23 22:48:54 UTC

Permalink

Thank you, Dimitris. I poked around a bit but didn't see mention of
autocorrelation structures - are those supported as well?

I was wondering if this issue was related to the problem logged here:
https://github.com/glmmTMB/glmmTMB/issues/402. Can anyone comment on
whether what I'm seeing is a bug or just a user (=me) mistake? When I
change the toy example to be a simple linear form (rather than poly()), the
two predictions are identical, so I'm pretty sure it's a poly() issue...

On Tue, Oct 23, 2018 at 4:31 PM D. Rizopoulos <***@erasmusmc.nl>
wrote:

> You could fit the same model also using the GLMMadaptive package in
> which the predict() method can produce predictions for the mean subject
> (i.e., the one with random effects values equal to 0), marginal
> predictions (i.e., averaged over the subjects), and subject-specific
> predictions.
>
> For more info you can have a look at:
>
> https://drizopoulos.github.io/GLMMadaptive/
>
>
> https://drizopoulos.github.io/GLMMadaptive/articles/Methods_MixMod.html#predictions
>
> and potentially also
>
>
> https://drizopoulos.github.io/GLMMadaptive/articles/Dynamic_Predictions.html
>
> if you're interested in dynamic predictions.
>
> Best,
> Dimitris
>
>
>
> On 10/23/2018 7:46 PM, John Wilson wrote:
> > Hello,
> >
> > I'm working on a glmmtmb() model with multiple continuous and categorical
> > predictors. Two of the predictors are orthogonal polynomials (I just saw
> > that the package was updated yesterday (!) to correctly handle those).
> One
> > of the polynomials has an interaction with another covariate.
> >
> > Since predict(re.form = 0) doesn't work just yet and one has to use
> > the model.matrix(lme4::nobars(formula(mod1)[-2]), newdata) approach - how
> > do I get the correct polynomial predictions out? It looks like my results
> > depend on how I structure the newdata data frame - when I use
> > expand.grid(), the predictions are wrong, but when I subset the original
> > data, the predictions are correct.
> >
> > Thanks so much!
> > John
> >
> > Here's an example:
> >
> > library(ggplot2)
> > library(glmmTMB)
> >
> > set.seed(0)
> > x <- 1:20
> > z <- sample(c("a", "b"), length(x), replace = TRUE)
> > y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
> > group <- sample(c("i", "ii"), length(x), replace = TRUE)
> > df <- data.frame(x = x, y = y, z = z, group = group)
> > m <- glmmTMB(y ~ poly(x, 3) * z +
> > (1 | group),
> > family = nbinom2,
> > data = df)
> > # prediction on a new grid
> > newdata <- expand.grid(x = 1:20, z = unique(df$z))
> > X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> > beta.cond = fixef(m)$cond
> > newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> > # prediction in original data frame
> > X.cond = model.matrix(lme4::nobars(formula(m)[-2]))
> > beta.cond = fixef(m)$cond
> > df$Pred1 = as.numeric(X.cond %*% beta.cond)
> >
> > # the newdata preds are obviously off
> > ggplot(df) +
> > geom_point(aes(x = x, y = y, colour = z)) +
> > geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size =
> 2)
> > +
> > geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
> >
> > # if the new grid is defined like this, then the predictions are ok
> > newdata <- unique(select(df, x, z))
> > X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> > beta.cond = fixef(m)$cond
> > newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> > ggplot(df) +
> > geom_point(aes(x = x, y = y, colour = z)) +
> > geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size =
> 2)
> > +
> > geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
> >
> > [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-***@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
>
> --
> Dimitris Rizopoulos
> Professor of Biostatistics
> Department of Biostatistics
> Erasmus University Medical Center
>
> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
> Tel: +31/(0)10/7043478
> Fax: +31/(0)10/7043014
> Web (personal): http://www.drizopoulos.com/
> Web (work): http://www.erasmusmc.nl/biostatistiek/
> Blog: http://iprogn.blogspot.nl/

[[alternative HTML version deleted]]

Ben Bolker

2018-10-23 23:21:47 UTC

Permalink

Naive prediction with data-dependent bases **will not work** with new
input variables. (This is a general R/model.matrix() thing ... not
glmmTMB's fault.) The current development version of glmmTMB does some
magic (see ?makepredictcall,
https://developer.r-project.org/model-fitting-functions.html for more
detail) that makes this work.

It wasn't documented until about 120 seconds ago, but in order to do
population-level predictions with predict() all you need to do is set
the group variable to NA. This has less flexibility than the re.form=
argument in lme4 (which allows you to drop a *subset* of the random
effects terms for a given grouping variables), but it does handle the
most common use case ...

Does this work for you (with devel version of glmmTMB) ?

# prediction on a new grid
newdata <- expand.grid(x = 1:20, z = unique(df$z), group=NA)
newdata$y <- predict(m,newdata=newdata, type="response")

(ggplot(df, aes(x,y, colour=z))
+ geom_point()
+ geom_line(data = newdata, size =2)
)

On 2018-10-23 6:48 p.m., John Wilson wrote:
> Thank you, Dimitris. I poked around a bit but didn't see mention of
> autocorrelation structures - are those supported as well?
>
> I was wondering if this issue was related to the problem logged here:
> https://github.com/glmmTMB/glmmTMB/issues/402. Can anyone comment on
> whether what I'm seeing is a bug or just a user (=me) mistake? When I
> change the toy example to be a simple linear form (rather than poly()), the
> two predictions are identical, so I'm pretty sure it's a poly() issue...
>
> On Tue, Oct 23, 2018 at 4:31 PM D. Rizopoulos <***@erasmusmc.nl>
> wrote:
>
>> You could fit the same model also using the GLMMadaptive package in
>> which the predict() method can produce predictions for the mean subject
>> (i.e., the one with random effects values equal to 0), marginal
>> predictions (i.e., averaged over the subjects), and subject-specific
>> predictions.
>>
>> For more info you can have a look at:
>>
>> https://drizopoulos.github.io/GLMMadaptive/
>>
>>
>> https://drizopoulos.github.io/GLMMadaptive/articles/Methods_MixMod.html#predictions
>>
>> and potentially also
>>
>>
>> https://drizopoulos.github.io/GLMMadaptive/articles/Dynamic_Predictions.html
>>
>> if you're interested in dynamic predictions.
>>
>> Best,
>> Dimitris
>>
>>
>>
>> On 10/23/2018 7:46 PM, John Wilson wrote:
>>> Hello,
>>>
>>> I'm working on a glmmtmb() model with multiple continuous and categorical
>>> predictors. Two of the predictors are orthogonal polynomials (I just saw
>>> that the package was updated yesterday (!) to correctly handle those).
>> One
>>> of the polynomials has an interaction with another covariate.
>>>
>>> Since predict(re.form = 0) doesn't work just yet and one has to use
>>> the model.matrix(lme4::nobars(formula(mod1)[-2]), newdata) approach - how
>>> do I get the correct polynomial predictions out? It looks like my results
>>> depend on how I structure the newdata data frame - when I use
>>> expand.grid(), the predictions are wrong, but when I subset the original
>>> data, the predictions are correct.
>>>
>>> Thanks so much!
>>> John
>>>
>>> Here's an example:
>>>
>>> library(ggplot2)
>>> library(glmmTMB)
>>>
>>> set.seed(0)
>>> x <- 1:20
>>> z <- sample(c("a", "b"), length(x), replace = TRUE)
>>> y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
>>> group <- sample(c("i", "ii"), length(x), replace = TRUE)
>>> df <- data.frame(x = x, y = y, z = z, group = group)
>>> m <- glmmTMB(y ~ poly(x, 3) * z +
>>> (1 | group),
>>> family = nbinom2,
>>> data = df)
>>> # prediction on a new grid
>>> newdata <- expand.grid(x = 1:20, z = unique(df$z))
>>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
>>> beta.cond = fixef(m)$cond
>>> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
>>> # prediction in original data frame
>>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]))
>>> beta.cond = fixef(m)$cond
>>> df$Pred1 = as.numeric(X.cond %*% beta.cond)
>>>
>>> # the newdata preds are obviously off
>>> ggplot(df) +
>>> geom_point(aes(x = x, y = y, colour = z)) +
>>> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size =
>> 2)
>>> +
>>> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
>>>
>>> # if the new grid is defined like this, then the predictions are ok
>>> newdata <- unique(select(df, x, z))
>>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
>>> beta.cond = fixef(m)$cond
>>> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
>>> ggplot(df) +
>>> geom_point(aes(x = x, y = y, colour = z)) +
>>> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size =
>> 2)
>>> +
>>> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-mixed-***@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>
>>
>> --
>> Dimitris Rizopoulos
>> Professor of Biostatistics
>> Department of Biostatistics
>> Erasmus University Medical Center
>>
>> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
>> Tel: +31/(0)10/7043478
>> Fax: +31/(0)10/7043014
>> Web (personal): http://www.drizopoulos.com/
>> Web (work): http://www.erasmusmc.nl/biostatistiek/
>> Blog: http://iprogn.blogspot.nl/
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-***@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

John Wilson

2018-10-24 14:24:32 UTC

Permalink

So far it's looking just grand! +1 for "it wasn't documented until about
120 seconds ago".
Just to push my luck here - in my full model, I have autocorrelation
structures. When I only set group = NA, the predict() function gave me an
error due to lack of the autocorrelation variable in the new data. I then
crossed my fingers and did the same thing as for group - a column with NAs.
The predict() function had no problem with that. However, I wanted to see
if that's a (semi) legit way to doing it.

On Tue, Oct 23, 2018 at 8:22 PM Ben Bolker <***@gmail.com> wrote:

>
> Naive prediction with data-dependent bases **will not work** with new
> input variables. (This is a general R/model.matrix() thing ... not
> glmmTMB's fault.) The current development version of glmmTMB does some
> magic (see ?makepredictcall,
> https://developer.r-project.org/model-fitting-functions.html for more
> detail) that makes this work.
>
> It wasn't documented until about 120 seconds ago, but in order to do
> population-level predictions with predict() all you need to do is set
> the group variable to NA. This has less flexibility than the re.form=
> argument in lme4 (which allows you to drop a *subset* of the random
> effects terms for a given grouping variables), but it does handle the
> most common use case ...
>
> Does this work for you (with devel version of glmmTMB) ?
>
> # prediction on a new grid
> newdata <- expand.grid(x = 1:20, z = unique(df$z), group=NA)
> newdata$y <- predict(m,newdata=newdata, type="response")
>
> (ggplot(df, aes(x,y, colour=z))
> + geom_point()
> + geom_line(data = newdata, size =2)
> )
>
>
> On 2018-10-23 6:48 p.m., John Wilson wrote:
> > Thank you, Dimitris. I poked around a bit but didn't see mention of
> > autocorrelation structures - are those supported as well?
> >
> > I was wondering if this issue was related to the problem logged here:
> > https://github.com/glmmTMB/glmmTMB/issues/402. Can anyone comment on
> > whether what I'm seeing is a bug or just a user (=me) mistake? When I
> > change the toy example to be a simple linear form (rather than poly()),
> the
> > two predictions are identical, so I'm pretty sure it's a poly() issue...
> >
> > On Tue, Oct 23, 2018 at 4:31 PM D. Rizopoulos <***@erasmusmc.nl
> >
> > wrote:
> >
> >> You could fit the same model also using the GLMMadaptive package in
> >> which the predict() method can produce predictions for the mean subject
> >> (i.e., the one with random effects values equal to 0), marginal
> >> predictions (i.e., averaged over the subjects), and subject-specific
> >> predictions.
> >>
> >> For more info you can have a look at:
> >>
> >> https://drizopoulos.github.io/GLMMadaptive/
> >>
> >>
> >>
> https://drizopoulos.github.io/GLMMadaptive/articles/Methods_MixMod.html#predictions
> >>
> >> and potentially also
> >>
> >>
> >>
> https://drizopoulos.github.io/GLMMadaptive/articles/Dynamic_Predictions.html
> >>
> >> if you're interested in dynamic predictions.
> >>
> >> Best,
> >> Dimitris
> >>
> >>
> >>
> >> On 10/23/2018 7:46 PM, John Wilson wrote:
> >>> Hello,
> >>>
> >>> I'm working on a glmmtmb() model with multiple continuous and
> categorical
> >>> predictors. Two of the predictors are orthogonal polynomials (I just
> saw
> >>> that the package was updated yesterday (!) to correctly handle those).
> >> One
> >>> of the polynomials has an interaction with another covariate.
> >>>
> >>> Since predict(re.form = 0) doesn't work just yet and one has to use
> >>> the model.matrix(lme4::nobars(formula(mod1)[-2]), newdata) approach -
> how
> >>> do I get the correct polynomial predictions out? It looks like my
> results
> >>> depend on how I structure the newdata data frame - when I use
> >>> expand.grid(), the predictions are wrong, but when I subset the
> original
> >>> data, the predictions are correct.
> >>>
> >>> Thanks so much!
> >>> John
> >>>
> >>> Here's an example:
> >>>
> >>> library(ggplot2)
> >>> library(glmmTMB)
> >>>
> >>> set.seed(0)
> >>> x <- 1:20
> >>> z <- sample(c("a", "b"), length(x), replace = TRUE)
> >>> y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10,
> 0.03))
> >>> group <- sample(c("i", "ii"), length(x), replace = TRUE)
> >>> df <- data.frame(x = x, y = y, z = z, group = group)
> >>> m <- glmmTMB(y ~ poly(x, 3) * z +
> >>> (1 | group),
> >>> family = nbinom2,
> >>> data = df)
> >>> # prediction on a new grid
> >>> newdata <- expand.grid(x = 1:20, z = unique(df$z))
> >>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> >>> beta.cond = fixef(m)$cond
> >>> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> >>> # prediction in original data frame
> >>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]))
> >>> beta.cond = fixef(m)$cond
> >>> df$Pred1 = as.numeric(X.cond %*% beta.cond)
> >>>
> >>> # the newdata preds are obviously off
> >>> ggplot(df) +
> >>> geom_point(aes(x = x, y = y, colour = z)) +
> >>> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size
> =
> >> 2)
> >>> +
> >>> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
> >>>
> >>> # if the new grid is defined like this, then the predictions are ok
> >>> newdata <- unique(select(df, x, z))
> >>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> >>> beta.cond = fixef(m)$cond
> >>> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> >>> ggplot(df) +
> >>> geom_point(aes(x = x, y = y, colour = z)) +
> >>> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size
> =
> >> 2)
> >>> +
> >>> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> _______________________________________________
> >>> R-sig-mixed-***@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >>>
> >>
> >> --
> >> Dimitris Rizopoulos
> >> Professor of Biostatistics
> >> Department of Biostatistics
> >> Erasmus University Medical Center
> >>
> >> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
> >> Tel: +31/(0)10/7043478
> >> Fax: +31/(0)10/7043014
> >> Web (personal): http://www.drizopoulos.com/
> >> Web (work): http://www.erasmusmc.nl/biostatistiek/
> >> Blog: http://iprogn.blogspot.nl/
> >
> > [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-***@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
>
> _______________________________________________
> R-sig-mixed-***@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

[[alternative HTML version deleted]]

Ben Bolker

2018-10-24 14:51:02 UTC

Permalink

On 2018-10-24 10:24 a.m., John Wilson wrote:
> So far it's looking just grand! +1 for "it wasn't documented until
> about 120 seconds ago".
> Just to push my luck here - in my full model, I have autocorrelation
> structures. When I only set group = NA, the predict() function gave me
> an error due to lack of the autocorrelation variable in the new data. I
> then crossed my fingers and did the same thing as for group - a column
> with NAs. The predict() function had no problem with that. However, I
> wanted to see if that's a (semi) legit way to doing it.

Seems reasonable (but I can't give any guarantees without spending a
lot more time looking); I would try some simple tests and proceed if the
answers look OK.

>
> On Tue, Oct 23, 2018 at 8:22 PM Ben Bolker <***@gmail.com
> <mailto:***@gmail.com>> wrote:
>
>
> Naive prediction with data-dependent bases **will not work** with new
> input variables. (This is a general R/model.matrix() thing ... not
> glmmTMB's fault.) The current development version of glmmTMB does some
> magic (see ?makepredictcall,
> https://developer.r-project.org/model-fitting-functions.html for more
> detail) that makes this work.
>
> It wasn't documented until about 120 seconds ago, but in order to do
> population-level predictions with predict() all you need to do is set
> the group variable to NA. This has less flexibility than the re.form=
> argument in lme4 (which allows you to drop a *subset* of the random
> effects terms for a given grouping variables), but it does handle the
> most common use case ...
>
> Does this work for you (with devel version of glmmTMB) ?
>
> # prediction on a new grid
> newdata <- expand.grid(x = 1:20, z = unique(df$z), group=NA)
> newdata$y <- predict(m,newdata=newdata, type="response")
>
> (ggplot(df, aes(x,y, colour=z))
> + geom_point()
> + geom_line(data = newdata, size =2)
> )
>
>
> On 2018-10-23 6:48 p.m., John Wilson wrote:
> > Thank you, Dimitris. I poked around a bit but didn't see mention of
> > autocorrelation structures - are those supported as well?
> >
> > I was wondering if this issue was related to the problem logged here:
> > https://github.com/glmmTMB/glmmTMB/issues/402. Can anyone comment on
> > whether what I'm seeing is a bug or just a user (=me) mistake? When I
> > change the toy example to be a simple linear form (rather than
> poly()), the
> > two predictions are identical, so I'm pretty sure it's a poly()
> issue...
> >
> > On Tue, Oct 23, 2018 at 4:31 PM D. Rizopoulos
> <***@erasmusmc.nl <mailto:***@erasmusmc.nl>>
> > wrote:
> >
> >> You could fit the same model also using the GLMMadaptive package in
> >> which the predict() method can produce predictions for the mean
> subject
> >> (i.e., the one with random effects values equal to 0), marginal
> >> predictions (i.e., averaged over the subjects), and subject-specific
> >> predictions.
> >>
> >> For more info you can have a look at:
> >>
> >> https://drizopoulos.github.io/GLMMadaptive/
> >>
> >>
> >>
> https://drizopoulos.github.io/GLMMadaptive/articles/Methods_MixMod.html#predictions
> >>
> >> and potentially also
> >>
> >>
> >>
> https://drizopoulos.github.io/GLMMadaptive/articles/Dynamic_Predictions.html
> >>
> >> if you're interested in dynamic predictions.
> >>
> >> Best,
> >> Dimitris
> >>
> >>
> >>
> >> On 10/23/2018 7:46 PM, John Wilson wrote:
> >>> Hello,
> >>>
> >>> I'm working on a glmmtmb() model with multiple continuous and
> categorical
> >>> predictors. Two of the predictors are orthogonal polynomials (I
> just saw
> >>> that the package was updated yesterday (!) to correctly handle
> those).
> >> One
> >>> of the polynomials has an interaction with another covariate.
> >>>
> >>> Since predict(re.form = 0) doesn't work just yet and one has to use
> >>> the model.matrix(lme4::nobars(formula(mod1)[-2]), newdata)
> approach - how
> >>> do I get the correct polynomial predictions out? It looks like
> my results
> >>> depend on how I structure the newdata data frame - when I use
> >>> expand.grid(), the predictions are wrong, but when I subset the
> original
> >>> data, the predictions are correct.
> >>>
> >>> Thanks so much!
> >>> John
> >>>
> >>> Here's an example:
> >>>
> >>> library(ggplot2)
> >>> library(glmmTMB)
> >>>
> >>> set.seed(0)
> >>> x <- 1:20
> >>> z <- sample(c("a", "b"), length(x), replace = TRUE)
> >>> y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x),
> 10, 0.03))
> >>> group <- sample(c("i", "ii"), length(x), replace = TRUE)
> >>> df <- data.frame(x = x, y = y, z = z, group = group)
> >>> m <- glmmTMB(y ~ poly(x, 3) * z +
> >>> (1 | group),
> >>> family = nbinom2,
> >>> data = df)
> >>> # prediction on a new grid
> >>> newdata <- expand.grid(x = 1:20, z = unique(df$z))
> >>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> >>> beta.cond = fixef(m)$cond
> >>> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> >>> # prediction in original data frame
> >>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]))
> >>> beta.cond = fixef(m)$cond
> >>> df$Pred1 = as.numeric(X.cond %*% beta.cond)
> >>>
> >>> # the newdata preds are obviously off
> >>> ggplot(df) +
> >>> geom_point(aes(x = x, y = y, colour = z)) +
> >>> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour =
> z), size =
> >> 2)
> >>> +
> >>> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
> >>>
> >>> # if the new grid is defined like this, then the predictions are ok
> >>> newdata <- unique(select(df, x, z))
> >>> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> >>> beta.cond = fixef(m)$cond
> >>> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> >>> ggplot(df) +
> >>> geom_point(aes(x = x, y = y, colour = z)) +
> >>> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour =
> z), size =
> >> 2)
> >>> +
> >>> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
> >>>
> >>> [[alternative HTML version deleted]]
> >>>
> >>> _______________________________________________
> >>> R-sig-mixed-***@r-project.org
> <mailto:R-sig-mixed-***@r-project.org> mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >>>
> >>
> >> --
> >> Dimitris Rizopoulos
> >> Professor of Biostatistics
> >> Department of Biostatistics
> >> Erasmus University Medical Center
> >>
> >> Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
> >> Tel: +31/(0)10/7043478
> >> Fax: +31/(0)10/7043014
> >> Web (personal): http://www.drizopoulos.com/
> >> Web (work): http://www.erasmusmc.nl/biostatistiek/
> >> Blog: http://iprogn.blogspot.nl/
> >
> > [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-***@r-project.org
> <mailto:R-sig-mixed-***@r-project.org> mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
>
> _______________________________________________
> R-sig-mixed-***@r-project.org
> <mailto:R-sig-mixed-***@r-project.org> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

D. Rizopoulos

2018-10-24 02:16:57 UTC

Permalink

Indeed, currently in GLMMadaptive for the random effects you can assume an unstructured or a diagonal covariance matrix. In my future plans is to also include the compound symmetric one.

However, in general, for categorical multivariate outcome data Y, note that assuming a particular covariance structure for the unobserved random effects b, does *not* translate to the same covariance structure for the marginal distribution of the observed Y. This is because of the nonlinear link function used, and the fact that the marginal distribution p(Y) = \int p(Y | b) p(b) db doesnt even have a closed-form. Hence, the estimated parameters from such a covariance structure for the random effects in this case do not have any directly meaningful interpretation.

Best,
Dimitris

From: John Wilson <***@gmail.com<mailto:***@gmail.com>>
Date: Wednesday, 24 Oct 2018, 12:48 AM
To: D. Rizopoulos <***@erasmusmc.nl<mailto:***@erasmusmc.nl>>
Cc: r-sig-mixed-***@r-project.org <r-sig-mixed-***@r-project.org<mailto:r-sig-mixed-***@r-project.org>>
Subject: Re: [R-sig-ME] population-level predict glmmtmb with poly()

Thank you, Dimitris. I poked around a bit but didn't see mention of autocorrelation structures - are those supported as well?

I was wondering if this issue was related to the problem logged here: https://github.com/glmmTMB/glmmTMB/issues/402. Can anyone comment on whether what I'm seeing is a bug or just a user (=me) mistake? When I change the toy example to be a simple linear form (rather than poly()), the two predictions are identical, so I'm pretty sure it's a poly() issue...

On Tue, Oct 23, 2018 at 4:31 PM D. Rizopoulos <***@erasmusmc.nl<mailto:***@erasmusmc.nl>> wrote:
You could fit the same model also using the GLMMadaptive package in
which the predict() method can produce predictions for the mean subject
(i.e., the one with random effects values equal to 0), marginal
predictions (i.e., averaged over the subjects), and subject-specific
predictions.

For more info you can have a look at:

https://drizopoulos.github.io/GLMMadaptive/

https://drizopoulos.github.io/GLMMadaptive/articles/Methods_MixMod.html#predictions

and potentially also

https://drizopoulos.github.io/GLMMadaptive/articles/Dynamic_Predictions.html

if you're interested in dynamic predictions.

Best,
Dimitris

On 10/23/2018 7:46 PM, John Wilson wrote:
> Hello,
>
> I'm working on a glmmtmb() model with multiple continuous and categorical
> predictors. Two of the predictors are orthogonal polynomials (I just saw
> that the package was updated yesterday (!) to correctly handle those). One
> of the polynomials has an interaction with another covariate.
>
> Since predict(re.form = 0) doesn't work just yet and one has to use
> the model.matrix(lme4::nobars(formula(mod1)[-2]), newdata) approach - how
> do I get the correct polynomial predictions out? It looks like my results
> depend on how I structure the newdata data frame - when I use
> expand.grid(), the predictions are wrong, but when I subset the original
> data, the predictions are correct.
>
> Thanks so much!
> John
>
> Here's an example:
>
> library(ggplot2)
> library(glmmTMB)
>
> set.seed(0)
> x <- 1:20
> z <- sample(c("a", "b"), length(x), replace = TRUE)
> y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
> group <- sample(c("i", "ii"), length(x), replace = TRUE)
> df <- data.frame(x = x, y = y, z = z, group = group)
> m <- glmmTMB(y ~ poly(x, 3) * z +
> (1 | group),
> family = nbinom2,
> data = df)
> # prediction on a new grid
> newdata <- expand.grid(x = 1:20, z = unique(df$z))
> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> beta.cond = fixef(m)$cond
> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> # prediction in original data frame
> X.cond = model.matrix(lme4::nobars(formula(m)[-2]))
> beta.cond = fixef(m)$cond
> df$Pred1 = as.numeric(X.cond %*% beta.cond)
>
> # the newdata preds are obviously off
> ggplot(df) +
> geom_point(aes(x = x, y = y, colour = z)) +
> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size = 2)
> +
> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
>
> # if the new grid is defined like this, then the predictions are ok
> newdata <- unique(select(df, x, z))
> X.cond = model.matrix(lme4::nobars(formula(m)[-2]), newdata)
> beta.cond = fixef(m)$cond
> newdata$Pred1 = as.numeric(X.cond %*% beta.cond)
> ggplot(df) +
> geom_point(aes(x = x, y = y, colour = z)) +
> geom_line(data = newdata, aes(x = x, y = exp(Pred1), colour = z), size = 2)
> +
> geom_line(aes(x = x, y = exp(Pred1), colour = z, linetype = group))
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-***@r-project.org<mailto:R-sig-mixed-***@r-project.org> mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>

--
Dimitris Rizopoulos
Professor of Biostatistics
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web (personal): http://www.drizopoulos.com/
Web (work): http://www.erasmusmc.nl/biostatistiek/
Blog: http://iprogn.blogspot.nl/

[[alternative HTML version deleted]]