Discussion:
[R-sig-ME] zero-inflated glmmTMB with poly() - confidence band
John Wilson
2018-10-29 23:58:09 UTC
Permalink
Hello,

I've been using the newly documented predict() with group = NA to predict
population-level values, as per the thread here (
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q4/027305.html). A
follow-up question: in the case of a zero-inflated model, how would I go
about to get the 95% CIs for response predictions? Obviously, I can get
them for the count and the zero-component, and without poly(), I would just
follow the example in Brooks et al (2017).

However, I have a poly() predictor; how do I get the 95% CIs if I can't use
model.matrix naively when I have a poly() in the model? The models take a
while to converge, so I don't want to run a full bootstrapping either, if
at all possible.

Thank you!
John

Here's a toy dataset:

library(ggplot2)
library(glmmTMB)

set.seed(0)
x <- 1:20
z <- sample(c("a", "b"), length(x), replace = TRUE)
y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
y[c(2, 3, 5, 11, 13, 19)] <- 0
group <- sample(c("i", "ii"), length(x), replace = TRUE)
df <- data.frame(x = x, y = y, z = z, group = group)

ggplot(df) +
geom_point(aes(x = x, y = y, colour = z))

m <- glmmTMB(y ~ poly(x, 3) * z +
(1 | group),
zi = ~ z,
family = nbinom1,
data = df)
# prediction on a new grid
newdata <- expand.grid(x = 1:20, z = unique(df$z), group = NA)
newdata$Pred <- predict(m, type = "response")
### now how to add CIs?

ggplot(df) +
geom_point(aes(x = x, y = y, colour = z)) +
geom_line(data = newdata, aes(x = x, y = Pred, colour = z, group = z), size
= 2) +
facet_wrap(~ z)

[[alternative HTML version deleted]]
D. Rizopoulos
2018-10-30 08:37:09 UTC
Permalink
You can use still compute the correct design matrix when using poly() on
newdata by working with a terms objects that has an appropriately
defined the 'predvars' attribute. For an example, check the following:

# data and new data for prediction
DF <- data.frame(y = rnorm(10), x = rnorm(10))
newDF <- data.frame(x = rnorm(10))

# *wrong* design matrix X on new data
X_new1 <- model.matrix(~ poly(x, 3), data = newDF)

# correct design matrix on new data
termsX <- terms(model.frame(y ~ poly(x, 3), data = DF))
# check predvars attribute
attr(termsX, "predvars")
X_new2 <- model.matrix(delete.response(termsX), data = newDF)

head(X_new1)
head(X_new2)

Best,
Dimitris
Post by John Wilson
Hello,
I've been using the newly documented predict() with group = NA to predict
population-level values, as per the thread here (
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q4/027305.html). A
follow-up question: in the case of a zero-inflated model, how would I go
about to get the 95% CIs for response predictions? Obviously, I can get
them for the count and the zero-component, and without poly(), I would just
follow the example in Brooks et al (2017).
However, I have a poly() predictor; how do I get the 95% CIs if I can't use
model.matrix naively when I have a poly() in the model? The models take a
while to converge, so I don't want to run a full bootstrapping either, if
at all possible.
Thank you!
John
library(ggplot2)
library(glmmTMB)
set.seed(0)
x <- 1:20
z <- sample(c("a", "b"), length(x), replace = TRUE)
y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
y[c(2, 3, 5, 11, 13, 19)] <- 0
group <- sample(c("i", "ii"), length(x), replace = TRUE)
df <- data.frame(x = x, y = y, z = z, group = group)
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z))
m <- glmmTMB(y ~ poly(x, 3) * z +
(1 | group),
zi = ~ z,
family = nbinom1,
data = df)
# prediction on a new grid
newdata <- expand.grid(x = 1:20, z = unique(df$z), group = NA)
newdata$Pred <- predict(m, type = "response")
### now how to add CIs?
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z)) +
geom_line(data = newdata, aes(x = x, y = Pred, colour = z, group = z), size
= 2) +
facet_wrap(~ z)
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Dimitris Rizopoulos
Professor of Biostatistics
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web (personal): http://www.drizopoulos.com/
Web (work): http://www.erasmusmc.nl/biostatistiek/
Blog: http://iprogn.blogspot.nl/
John Wilson
2018-11-05 12:19:12 UTC
Permalink
This worked like a charm, thank you for setting me straight!
Post by D. Rizopoulos
You can use still compute the correct design matrix when using poly() on
newdata by working with a terms objects that has an appropriately
# data and new data for prediction
DF <- data.frame(y = rnorm(10), x = rnorm(10))
newDF <- data.frame(x = rnorm(10))
# *wrong* design matrix X on new data
X_new1 <- model.matrix(~ poly(x, 3), data = newDF)
# correct design matrix on new data
termsX <- terms(model.frame(y ~ poly(x, 3), data = DF))
# check predvars attribute
attr(termsX, "predvars")
X_new2 <- model.matrix(delete.response(termsX), data = newDF)
head(X_new1)
head(X_new2)
Best,
Dimitris
Post by John Wilson
Hello,
I've been using the newly documented predict() with group = NA to predict
population-level values, as per the thread here (
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2018q4/027305.html). A
follow-up question: in the case of a zero-inflated model, how would I go
about to get the 95% CIs for response predictions? Obviously, I can get
them for the count and the zero-component, and without poly(), I would
just
Post by John Wilson
follow the example in Brooks et al (2017).
However, I have a poly() predictor; how do I get the 95% CIs if I can't
use
Post by John Wilson
model.matrix naively when I have a poly() in the model? The models take a
while to converge, so I don't want to run a full bootstrapping either, if
at all possible.
Thank you!
John
library(ggplot2)
library(glmmTMB)
set.seed(0)
x <- 1:20
z <- sample(c("a", "b"), length(x), replace = TRUE)
y <- round(5 * 2*x + 3 * x^2 + 0.1 * x^3 + rnbinom(length(x), 10, 0.03))
y[c(2, 3, 5, 11, 13, 19)] <- 0
group <- sample(c("i", "ii"), length(x), replace = TRUE)
df <- data.frame(x = x, y = y, z = z, group = group)
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z))
m <- glmmTMB(y ~ poly(x, 3) * z +
(1 | group),
zi = ~ z,
family = nbinom1,
data = df)
# prediction on a new grid
newdata <- expand.grid(x = 1:20, z = unique(df$z), group = NA)
newdata$Pred <- predict(m, type = "response")
### now how to add CIs?
ggplot(df) +
geom_point(aes(x = x, y = y, colour = z)) +
geom_line(data = newdata, aes(x = x, y = Pred, colour = z, group = z),
size
Post by John Wilson
= 2) +
facet_wrap(~ z)
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Dimitris Rizopoulos
Professor of Biostatistics
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web (personal): http://www.drizopoulos.com/
Web (work): http://www.erasmusmc.nl/biostatistiek/
Blog: http://iprogn.blogspot.nl/
[[alternative HTML version deleted]]

Loading...