Discussion:
[R-sig-ME] glmmTMB and post hoc testing
Smith, Desmond
2018-07-09 00:03:05 UTC
Permalink
Dear All,

I have previously tested a mixed model using lmer. However, since the dependent variable is count data, which is over-dispersed, it seems that I should use a negative binomial generalized linear mixed model (GLMM), such as glmmTMB.

The full data frame is:


dim(example)

[1] 115 7


Here are the first six rows of the full 115 row data frame:


head(example)
fixed1 fixed2 random nested_random counts id log_total_counts
1 0 0 1 1 643 1 12.89582
1001 0 0 2 6 585 1 13.67509
2001 0 0 3 11 846 1 13.94209
3001 0 0 4 16 755 1 13.93056
4001 0 0 5 21 1428 1 13.65672
5001 0 0 6 26 1566 1 13.64421

str(example)
'data.frame': 115 obs. of 7 variables:
$ fixed1 : num 0 0 0 0 0 0 1 1 1 1 ...
$ fixed2 : num 0 0 0 0 0 0 0 0 0 0 ...
$ random : num 1 2 3 4 5 6 1 2 3 4 ...
$ nested_random : num 1 6 11 16 21 26 2 7 12 17 ...
$ counts : int 643 585 846 755 1428 1566 309 719 1542 1446 ...
$ id : int 1 1 1 1 1 1 1 1 1 1 ...
$ log_total_counts: num 12.9 13.7 13.9 13.9 13.7 ...

There are six values of fixed1, plus a zero:

unique(example$fixed1)

[1] 0 1 2 3 4 6



The mean of the six values of fixed1 is

mean(c(1,2,3,4,6))
[1] 3.2

There are four values of fixed2, including zero:

unique(example$fixed2)

[1] 0 25 75 8

The mean of the four values of fixed1 is
mean(unique(example$fixed2))
[1] 27


Here is the lmer model I used:

library(lme4)

m1 <- lmer(counts ~ fixed1*fixed2 + (1|random/nested_random) + offset(log_total_counts), data = example, verbose=FALSE)

Followed by the use of glht for post-hoc analysis:

library(multcomp)

glht_fixed1 <- glht(m1, linfct = c(
"fixed1 == 0",
"fixed1 + 8*fixed1:fixed2 == 0",
"fixed1 + 25*fixed1:fixed2 == 0",
"fixed1 + 75*fixed1:fixed2 == 0",
"fixed1 + (27)*fixed1:fixed2 == 0"))

glht_fixed2 <- glht(m1, linfct = c(
"fixed2 + 1*fixed1:fixed2 == 0",
"fixed2 + 2*fixed1:fixed2 == 0",
"fixed2 + 3*fixed1:fixed2 == 0",
"fixed2 + 4*fixed1:fixed2 == 0",
"fixed2 + 6*fixed1:fixed2 == 0",
"fixed2 + (3.2)*fixed1:fixed2 == 0"))

glht_omni <- glht(m1)

Here is the corresponding negative binomial glmmTMB model:

library(glmmTMB)

m2 <- glmmTMB(counts ~ fixed1*fixed2 + (1|random/nested_random) + offset(log_total_counts), data = example, verbose=FALSE, family="nbinom2")

According to this suggestion by Ben Bolker (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2017q3/025813.html), the best approach to post hoc testing with glmmTMB is to use lsmeans (or its more recent equivalent, emmeans).

After running

source(system.file("other_methods","lsmeans_methods.R",package="glmmTMB"))

I can use lsmeans on the glmmTMB object. For example,

as.glht(emmeans(m1,~(week + 27*week:conc)))

General Linear Hypotheses

Linear Hypotheses:
Estimate
3.11304347826087, 21 == 0 -8.813

But this does not seem correct. What I cannot figure out, for the life of me, is how to correctly carry over the use of linfct to the lsmeans scenario on the glmmTMB. I have read all the manuals and vignettes until I am blue in face (or it feels that way, at least), but I am still at a loss. In my defense (culpability?) I am a statistical tyro, so many apologies if I am asking a question with very obvious answers here.

The glht software and post hoc testing works easily with the glmmADMB package, but the glmmADMB software is 10x slower than glmmTMB. I need to perform multiple runs of this analysis, each with 300,000 examples of the mixed model, so speed is essential.

Many thanks for your suggestions or help!

Cheers,

Des



.

________________________________

UCLA HEALTH SCIENCES IMPORTANT WARNING: This email (and any attachments) is only intended for the use of the person or entity to which it is addressed, and may contain information that is privileged and confidential. You, the recipient, are obligated to maintain it in a safe, secure and confidential manner. Unauthorized redisclosure or failure to maintain confidentiality may subject you to federal and state penalties. If you are not the intended recipient, please immediately notify us by return email, and delete this message from your computer.

[[alternative HTML version deleted]]
Ben Bolker
2018-07-09 15:51:04 UTC
Permalink
Post by Smith, Desmond
Dear All,
I have previously tested a mixed model using lmer. However, since the
dependent variable is count data, which is over-dispersed, it seems
that I should use a negative binomial generalized linear mixed model
(GLMM), such as glmmTMB.
That is certainly one possibility.
http://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#overdispersion
offers a variety of other alternatives.

I haven't had a chance to look into the details below yet, but you
could try updating to a development branch of `glmmTMB` that has more
integrated emmeans support:

install_github("glmmTMB/glmmTMB/***@effects")

(you would need compilation tools etc. installed). After that check the
vignette - it may not be built/installed properly, but you can find it here:

https://github.com/glmmTMB/glmmTMB/blob/effects/glmmTMB/vignettes/model_evaluation.rmd
Post by Smith, Desmond
dim(example)
[1] 115 7
head(example) fixed1 fixed2 random nested_random counts id
log_total_counts 1 0 0 1 1 643 1
12.89582 1001 0 0 2 6 585 1
13.67509 2001 0 0 3 11 846 1
13.94209 3001 0 0 4 16 755 1
13.93056 4001 0 0 5 21 1428 1
13.65672 5001 0 0 6 26 1566 1
13.64421
str(example) 'data.frame': 115 obs. of 7 variables: $ fixed1
: num 0 0 0 0 0 0 1 1 1 1 ... $ fixed2 : num 0 0 0 0 0 0 0
0 0 0 ... $ random : num 1 2 3 4 5 6 1 2 3 4 ... $
nested_random : num 1 6 11 16 21 26 2 7 12 17 ... $ counts
: int 643 585 846 755 1428 1566 309 719 1542 1446 ... $ id
: int 1 1 1 1 1 1 1 1 1 1 ... $ log_total_counts: num 12.9 13.7
13.9 13.9 13.7 ...
unique(example$fixed1)
[1] 0 1 2 3 4 6
The mean of the six values of fixed1 is
mean(c(1,2,3,4,6)) [1] 3.2
unique(example$fixed2)
[1] 0 25 75 8
The mean of the four values of fixed1 is
mean(unique(example$fixed2)) [1] 27
library(lme4)
m1 <- lmer(counts ~ fixed1*fixed2 + (1|random/nested_random) +
offset(log_total_counts), data = example, verbose=FALSE)
library(multcomp)
glht_fixed1 <- glht(m1, linfct = c( "fixed1 == 0", "fixed1 +
8*fixed1:fixed2 == 0", "fixed1 + 25*fixed1:fixed2 == 0", "fixed1 +
75*fixed1:fixed2 == 0", "fixed1 + (27)*fixed1:fixed2 == 0"))
glht_fixed2 <- glht(m1, linfct = c( "fixed2 + 1*fixed1:fixed2 == 0",
"fixed2 + 2*fixed1:fixed2 == 0", "fixed2 + 3*fixed1:fixed2 == 0",
"fixed2 + 4*fixed1:fixed2 == 0", "fixed2 + 6*fixed1:fixed2 == 0",
"fixed2 + (3.2)*fixed1:fixed2 == 0"))
glht_omni <- glht(m1)
library(glmmTMB)
m2 <- glmmTMB(counts ~ fixed1*fixed2 + (1|random/nested_random) +
offset(log_total_counts), data = example, verbose=FALSE,
family="nbinom2")
According to this suggestion by Ben Bolker
(https://stat.ethz.ch/pipermail/r-sig-mixed-models/2017q3/025813.html),
the best approach to post hoc testing with glmmTMB is to use lsmeans
(or its more recent equivalent, emmeans).
After running
source(system.file("other_methods","lsmeans_methods.R",package="glmmTMB"))
I can use lsmeans on the glmmTMB object. For example,
as.glht(emmeans(m1,~(week + 27*week:conc)))
General Linear Hypotheses
Linear Hypotheses: Estimate 3.11304347826087, 21 == 0 -8.813
But this does not seem correct. What I cannot figure out, for the
life of me, is how to correctly carry over the use of linfct to the
lsmeans scenario on the glmmTMB. I have read all the manuals and
vignettes until I am blue in face (or it feels that way, at least),
but I am still at a loss. In my defense (culpability?) I am a
statistical tyro, so many apologies if I am asking a question with
very obvious answers here.
The glht software and post hoc testing works easily with the glmmADMB
package, but the glmmADMB software is 10x slower than glmmTMB. I need
to perform multiple runs of this analysis, each with 300,000 examples
of the mixed model, so speed is essential.
Many thanks for your suggestions or help!
Cheers,
Des
.
________________________________
UCLA HEALTH SCIENCES IMPORTANT WARNING: This email (and any
attachments) is only intended for the use of the person or entity to
which it is addressed, and may contain information that is privileged
and confidential. You, the recipient, are obligated to maintain it in
a safe, secure and confidential manner. Unauthorized redisclosure or
failure to maintain confidentiality may subject you to federal and
state penalties. If you are not the intended recipient, please
immediately notify us by return email, and delete this message from
your computer.
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Loading...