[R-sig-ME] standardized coefficients in glmer model

Discussion:

Leeuwen, Casper van

2010-12-11 07:21:47 UTC

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20101211/03ee22d5/attachment.pl>

Ben Bolker

2010-12-11 16:09:12 UTC

Dear R-list,
I'm running a mixed effect logistic regression with both factors and
covariates, an interaction and a random factor.
model <- glmer(intact_binomial ~ species + sex + retention_time +
               body_mass + body_mass * retention_time + (1 | individual),
               family = binomial(link = "logit"))
summary(model)
summary() returns effect sizes given as coefficients of the
different factors. However, I would like to indicate the importance
of the different terms in the model, to determine the relative
importance of, for instance, sex versus body_mass: which one is more
important in explaining my dependent variable?

If all your variables were numeric (which sex is not) then

model_scaled <- glmer(...,data=scale(mydata))

would work: looking at lm.beta and Make.Z in the QuantPsyc package (you
didn't tell us where lm.beta() came from ...), Make.Z seems (as far as I
can tell) to replicate the behavior of the built-in scale() function.
But that approach won't work properly for factors with more than two
levels ...
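
As a rough sketch of the scaling idea (not code from the original post): scale() returns a matrix and fails on a data frame that contains a factor, so one option is to standardize only the numeric columns and leave factors alone. The data frame below is simulated and its column names are assumptions borrowed from the question. A 0/1 response should be left out of the scaling, since the binomial family needs it unchanged.

```r
## Hypothetical predictors standing in for the poster's data
set.seed(42)
mydata <- data.frame(
  sex            = factor(sample(c("F", "M"), 50, replace = TRUE)),
  body_mass      = rnorm(50, mean = 1000, sd = 150),
  retention_time = runif(50, min = 1, max = 48)
)

## Standardize numeric columns only; factors are kept as they are
is_num <- vapply(mydata, is.numeric, logical(1))
mydata_scaled <- mydata
mydata_scaled[is_num] <- lapply(mydata[is_num],
                                function(x) as.numeric(scale(x)))

str(mydata_scaled)
```

The scaled data frame could then be passed as the data argument of the model call; the factor coefficients stay on their original (unstandardized) scale.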

Here's lm.beta:

lm.beta
function (MOD)
{
b <- summary(MOD)$coef[-1, 1]
sx <- sd(MOD$model[-1])
sy <- sd(MOD$model[1])
beta <- b * sx/sy
return(beta)
}

Here's a translation into lmer-land:

lm.beta.lmer <- function(mod) {
   b <- fixef(mod)[-1]               ## fixed-effect coefs, sans intercept
   sd.x <- apply(mod@X[,-1],2,sd)    ## pull out model (design) matrix,
                                     ## drop intercept column, calculate
                                     ## sd of remaining columns
   sd.y <- sd(mod@y)                 ## sd of response
   b*sd.x/sd.y
}

Here's an example, using the Orthodont data from the nlme package:

library(nlme)
data(Orthodont)
dat <- as.data.frame(Orthodont)
detach(package:nlme)

library(lme4)
fm2 <- lmer(distance ~ age + Sex + (age|Subject), data = dat)
lm.beta.lmer(fm2)

For this example (which like yours has Sex, a two-level factor, as its
only non-numeric predictor) we can show that we get the same answer (up
to numeric fuzz) by scale()ing:

pdat <- with(dat,cbind(distance,age,s=as.numeric(Sex)))
pdat <- scale(pdat)
dat2 <- data.frame(pdat,Subject=dat$Subject)

fm3 <- lmer(distance ~ age + s + (age|Subject), data = dat2)
fixef(fm3)

The only remaining question I have is whether it makes sense to scale
by sd(y) in this case -- may not generalize to the GLM case from the LM
case? But you should have the correct *relative* magnitudes of
parameters in any case.

good luck,
Ben Bolker

David Duffy

2010-12-11 20:29:23 UTC

model <- glmer (intact_binomial ~
species
+ sex
+ retention_time
+ body_mass
+ body_mass * retention_time
+ (1 | individual)
, family = binomial (link = "logit")
)
summary(model)
summary() returns effect sizes given as coefficients of the different
factors. However, I would like to indicate the importance of the
different terms in the model, to determine the relative importance of,
for instance, sex versus body_mass: which one is more important in
explaining my dependent variable?

Ben Bolker

2010-12-12 02:16:34 UTC

Post by David Duffy

I think the OP is not asking for a pseudo-R^2 or summary of overall
goodness of fit, but standardized regression coefficients ...
given that "species" is one of the predictors, I bet it's not human data.

Leeuwen, Casper van

2010-12-12 02:18:47 UTC

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20101212/6316b57c/attachment.pl>

Ben Bolker

2010-12-12 02:56:46 UTC

What I meant is that in the linear model case, what you get when you
calculate the standardized regression coefficients is expected change in
(y/(sd y)) per unit change in (x/(sd x)). With GLM you don't get this,
because of the link function. The natural analogue would (I think) be
expected change in (link(y)/sd(link(y))), or something like that
[because the assumption is that the relationship is linear on the linear
predictor scale], but naively calculating sd(link(y)) won't work,
because link(y) is often infinite.
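
A small illustration of that last point, using made-up 0/1 responses (the vector y is hypothetical): for a binary response the logit of every observation is infinite, so the standard deviation on the link scale is undefined.

```r
## Hypothetical binary responses
y <- c(0, 1, 1, 0, 1)

## Logit link applied to the raw data: log(y / (1 - y))
link_y <- qlogis(y)
print(link_y)             # every element is -Inf or Inf

## The standard deviation of these values is not a usable number
sd_link_y <- sd(link_y)
print(sd_link_y)
```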

A bit of googling suggests (as with many other times when one wants to
generalize from linear models to GLMs -- e.g. R^2 values) that the
answer is not obvious ...

<http://goliath.ecnext.com/coms2/gi_0199-762729/Six-approaches-to-calculating-standardized.html>
<http://www.nd.edu/~rwilliam/stats3/L06.pdf>

Again, for your purposes I think (?) the most important thing is that
you have the scales of the betas standardized correctly with respect to
each other (as opposed to standardizing the response variable), which
isn't a problem.

Dear list, David and Ben,
Thanks so much for the awesome function: that was exactly what I was
looking for, and it works perfectly on my dataset. It even includes
estimates for the interactions.
However, I'm sorry, but I don't understand your remark that scaling by
sd(y) "may not generalize to the GLM case from the LM case?". I think
scaling the y-values doesn't matter for my x-estimates, but do you mean
this would be different for a GLM than for an LM? Do you think the
scaling of the y-values is incorrect if the regression is non-linear?
Thanks a lot for the suggestion. My data are not human but bird body
mass, which is essentially the same but without BMI. If I understand
you correctly, you say it doesn't make sense to compare estimates
between a binomial term (sex) and a (continuous) covariate (body mass)?
Should I somehow construct a binomial variable from the body mass to be
able to compare the estimates?
Thanks,
Casper
------------------------------------------------------------------------
*From:* David Duffy [mailto:davidD at qimr.edu.au]
*Sent:* Sat 12/11/2010 21:29
*To:* Leeuwen, Casper van
*Cc:* r-sig-mixed-models at r-project.org
*Subject:* Re: [R-sig-ME] standardized coefficients in glmer model

Given this is a logistic regression, there are various more or less
unsatisfactory equivalents of an R2. You might be better off just
comparing effect sizes eg odds ratio (exp(beta)) for sex versus that for
the difference between the first and third quartiles of BMI or
from say BMI=20 to BMI=25 and BMI=30, presuming this is human data.
--
| David Duffy (MBBS PhD) ,-_|\
| email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / *
| Epidemiology Unit, Queensland Institute of Medical Research \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v
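
A minimal sketch of the comparison suggested above, on simulated data (all variable names and effect sizes here are invented for illustration). It uses a plain glm() to stay self-contained; for a glmer fit the same arithmetic would presumably apply to fixef(model) in place of coef(fit).

```r
## Simulated stand-in data: a two-level factor and a continuous covariate
set.seed(1)
n <- 200
dat <- data.frame(
  sex       = factor(sample(c("F", "M"), n, replace = TRUE)),
  body_mass = rnorm(n, mean = 1000, sd = 150)
)
eta <- -0.5 + 0.8 * (dat$sex == "M") + 0.004 * (dat$body_mass - 1000)
dat$intact <- rbinom(n, 1, plogis(eta))

fit <- glm(intact ~ sex + body_mass, family = binomial, data = dat)
b <- coef(fit)

## Odds ratio for sex (M versus F)
or_sex <- exp(b[["sexM"]])

## Odds ratio for moving from the first to the third quartile of body mass
q <- quantile(dat$body_mass, c(0.25, 0.75))
or_mass_iqr <- exp(b[["body_mass"]] * (q[[2]] - q[[1]]))

c(sex = or_sex, body_mass_IQR = or_mass_iqr)
```

Both numbers are odds ratios, so they can be compared directly: one for switching sex, the other for a "typical" change in body mass.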

Simon Sun

2014-05-03 09:29:40 UTC

Dear Ben,

I am a user of your excellent package lme4. During my work I noticed
that the package doesn't provide a function that returns standardized
coefficients.

I found a section of code on your website:
##
lm.beta.lmer <- function(mod) {
b <- fixef(mod)[-1] ## fixed-effect coefs, sans intercept
sd.x <- apply(mod @ X[,-1],2,sd) ## pull out model (design) matrix,
## drop intercept column, calculate
## sd of remaining columns
sd.y <- sd(mod @ y) ## sd of response
b*sd.x/sd.y
}
##

But I couldn't make it work; every time it showed:

Error in apply(Mod@x[, -1], 2, sd) :
  no slot of name "x" for this object of class "lmerMod"

Could you please help me figure it out? Thank you very much!
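
A possible explanation, offered as a sketch rather than an answer from the thread: the class "lmerMod" in the error indicates lme4 version 1.0 or later, where fitted models no longer have X or y slots. In those versions the design matrix and response are available through the getME() accessor, so the function could be adapted along these lines (same logic as the 2010 version, with only the extraction changed):

```r
library(lme4)

lm.beta.lmer <- function(mod) {
  b <- fixef(mod)[-1]                          ## fixed-effect coefs, sans intercept
  X <- getME(mod, "X")                         ## fixed-effects design matrix
  sd.x <- apply(X[, -1, drop = FALSE], 2, sd)  ## sd of each non-intercept column
  sd.y <- sd(getME(mod, "y"))                  ## sd of response
  b * sd.x / sd.y
}

## Same example as earlier in the thread
data(Orthodont, package = "nlme")
dat <- as.data.frame(Orthodont)
fm2 <- lmer(distance ~ age + Sex + (age | Subject), data = dat)
std <- lm.beta.lmer(fm2)
std
```

The caveat raised earlier in the thread still applies: dividing by sd(y) is questionable for a GLMM, although the relative magnitudes of the coefficients are unaffected by it.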

Andrea Cantieni

2014-05-05 14:54:33 UTC

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20140505/3bd790d4/attachment.pl>