[R-sig-ME] Group-level predictors which impact the random intercept

Discussion:

Yashree Mehta

2018-06-10 17:00:21 UTC

Hello,

I recently posted the following question about the syntax for adding
group-level predictors to a random intercept model:

"""""

Hi,
I am working with a random intercept model. I have the usual "X" vector of
covariates and one id variable which will make up the random intercept. Now
I wish to add group-level predictors (which are NOT in the X vector) such
that the random intercept depends on these predictors.
For example,
Response variable: Production of maize
Covariate: Size of plot
Group-level predictor: Age of farmer
ID variable: Household_ID

I wish to confirm the syntax for including the group-level "Age of farmer"
variable.
fit <- lmer(Production ~ Size + Age + (1 | Household_ID), data = data)

Is this correct or is there another way of declaring the group-level
predictor in the formula?

"""""

This syntax was confirmed as correct. Now I am wondering how lmer
actually distinguishes between the usual X covariates and group-level
predictors. We have not differentiated them in the formula. How does
lmer construe Age as affecting only the random intercept?

Thank you very much,

Regards,

Yashree

Douglas Bates

2018-06-11 16:11:04 UTC

Thank you for transferring the discussion over to the R-SIG-Mixed-Models
group.

As I mentioned in the email discussion, the issue of covariates in the
fixed-effects terms and whether or not they vary within the levels of a
grouping factor for random-effects terms is a consequence of the way the
model is described in the multilevel modeling literature. In other words,
there is no inherent problem with defining a mixed-effects model involving
a fixed-effect for Age even though Age does not change within
Household_ID. When multilevel models were being formulated many years ago,
an approach to estimating the parameters leaked over into the model
definition. It became important to formulate models within models
within ..., but that approach is unnecessary and has led to many
misconceptions. Furthermore, the approach is too restrictive. A
multilevel model cannot accommodate crossed random effects, such as subject
and item, or partially crossed random effects such as child, teacher and
school in longitudinal data.
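
To make the point above concrete, here is a minimal sketch in R. The maize
data are simulated and entirely hypothetical (the sample sizes, coefficients
and variable ranges are assumptions chosen for illustration); only the
variable names follow the original question. It shows that Age, although
constant within Household_ID, enters the model as an ordinary column of the
fixed-effects model matrix, and that a crossed design can be specified in the
same formula notation, here using the Penicillin data shipped with lme4.

```r
library(lme4)

# Simulated, hypothetical data: 40 households with 5 plots each.
set.seed(42)
n_hh   <- 40
n_plot <- 5
dat <- data.frame(
  Household_ID = factor(rep(seq_len(n_hh), each = n_plot)),
  Age  = rep(round(runif(n_hh, 25, 70)), each = n_plot),  # constant within household
  Size = runif(n_hh * n_plot, 0.5, 3)                     # varies within household
)
hh_effect <- rnorm(n_hh, sd = 2)                          # household-level deviations
dat$Production <- 5 + 3 * dat$Size + 0.05 * dat$Age +
  hh_effect[as.integer(dat$Household_ID)] + rnorm(nrow(dat))

fit <- lmer(Production ~ Size + Age + (1 | Household_ID), data = dat)

# Size and Age are both plain columns of the fixed-effects model matrix X.
x_cols <- colnames(getME(fit, "X"))
print(x_cols)

# Age takes a single value per household; the formula never had to say so.
age_values_per_hh <- tapply(dat$Age, dat$Household_ID,
                            function(a) length(unique(a)))
print(max(age_values_per_hh))

# Crossed random effects: every sample is assayed on every plate.
fit_crossed <- lmer(diameter ~ 1 + (1 | plate) + (1 | sample),
                    data = Penicillin)
print(ngrps(fit_crossed))
```

In this sketch nothing marks Age as "group-level". The fixed-effect
coefficient for Age simply shifts the expected Production for all plots of
a household, which is the same thing the multilevel formulation expresses as
a predictor of the household intercept.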

To me one of the most important innovations in the lme4 package was to
reformulate the evaluation of the deviance for a linear mixed-effects model
as a penalized least squares problem and to employ a sparse Cholesky
factorization to solve a modified version of Henderson's mixed-model
equations. This is described in our 2015 Journal of Statistical Software
paper. It is not important for every user of the lme4 package to
understand the mathematics of the derivation but it helps to know that the
model can be formulated and the parameters can be estimated as described
there. The fact that other and, I think it is fair to say, inferior
formulations and estimation methods exist is not relevant.
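
As a rough illustration of the penalized least squares idea, the sketch
below refits a random intercept model on simulated, hypothetical data and
then, at the estimated covariance parameter, solves the penalized least
squares normal equations with ordinary dense matrices. This is not how lme4
computes the solution internally (it uses a sparse Cholesky factorization);
the dense solve and all object names here are assumptions made for
readability. The model matrices are extracted with getME().

```r
library(lme4)

# Simulated, hypothetical data: 40 households with 5 plots each.
set.seed(1)
n_hh   <- 40
n_plot <- 5
dat <- data.frame(
  Household_ID = factor(rep(seq_len(n_hh), each = n_plot)),
  Age  = rep(round(runif(n_hh, 25, 70)), each = n_plot),
  Size = runif(n_hh * n_plot, 0.5, 3)
)
hh_effect <- rnorm(n_hh, sd = 2)
dat$Production <- 5 + 3 * dat$Size + 0.05 * dat$Age +
  hh_effect[as.integer(dat$Household_ID)] + rnorm(nrow(dat))

fit <- lmer(Production ~ Size + Age + (1 | Household_ID), data = dat)

# Pieces of the fitted model, converted to dense matrices for clarity.
X      <- getME(fit, "X")                      # fixed-effects model matrix
Z      <- as.matrix(getME(fit, "Z"))           # random-effects model matrix
Lambda <- as.matrix(getME(fit, "Lambda"))      # relative covariance factor at theta-hat
y      <- getME(fit, "y")                      # response

# Minimize ||y - X beta - Z Lambda u||^2 + ||u||^2 over (u, beta):
# the ||u||^2 penalty adds an identity block to the normal equations.
ZL  <- Z %*% Lambda
q   <- ncol(ZL)
A   <- rbind(cbind(crossprod(ZL) + diag(q), crossprod(ZL, X)),
             cbind(crossprod(X, ZL),        crossprod(X)))
rhs <- c(crossprod(ZL, y), crossprod(X, y))
sol <- solve(A, rhs)

u_hat    <- sol[seq_len(q)]                    # spherical random effects
beta_hat <- sol[q + seq_len(ncol(X))]          # fixed-effects coefficients

# Both should agree with what lmer reports, up to numerical tolerance.
print(all.equal(unname(beta_hat), unname(fixef(fit)), tolerance = 1e-6))
print(all.equal(unname(u_hat), unname(getME(fit, "u")), tolerance = 1e-6))
```

Note that Age plays no special role in this computation: it is one of the
columns of X, and the household structure enters only through Z.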

Yashree Mehta

2018-06-12 10:25:51 UTC

Thank you very much for your response and explanation.

Regards,
Yashree
