[R-sig-ME] Random intercept model- unbalanced cluster

Discussion:

Yashree Mehta

2018-10-25 16:00:27 UTC

Hi,

I am working with a random intercept model on a clustered dataset (repeated
measurements of plots per household). I have the usual "X" vector
of covariates and one ID variable which defines the random
intercept. For example,

Response variable: Production of maize
Covariates: Size, input quantities, soil fertility dummies, etc.
ID variable: Household_ID

However, about 40% of the households own one plot. The number of plots per
household ranges from 1 to 13.

When I estimate the random intercept model using lmer, I can extract a
random intercept for every household, irrespective of its number of plots.

How does lmer treat the households with just 1 plot? Also, is it
theoretically correct to include these observations?

Thank you,

Regards,
Yashree


Yashree Mehta

2018-10-29 21:13:06 UTC


Or is there an alternative method of modeling this subset of households who
only own one plot?

Thank you,

Regards,
Yashree

Ben Bolker

2018-10-29 21:23:16 UTC


In principle lme4 shouldn't have problems with a subset of groups that
have only one observation (although clearly the model will get more
fragile/unreliable the less information is available about within- vs.
among-group variation ...). I'd expect the random effects for groups
with only one observation to be strongly shrunk toward the population
mean ... if in doubt, it can be very useful to simulate a situation
similar to your real data set to see what happens in cases where you
know the real answer ...
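
A minimal sketch of such a simulation, written in Python with only the
standard library rather than with lme4 itself. It is not what lmer does
internally: it assumes a random-intercept model with no covariates, and it
treats the population mean (mu) and the variance components (tau2 among
households, sigma2 within households) as known, whereas lmer estimates them
from the data. Under those assumptions the predicted intercept for a
household with n plots is its raw mean deviation multiplied by
n*tau2 / (n*tau2 + sigma2), so single-plot households get the strongest
shrinkage. All function names and parameter values here are hypothetical.

```python
import random
import statistics


def shrinkage_factor(n_plots, tau2, sigma2):
    """Weight applied to a household's raw mean deviation."""
    return n_plots * tau2 / (n_plots * tau2 + sigma2)


def simulate(n_households=500, share_single=0.4, mu=10.0,
             tau2=1.0, sigma2=1.0, seed=1):
    """Return one (n_plots, true_b, raw_dev, shrunk_dev) tuple per household."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_households):
        # Roughly 40% single-plot households; the rest own 2 to 13 plots.
        n_plots = 1 if rng.random() < share_single else rng.randint(2, 13)
        true_b = rng.gauss(0.0, tau2 ** 0.5)  # true household intercept
        ys = [mu + true_b + rng.gauss(0.0, sigma2 ** 0.5)
              for _ in range(n_plots)]
        raw_dev = statistics.fmean(ys) - mu  # unshrunk household deviation
        # Assumption: mu, tau2 and sigma2 are known (lmer would estimate them).
        shrunk_dev = shrinkage_factor(n_plots, tau2, sigma2) * raw_dev
        rows.append((n_plots, true_b, raw_dev, shrunk_dev))
    return rows


def mean_sq_error(rows, column):
    """Mean squared error of the chosen column against the true intercept."""
    return statistics.fmean([(row[column] - row[1]) ** 2 for row in rows])


rows = simulate()
single = [row for row in rows if row[0] == 1]
multi = [row for row in rows if row[0] > 1]
for label, group in (("1 plot", single), ("2+ plots", multi)):
    # Columns: group, households, MSE of raw deviation, MSE of shrunk prediction
    print(label, len(group), mean_sq_error(group, 2), mean_sq_error(group, 3))
```

Comparing the two error columns for the single-plot group shows how much the
shrunk predictions differ from the raw deviations when the true intercepts
are known. Repeating the exercise with the plot counts and covariates of the
real data set, and fitting with lmer, would be the closer analogue.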

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models