Voeten, C.C.
2018-08-21 17:41:04 UTC
When doing forward stepwise regression, a computationally efficient way to choose the next term to add to the model from a set of candidate predictors is to calculate the correlation of each of these predictors with the residuals of the current working model. E.g., if my current model is lm(y ~ something, data=data), and I need to choose which of a set of predictors {b1, b2, b3} to add next, the predictor I should pick is the one with the largest absolute value in sapply(data[,c('b1','b2','b3')], cor, resid(current_model)).
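To make the screening step concrete, here is a minimal sketch on simulated data. The data frame, the effect sizes, and the seed are all made up for illustration; only the sapply()/cor()/resid() call comes from the description above. The absolute value is taken so that a strongly negative correlation counts as much as a positive one.

```r
# Simulated data (hypothetical): y depends on 'something' and on b2 only
set.seed(1)
n <- 200
data <- data.frame(something = rnorm(n), b1 = rnorm(n),
                   b2 = rnorm(n), b3 = rnorm(n))
data$y <- 1 + 0.5 * data$something + 0.8 * data$b2 + rnorm(n)

# Current working model
current_model <- lm(y ~ something, data = data)

# Correlation of each candidate predictor with the current residuals
scores <- sapply(data[, c("b1", "b2", "b3")], cor, resid(current_model))

# Pick the candidate with the largest absolute correlation
best <- names(which.max(abs(scores)))
print(scores)
print(best)
```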
How does this extend to the random effects of mixed-effects models?
I can foresee two issues. The first is that a random effect cannot be represented by a single column vector out of a data set, so we can't use cor(). However, it should be possible to instead regress the residuals on each of the candidate random effects, and select the effect that gives the largest log-likelihood.
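A rough sketch of what that residual-based screening could look like, assuming lme4 and its bundled sleepstudy data as a stand-in (the candidate terms and the intercept-only formula for the residuals are choices made for this illustration, not an established procedure). Whether the ranking it gives matches the ranking from refitting the full models is exactly the open question.

```r
library(lme4)

# Stand-in data and current model: fixed effects only, no random effects yet
d <- sleepstudy
current_model <- lm(Reaction ~ Days, data = d)
d$r <- resid(current_model)

# Candidate random-effect terms (hypothetical set for illustration)
candidates <- c("(1 | Subject)", "(0 + Days | Subject)")

# Regress the residuals on each candidate term and record the log-likelihood
screen <- sapply(candidates, function(term) {
  f <- as.formula(paste("r ~ 1 +", term))
  as.numeric(logLik(lmer(f, data = d, REML = FALSE)))
})

# The candidate with the largest log-likelihood would be added next
best <- names(which.max(screen))
print(screen)
print(best)
```

ML rather than REML is used here so that the log-likelihoods are comparable across candidates in the usual sense; all candidate fits share the same (intercept-only) fixed part, so this choice matters less than it would otherwise.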
A second issue I could think of is that the parameters will be optimized differently. The theta parameters will be optimized sequentially instead of jointly: every future predictor added to the model will be evaluated with the theta parameters from the preceding random effects treated as fixed. I am unsure what impact this will have -- is this known (or perhaps even obvious)?
My use case is that I often find that I have to fit large models with multiple crossed random slopes, and I know that the full model will never converge. I want to be sure that the random effects which I do include are the best possible ones I could have chosen. What I do now is start out with all fixed effects, and try all my random effects one at a time (respecting marginality), and so on, until I have identified the maximal model that will still converge. This works well, but is computationally very, very wasteful. I was wondering if this more efficient approach used in simple linear models (using the correlation of the candidate predictors with the current model's residuals) could in any way be applied to mixed models as well, and at what cost...
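For reference, one step of the brute-force procedure described above can be sketched as follows, again with lme4's sleepstudy data as a stand-in. The candidate list is hypothetical, and treating "no lme4 convergence messages" as "converged" is an assumption of this sketch rather than an official criterion.

```r
library(lme4)

# Candidate random-effect terms to try on top of the fixed effects
candidates <- c("(1 | Subject)", "(0 + Days | Subject)")

# Refit the full model once per candidate and record fit and convergence
try_candidate <- function(term) {
  f <- as.formula(paste("Reaction ~ Days +", term))
  fit <- lmer(f, data = sleepstudy, REML = FALSE)
  data.frame(
    term = term,
    logLik = as.numeric(logLik(fit)),
    # Assumption: absence of lme4 convergence messages means "converged"
    converged = is.null(fit@optinfo$conv$lme4$messages)
  )
}
results <- do.call(rbind, lapply(candidates, try_candidate))

# Keep the best-fitting candidate among those that converged
ok <- results[results$converged, ]
best <- ok$term[which.max(ok$logLik)]
print(results)
print(best)
```

Each step costs one full model fit per remaining candidate, which is where the computational waste comes from.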
Thanks,
Cesko