Discussion:
[R-sig-ME] Offset vs fixed factor in a mixed poisson model
v_coudrain
2013-01-17 10:08:51 UTC
Permalink
Dear subscribers,
I am testing the effect of a factor on a count variable using a Poisson mixed model. I know that my response variable is linearly influenced by another variable, so
I would like to remove the effect of this second variable to see the true effect of my factor. In an ANOVA, it is usual to enter the covariable first in the model and
use a sequential test (type I SS). However, I am a bit confused about how to control for this covariable in my mixed Poisson model. If I just give the covariable as an
additional fixed variable, my factor is highly significant. If I put it instead as an offset, the factor is not significant at all. I think that it is better to use an offset, but I must
admit that the underlying "theory" is not clear to me. I was also wondering whether we can specify multiple offsets and whether there is some "rule of thumb" for the maximal
number that can be included. Thank you very much.
Best,
Valerie
Ben Bolker
2013-01-17 16:15:57 UTC
Permalink
You can specify as many offsets as you want. The distinction
between an offset and a covariate is that an offset is entered
in the equation *exactly as is*, while a covariate has an estimated
parameter associated with it. For example,

y ~ x1 + offset(log(x2))

would fit the model

y ~ Poisson(lambda=exp(b_1*x1+log(x2))) =
Poisson(lambda=x2*exp(b_1*x1))

whereas

y ~ x1 + log(x2)

would fit the model

y ~ Poisson(lambda=exp(b_1*x1+b_2*log(x2))) =
Poisson(lambda=x2^b_2*exp(b_1*x1))

(the log() are not required but are quite common when
specifying offsets, because that's the way to correct
for a scaling that is known to be proportional; using
y ~ x1 + offset(x2) would give lambda=exp(x2)*exp(b_1*x1)
which is not usually what's desired).

In your case, if the estimated parameter b_2 for the covariate
is nowhere near 1.0, then your offset version is probably
not accounting properly for the effects of the covariate.
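
A minimal sketch of how one might check this in R, assuming simulated data and a plain glm() instead of a mixed model so that it runs without extra packages. All variable names and parameter values below are made up for illustration. The same offset() term can also be written in the formula of lme4's glmer(), if that is the package being used.

```r
# Simulated data: y is proportional to x2, so the true b_2 is 1.
# (x1, x2, y and all numeric values are hypothetical.)
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- runif(n, 1, 20)
y  <- rpois(n, lambda = x2 * exp(0.2 + 0.5 * x1))

# Offset: the coefficient of log(x2) is fixed at exactly 1
m_off <- glm(y ~ x1 + offset(log(x2)), family = poisson)
# Covariate: the coefficient of log(x2) is estimated from the data
m_cov <- glm(y ~ x1 + log(x2), family = poisson)

# Estimate of b_2 with an approximate 95% Wald interval
b2    <- coef(m_cov)[["log(x2)"]]
se_b2 <- sqrt(diag(vcov(m_cov)))[["log(x2)"]]
ci_b2 <- b2 + c(-1.96, 1.96) * se_b2

# The offset model is the covariate model with b_2 = 1, so the two
# are nested and can be compared with a likelihood-ratio test
lrt <- anova(m_off, m_cov, test = "Chisq")

print(ci_b2)
print(lrt)
```

If the interval for b_2 excludes 1, or the likelihood-ratio test clearly prefers the covariate model, the offset is probably not a good description of how the covariate acts.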

Ben Bolker
v_coudrain
2013-01-17 18:31:26 UTC
Permalink
Thank you for the explanation. So I should estimate the parameter for my covariable. Is it correct to run the model with only the log-transformed covariable as an
explanatory variable and look at the estimate? I get an estimated parameter of 0.8, which is quite close to 1. Is there any threshold below which it is not
recommended to use the variable as an offset? How can I control for my covariable if not as an offset, since mixed models do not perform sequential tests
(right?) as in an ANOVA?
Best,
Valerie
Highland Statistics Ltd
2013-01-17 18:50:22 UTC
Permalink
Valerie,
Post by v_coudrain
I am testing the effect of a factor on a count variable using a Poisson mixed model. I know that my response variable is linearly influenced by another variable
Keep in mind that you are using an exponential relationship in a
GLM...at least if you use the log link.
Post by v_coudrain
If I just give the covariable as an additional fixed variable, my factor is highly significant. If I put it instead as an offset, the factor is not significant at all.
If you use a covariate as an offset then you are essentially saying: double
the value of the variable used for the offset, double the numbers
(strictly speaking: the expected value). Quite often sampling effort is
used as an offset as it is not really interesting to model a
cause-effect relationship between sampling effort and your response.

If you have a model with:

glm(y ~ x, family = poisson)
glm(y ~ x + offset(z), family = poisson)

and x is significant in the first model...but not in the second, then
either the offset explains most variation, or x and the offset are
highly correlated? Plot x versus z...and plot x versus log(z)...

Alain
--
Dr. Alain F. Zuur
First author of:

1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7


2. Mixed effects models and extensions in ecology with R. (2009).
Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
http://www.springer.com/life+sci/ecology/book/978-0-387-87457-9


3. A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3


4. Zero Inflated Models and Generalized Linear Mixed Models with R. (2012) Zuur, Saveliev, Ieno.
http://www.highstat.com/book4.htm

Other books: http://www.highstat.com/books.htm


Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: highstat at highstat.com
URL: www.highstat.com
URL: www.brodgar.com
v_coudrain
2013-01-18 20:09:50 UTC
Permalink
Dear Alain,
Post by Highland Statistics Ltd
If you use a covariate as an offset then you are essentially saying: double
the value of the variable used for the offset, double the numbers
(strictly speaking: the expected value).
What do you mean by "double the value"? Does it mean that if the value of the offset doubles, then the expected value of my response variable should double?
And if I have offset(log(x)), will doubling the log of my variable double the expected value of the response variable?
Post by Highland Statistics Ltd
Quite often sampling effort is used as an offset as it is not really interesting to model a
cause-effect relationship between sampling effort and your response.
Indeed, I don't directly have different sampling effort, but I am testing species richness over 3 years in a growing population, such that the abundance of individuals
increased strongly between the years. The situation is quite similar to an increase in sampling effort over the years.
Post by Highland Statistics Ltd
glm(y ~ x, family = poisson)
glm(y ~ x + offset(z), family = poisson)
and x is significant in the first model...but not in the second, then
either the offset explains most variation, or x and the offset are
highly correlated? Plot x versus z...and plot x versus log(z)...
x and z are indeed quite correlated, but it would be "nice" to see if x still explains some variation in my data independently of z.

Ben Bolker suggested that a variable is suitable as an offset only if its estimated parameter is about one. What is your opinion on this?

Best,
Valérie

Highland Statistics Ltd
2013-01-18 20:29:34 UTC
Permalink
Post by v_coudrain
Dear Alain,
Post by Highland Statistics Ltd
If you use a covariate as an offset then you are essentially saying: double
the value of the variable used for the offset, double the numbers
(strictly speaking: the expected value).
What do you mean by "double the value"? Does it mean that if the value of the offset doubles, then the expected value of my response variable should double?
And if I have offset(log(x)), will doubling the log of my variable double the expected value of the response variable?
Valerie,
Yes...indeed that is what the offset is doing. Double the value of
x (x itself, not log(x))....and you assume that the expected value of your response also doubles.
Just write out the equation for a Poisson and you will see:

Y_i ~ Poisson(mu_i)
E(Y_i) = mu_i
mu_i = exp(alpha + beta * z + 1 * log(x))
= x* exp(alpha + beta * z)

Double x....double mu


Keep in mind that when you analyse a ratio you implicitly do the same;
1/2 = 100/ 200 = 0.5
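
The algebra above can be checked numerically with a small sketch. The parameter values are hypothetical and chosen only for illustration; the two functions contrast an offset entered as log(x) with an offset entered as x itself.

```r
# Hypothetical values for alpha, beta, z and x (illustration only)
alpha <- 0.3
beta  <- 0.7
z     <- 1.5
x     <- 4

# Expected value when the offset is log(x): mu = x * exp(alpha + beta * z)
mu_log_offset <- function(x) exp(alpha + beta * z + log(x))
# Expected value when the offset is x itself: mu = exp(x) * exp(alpha + beta * z)
mu_raw_offset <- function(x) exp(alpha + beta * z + x)

# Doubling x doubles mu only in the log(x) case
ratio_log <- mu_log_offset(2 * x) / mu_log_offset(x)
ratio_raw <- mu_raw_offset(2 * x) / mu_raw_offset(x)

print(ratio_log)  # equals 2
print(ratio_raw)  # equals exp(x), not 2
```

So the "double x, double mu" reading applies to the variable inside the log, not to the log-transformed value.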
Post by v_coudrain
Post by Highland Statistics Ltd
Quite often sampling effort is used as an offset as it is not really interesting to model a
cause-effect relationship between sampling effort and your response.
Indeed, I don't directly have different sampling effort, but I am testing species richness over 3 years in a growing population, such that the abundance of individuals
increased strongly between the years. The situation is quite similar to an increase in sampling effort over the years.
Post by Highland Statistics Ltd
glm(y ~ x, family = poisson)
glm(y ~ x + offset(z), family = poisson)
and x is significant in the first model...but not in the second, then
either the offset explains most variation, or x and the offset are
highly correlated? Plot x versus z...and plot x versus log(z)...
x and z are indeed quite correlated, but it would be "nice" to see if x still explains some variation in my data independently of z.
'would be nice' and collinearity don't go together very well.
Post by v_coudrain
Ben Bolker suggested that a variable is suitable as an offset only if its estimated parameter is about one. What is your opinion on this?
Ben is a clever cookie....and he is right.

Alain
v_coudrain
2013-01-18 20:34:44 UTC
Permalink
Thank you very much. There is still a "small" problem. If the estimate for the variable to be set as an offset is not around 1, I should not use it as an offset.
How can I then control for its effect?

Best,
Valérie
Highland Statistics Ltd
2013-01-18 20:55:15 UTC
Permalink
Post by v_coudrain
Thank you very much. There is still a "small" problem. If the estimate for the variable to be set as an offset is not around 1, I should not use it as an offset.
How can I then control for its effect?
What about:

Y_i ~Poisson(mu_i)
log(mu_i) = alpha + beta_1 * x_i + beta_2 * z_i

That's a model where beta_1 shows the partial effect of x_i.....which
means...the effect of x_i while taking into account z_i..and vice versa.

But now your collinearity is going to cause some trouble. I am not sure
whether the partial linear regression equivalent for a Poisson GLMM
exists.....
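
One way to look at the partial effect of x in such a model is a likelihood-ratio comparison of the fits with and without x, which plays a role similar to the sequential test in an ANOVA. The sketch below is only an illustration under stated assumptions: the data are simulated, the names rich, abund and the year levels are hypothetical, z is taken to be the log of abundance, and a plain glm() stands in for the mixed model. With lme4, anova() on two glmer() fits gives the analogous comparison.

```r
# Simulated data (hypothetical): abundance rises with year, so the factor x
# and the covariate z are collinear by construction; richness depends on
# abundance only, with no direct year effect.
set.seed(2)
n     <- 300
x     <- factor(rep(c("y1", "y2", "y3"), each = n / 3))
abund <- rpois(n, lambda = c(20, 40, 80)[as.integer(x)]) + 1
z     <- log(abund)
rich  <- rpois(n, lambda = exp(0.5 + 0.6 * z))

# log(mu_i) = alpha + beta_2 * z_i
m_z  <- glm(rich ~ z, family = poisson)
# log(mu_i) = alpha + beta_1 * x_i + beta_2 * z_i
m_zx <- glm(rich ~ z + x, family = poisson)

# Does x explain anything once z is in the model?
lrt_x <- anova(m_z, m_zx, test = "Chisq")
# Marginal likelihood-ratio test for each term given the other
marg <- drop1(m_zx, test = "Chisq")

print(lrt_x)
print(marg)
```

When x and z are strongly collinear, the standard errors of both terms grow and these tests lose power, so a non-significant result for x does not show that x has no effect.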

Alain