[R-sig-ME] Using variance components of lmer for ICC computation in reliability study

Well no, you¹re specification is not right because your variable is not
continuous as you note. Continuous means it is a real number between
-Inf/Inf and you have boundaries between 1 and 10. So, you should not be
using a linear model assuming the outcome is continuous.

Post by Bernard Liew
Dear Community,
I am doing a reliability study, using the methods of
https://www.ncbi.nlm.nih.gov/pubmed/28505546. I have a question on the
lmer formulation and the use of the variance components.
Background: I have 20 subjects, 2 fixed raters, 2 testing sessions, and
10 trials per sessions. my dependent variable is a continuous variable
(scale 1-10). Sessions are nested within each subject-assessor
combination. I desire a ICC (3) formulation of inter-rater and
inter-session reliability from the variance components.
lmer (dv ~ rater + (1|subj) + (1|subj:session), data = df)
1. is the model formulation right? and is my interpretation of the
variance components for ICC below right?
2. inter-rater ICC = var (subj) / (var(subj) + var (residual)) # I
read that the variation of raters will be lumped with the residual
3. inter-session ICC =( var (subj) + var (residual)) /( var (subj) +
var (subj:session) + var (residual))
df = expand.grid(subj = c(1:20), rater = c(1:2), session = c(1:2), trial
= c(1:10))
df$vas = rnorm (nrow (df_sim), mean = 3, sd = 1.5)
I appreciate the kind response.
Kind regards,
Bernard
[[alternative HTML version deleted]]
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Rolf Turner

2018-06-14 22:33:44 UTC

Post by Doran, Harold
Well no, you¹re specification is not right because your variable is not
continuous as you note. Continuous means it is a real number between
-Inf/Inf and you have boundaries between 1 and 10. So, you should not be
using a linear model assuming the outcome is continuous.

I think that the foregoing is a bit misleading. For a variable to be
continuous it is not necessary for it to have a range from -infinity to
infinity.

The OP says that dv "is a continuous variable (scale 1-10)". It is not
clear to me what this means. The "obvious"/usual meaning or
interpretation would be that dv can take (only) the (positive integer)
values 1, 2, ..., 10. If this is so, then a continuous model is not
appropriate. (It should be noted however that people in the social
sciences do this sort of thing --- i.e. treat discrete variables as
continuous --- all the time.)

It is *possible* that dv can take values in the real interval [1,10], in
which case it *is* continuous, and a "continuous model" is indeed
appropriate.

The OP should clarify what the situation actually is.

cheers,

Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Doran, Harold

2018-06-14 22:58:59 UTC

That’s a helpful clarification, Rolf. However, with gaussian normal errors
in the linear model, we can’t *really* assume they would asymptote at 1 or
10. My suspicion is that these are likert-style ordered counts of some
form, although the OP should clarify. In which case, the 1 or 10 are
limits with censoring, as true values for some measured trait could exist
outside those boundaries (and I suspect the model is forming predicted
values outside of 1 or 10).

I think that the foregoing is a bit misleading. For a variable to be
continuous it is not necessary for it to have a range from -infinity to
infinity.
The OP says that dv "is a continuous variable (scale 1-10)". It is not
clear to me what this means. The "obvious"/usual meaning or
interpretation would be that dv can take (only) the (positive integer)
values 1, 2, ..., 10. If this is so, then a continuous model is not
appropriate. (It should be noted however that people in the social
sciences do this sort of thing --- i.e. treat discrete variables as
continuous --- all the time.)
It is *possible* that dv can take values in the real interval [1,10], in
which case it *is* continuous, and a "continuous model" is indeed
appropriate.
The OP should clarify what the situation actually is.
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Ben Bolker

2018-06-15 01:27:54 UTC

More generally, the best way to fit this kind of model is to use an
*ordinal* model, which assumes the responses are in increasing
sequence but does not assume the distance between levels (e.g. 1 vs 2,
2 vs 3 ...) is uniform. However, I'm not sure how one would go about
computing an ICC from ordinal data ... (the 'ordinal' package is the
place to look for the model-fitting procedures). Googling it finds
some stuff, but it seems that it doesn't necessarily apply to complex
designs ...

https://stats.stackexchange.com/questions/3539/inter-rater-reliability-for-ordinal-or-interval-data
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/

Post by Doran, Harold
That’s a helpful clarification, Rolf. However, with gaussian normal errors
in the linear model, we can’t *really* assume they would asymptote at 1 or
10. My suspicion is that these are likert-style ordered counts of some
form, although the OP should clarify. In which case, the 1 or 10 are
limits with censoring, as true values for some measured trait could exist
outside those boundaries (and I suspect the model is forming predicted
values outside of 1 or 10).

I think that the foregoing is a bit misleading. For a variable to be
continuous it is not necessary for it to have a range from -infinity to
infinity.
The OP says that dv "is a continuous variable (scale 1-10)". It is not
clear to me what this means. The "obvious"/usual meaning or
interpretation would be that dv can take (only) the (positive integer)
values 1, 2, ..., 10. If this is so, then a continuous model is not
appropriate. (It should be noted however that people in the social
sciences do this sort of thing --- i.e. treat discrete variables as
continuous --- all the time.)
It is *possible* that dv can take values in the real interval [1,10], in
which case it *is* continuous, and a "continuous model" is indeed
appropriate.
The OP should clarify what the situation actually is.
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Doran, Harold

2018-06-15 08:57:01 UTC

It seems to me you actually have censored data of three types, left, right and within the intervals. You might find it helpful to review this paper and see how models like yours can be estimated.

https://cran.r-project.org/web/packages/censReg/vignettes/censReg.pdf

From: Bernard Liew <***@bham.ac.uk<mailto:***@bham.ac.uk>>
Date: Friday, June 15, 2018 at 1:20 AM
To: Ben Bolker <***@gmail.com<mailto:***@gmail.com>>, AIR <***@air.org<mailto:***@air.org>>
Cc: Rolf Turner <***@auckland.ac.nz<mailto:***@auckland.ac.nz>>, "r-sig-mixed-***@r-project.org<mailto:r-sig-mixed-***@r-project.org>" <r-sig-mixed-***@r-project.org<mailto:r-sig-mixed-***@r-project.org>>
Subject: RE: [R-sig-ME] [FORGED] Re: Using variance components of lmer for ICC computation in reliability study

Thanks all,

My original question appears to be now two (1) the distribution of my DV (hence what models to use; (2) specification of my lmer model to parse out variance components.

Topic (1): DV distribution

Yes, my measure is a sliding rule between 1-10 of subjective pain, so any number up to a single decimal is plausible. Is a linear model automatically excluded, or can (a) do a fitted/residual plot for checking; (b) log transform the dv if (a) shows evidence of non-normality.

Going back to Rolf's point of social science, you are right. But realistically, many measures in biomechanics (which I am in), are analyzed using linear models, even though they are bounded. Example, a simple scalar height is bounded to a lower limit of zero, and an upper limit of what ever instrument is created. Joint angles are bounded physiologically too. So when are measures really -inf/inf?

Topic (2): lmer

Assuming my DV is appropriate for lmer, base on the experimental design used, I hope to receive some feedback on my fixed and random effects specification still 😊

Thanks again all, for the kind response

Bernard

-----Original Message-----
From: ***@gmail.com<mailto:***@gmail.com> <***@gmail.com<mailto:***@gmail.com>>
Sent: Friday, June 15, 2018 2:28 AM
To: Doran, Harold <***@air.org<mailto:***@air.org>>
Cc: Rolf Turner <***@auckland.ac.nz<mailto:***@auckland.ac.nz>>; r-sig-mixed-***@r-project.org<mailto:r-sig-mixed-***@r-project.org>; Bernard Liew <***@bham.ac.uk<mailto:***@bham.ac.uk>>
Subject: Re: [R-sig-ME] [FORGED] Re: Using variance components of lmer for ICC computation in reliability study

More generally, the best way to fit this kind of model is to use an

*ordinal* model, which assumes the responses are in increasing sequence but does not assume the distance between levels (e.g. 1 vs 2,

2 vs 3 ...) is uniform. However, I'm not sure how one would go about computing an ICC from ordinal data ... (the 'ordinal' package is the place to look for the model-fitting procedures). Googling it finds some stuff, but it seems that it doesn't necessarily apply to complex designs ...

https://stats.stackexchange.com/questions/3539/inter-rater-reliability-for-ordinal-or-interval-data

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/

Post by Doran, Harold
That’s a helpful clarification, Rolf. However, with gaussian normal
errors in the linear model, we can’t *really* assume they would
asymptote at 1 or 10. My suspicion is that these are likert-style
ordered counts of some form, although the OP should clarify. In which
case, the 1 or 10 are limits with censoring, as true values for some
measured trait could exist outside those boundaries (and I suspect the
model is forming predicted values outside of 1 or 10).

Post by Doran, Harold
Well no, you¹re specification is not right because your variable is
not continuous as you note. Continuous means it is a real number
between -Inf/Inf and you have boundaries between 1 and 10. So, you
should not be using a linear model assuming the outcome is continuous.

I think that the foregoing is a bit misleading. For a variable to be
continuous it is not necessary for it to have a range from -infinity
to infinity.
The OP says that dv "is a continuous variable (scale 1-10)". It is
not clear to me what this means. The "obvious"/usual meaning or
interpretation would be that dv can take (only) the (positive integer)
values 1, 2, ..., 10. If this is so, then a continuous model is not
appropriate. (It should be noted however that people in the social
sciences do this sort of thing --- i.e. treat discrete variables as
continuous --- all the time.)
It is *possible* that dv can take values in the real interval [1,10],
in which case it *is* continuous, and a "continuous model" is indeed
appropriate.
The OP should clarify what the situation actually is.
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Post by Bernard Liew
Dear Community,
I am doing a reliability study, using the methods of
https://www.ncbi.nlm.nih.gov/pubmed/28505546. I have a question on
the lmer formulation and the use of the variance components.
Background: I have 20 subjects, 2 fixed raters, 2 testing sessions,
and
10 trials per sessions. my dependent variable is a continuous
variable (scale 1-10). Sessions are nested within each
subject-assessor combination. I desire a ICC (3) formulation of
inter-rater and inter-session reliability from the variance components.
lmer (dv ~ rater + (1|subj) + (1|subj:session), data = df)
1. is the model formulation right? and is my interpretation of
the variance components for ICC below right?
2. inter-rater ICC = var (subj) / (var(subj) + var (residual)) #
I read that the variation of raters will be lumped with the residual
3. inter-session ICC =( var (subj) + var (residual)) /( var
df = expand.grid(subj = c(1:20), rater = c(1:2), session = c(1:2),
trial = c(1:10)) df$vas = rnorm (nrow (df_sim), mean = 3, sd =
1.5)
I appreciate the kind response.

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

[[alternative HTML version deleted]]

Pierre de Villemereuil

2018-06-15 15:05:04 UTC

Hi,

Post by Doran, Harold
However, I'm not sure how one would go about computing an ICC from ordinal data

I've never used the package "ordinal", but if it's anything like the "ordinal" family of MCMCglmm, then computing an ICC on the liability scale would be fairly easy, as one would just proceed as always and simply add the so-called "link variance" corresponding to the chosen link function (1 for probit, (pi^2)/3 for logit). E.g. for a given variance component Vcomp and a probit link:
ICC = Vcomp / (sum(variance components of the model) + 1)

However, computing an ICC on the data scale would be much more difficult as it is actually multivariate...

I think in the case when such scores were used, having the estimate on the liability scale make sense though, as the binning is more due to our inability of finely measuring this scale rather than an actual property of the system.

Cheers,
Pierre.

Post by Doran, Harold
More generally, the best way to fit this kind of model is to use an
*ordinal* model, which assumes the responses are in increasing
sequence but does not assume the distance between levels (e.g. 1 vs 2,
2 vs 3 ...) is uniform. However, I'm not sure how one would go about
computing an ICC from ordinal data ... (the 'ordinal' package is the
place to look for the model-fitting procedures). Googling it finds
some stuff, but it seems that it doesn't necessarily apply to complex
designs ...
https://stats.stackexchange.com/questions/3539/inter-rater-reliability-for-ordinal-or-interval-data
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/

I think that the foregoing is a bit misleading. For a variable to be
continuous it is not necessary for it to have a range from -infinity to
infinity.
The OP says that dv "is a continuous variable (scale 1-10)". It is not
clear to me what this means. The "obvious"/usual meaning or
interpretation would be that dv can take (only) the (positive integer)
values 1, 2, ..., 10. If this is so, then a continuous model is not
appropriate. (It should be noted however that people in the social
sciences do this sort of thing --- i.e. treat discrete variables as
continuous --- all the time.)
It is *possible* that dv can take values in the real interval [1,10], in
which case it *is* continuous, and a "continuous model" is indeed
appropriate.
The OP should clarify what the situation actually is.
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Doran, Harold

2018-06-15 15:31:55 UTC

This seems to me like the tail is wagging the dog. I don't think the question should be framed around, "what can I do to get an ICC?" The question should be around what distributional family is appropriate given the observed data?

If an ICC cannot be made available after the appropriate link function is chosen, then I'm not sure that's really of any consequence. The point of a random effects model is *not* to yield an ICC. But, to yield appropriate estimates for the fixed effects and marginal variances.

If these data are treated as ordinal, that's a LOT of threshold parameters to estimate and might be quite expensive to compute.

-----Original Message-----
From: Pierre de Villemereuil [mailto:***@mailoo.org]
Sent: Friday, June 15, 2018 11:05 AM
To: r-sig-mixed-***@r-project.org
Cc: Ben Bolker <***@gmail.com>; Doran, Harold <***@air.org>; Bernard Liew <***@bham.ac.uk>
Subject: Re: [R-sig-ME] [FORGED] Re: Using variance components of lmer for ICC computation in reliability study

Hi,

Post by Doran, Harold
However, I'm not sure how one would go about computing an ICC from ordinal data

Post by Doran, Harold
More generally, the best way to fit this kind of model is to use an
*ordinal* model, which assumes the responses are in increasing
sequence but does not assume the distance between levels (e.g. 1 vs 2,
2 vs 3 ...) is uniform. However, I'm not sure how one would go about
computing an ICC from ordinal data ... (the 'ordinal' package is the
place to look for the model-fitting procedures). Googling it finds
some stuff, but it seems that it doesn't necessarily apply to complex
designs ...
https://stats.stackexchange.com/questions/3539/inter-rater-reliability
-for-ordinal-or-interval-data
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/

Post by Doran, Harold
That’s a helpful clarification, Rolf. However, with gaussian normal
errors in the linear model, we can’t *really* assume they would
asymptote at 1 or 10. My suspicion is that these are likert-style
ordered counts of some form, although the OP should clarify. In
which case, the 1 or 10 are limits with censoring, as true values
for some measured trait could exist outside those boundaries (and I
suspect the model is forming predicted values outside of 1 or 10).

Post by Doran, Harold
Well no, you¹re specification is not right because your variable
is not continuous as you note. Continuous means it is a real
number between -Inf/Inf and you have boundaries between 1 and 10.
So, you should not be using a linear model assuming the outcome is continuous.

I think that the foregoing is a bit misleading. For a variable to
be continuous it is not necessary for it to have a range from
-infinity to infinity.
The OP says that dv "is a continuous variable (scale 1-10)". It is
not clear to me what this means. The "obvious"/usual meaning or
interpretation would be that dv can take (only) the (positive
integer) values 1, 2, ..., 10. If this is so, then a continuous
model is not appropriate. (It should be noted however that people
in the social sciences do this sort of thing --- i.e. treat discrete
variables as continuous --- all the time.)
It is *possible* that dv can take values in the real interval
[1,10], in which case it *is* continuous, and a "continuous model"
is indeed appropriate.
The OP should clarify what the situation actually is.
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Post by Bernard Liew
Dear Community,
I am doing a reliability study, using the methods of
https://www.ncbi.nlm.nih.gov/pubmed/28505546. I have a question
on the lmer formulation and the use of the variance components.
Background: I have 20 subjects, 2 fixed raters, 2 testing
sessions, and
10 trials per sessions. my dependent variable is a continuous
variable (scale 1-10). Sessions are nested within each
subject-assessor combination. I desire a ICC (3) formulation of
inter-rater and inter-session reliability from the variance components.
lmer (dv ~ rater + (1|subj) + (1|subj:session), data = df)
1. is the model formulation right? and is my interpretation of
the variance components for ICC below right?
2. inter-rater ICC = var (subj) / (var(subj) + var (residual))
# I read that the variation of raters will be lumped with the residual
3. inter-session ICC =( var (subj) + var (residual)) /( var
(subj) + var (subj:session) + var (residual)) some simulated
df = expand.grid(subj = c(1:20), rater = c(1:2), session =
c(1:2), trial = c(1:10)) df$vas = rnorm (nrow (df_sim), mean =
3, sd = 1.5)
I appreciate the kind response.

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Ben Bolker

2018-06-16 17:08:57 UTC

These extra terms are (I think) the 'residual' variances corresponding
to different GLM families, which are fixed rather than estimated. You
might find Nakagawa and Schielzeth "Repeatability for Gaussian and
non-Gaussian data" (Biological Reviews 2010) useful ...

Thanks Pierre for the suggestion to use MCMCglmm. Very useful.
1) Can I ask why when taking the ratio of the variance components, the denominator is ICC = Vcomp / (sum(variance components of the model) + 1)? Why is the one added?
2) I have tried mixed ordinal modelling using "ordinal" or MCMCglmm, and noticed the variance of the residuals of the model is not produced. I have also read that the residual is assumed to be as you mentioned (pi^2)/3. Is there a reason (maybe a not so technical one?)
Kind regards,
Bernard
-----Original Message-----
Sent: Friday, June 15, 2018 4:05 PM
Subject: Re: [R-sig-ME] [FORGED] Re: Using variance components of lmer for ICC computation in reliability study
Hi,

Post by Doran, Harold
However, I'm not sure how one would go about computing an ICC from ordinal data

ICC = Vcomp / (sum(variance components of the model) + 1)
However, computing an ICC on the data scale would be much more difficult as it is actually multivariate...
I think in the case when such scores were used, having the estimate on the liability scale make sense though, as the binning is more due to our inability of finely measuring this scale rather than an actual property of the system.
Cheers,
Pierre.

Post by Doran, Harold
More generally, the best way to fit this kind of model is to use an
*ordinal* model, which assumes the responses are in increasing
sequence but does not assume the distance between levels (e.g. 1 vs 2,
2 vs 3 ...) is uniform. However, I'm not sure how one would go about
computing an ICC from ordinal data ... (the 'ordinal' package is the
place to look for the model-fitting procedures). Googling it finds
some stuff, but it seems that it doesn't necessarily apply to complex
designs ...
https://stats.stackexchange.com/questions/3539/inter-rater-reliability
-for-ordinal-or-interval-data
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/

Post by Doran, Harold
That’s a helpful clarification, Rolf. However, with gaussian normal
errors in the linear model, we can’t *really* assume they would
asymptote at 1 or 10. My suspicion is that these are likert-style
ordered counts of some form, although the OP should clarify. In
which case, the 1 or 10 are limits with censoring, as true values
for some measured trait could exist outside those boundaries (and I
suspect the model is forming predicted values outside of 1 or 10).

Post by Doran, Harold
Well no, you¹re specification is not right because your variable
is not continuous as you note. Continuous means it is a real
number between -Inf/Inf and you have boundaries between 1 and 10.
So, you should not be using a linear model assuming the outcome is continuous.

I think that the foregoing is a bit misleading. For a variable to
be continuous it is not necessary for it to have a range from
-infinity to infinity.
The OP says that dv "is a continuous variable (scale 1-10)". It is
not clear to me what this means. The "obvious"/usual meaning or
interpretation would be that dv can take (only) the (positive
integer) values 1, 2, ..., 10. If this is so, then a continuous
model is not appropriate. (It should be noted however that people
in the social sciences do this sort of thing --- i.e. treat discrete
variables as continuous --- all the time.)
It is *possible* that dv can take values in the real interval
[1,10], in which case it *is* continuous, and a "continuous model"
is indeed appropriate.
The OP should clarify what the situation actually is.
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Post by Bernard Liew
Dear Community,
I am doing a reliability study, using the methods of
https://www.ncbi.nlm.nih.gov/pubmed/28505546. I have a question
on the lmer formulation and the use of the variance components.
Background: I have 20 subjects, 2 fixed raters, 2 testing
sessions, and
10 trials per sessions. my dependent variable is a continuous
variable (scale 1-10). Sessions are nested within each
subject-assessor combination. I desire a ICC (3) formulation of
inter-rater and inter-session reliability from the variance components.
lmer (dv ~ rater + (1|subj) + (1|subj:session), data = df)
1. is the model formulation right? and is my interpretation of
the variance components for ICC below right?
2. inter-rater ICC = var (subj) / (var(subj) + var (residual))
# I read that the variation of raters will be lumped with the residual
3. inter-session ICC =( var (subj) + var (residual)) /( var
(subj) + var (subj:session) + var (residual)) some simulated
df = expand.grid(subj = c(1:20), rater = c(1:2), session =
c(1:2), trial = c(1:10)) df$vas = rnorm (nrow (df_sim), mean =
3, sd = 1.5)
I appreciate the kind response.

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Pierre de Villemereuil

2018-06-16 18:28:17 UTC

Dear Bernard,

Thanks Pierre for the suggestion to use MCMCglmm. Very useful.

Glad this was helpful.

1) Can I ask why when taking the ratio of the variance components, the denominator is ICC = Vcomp / (sum(variance components of the model) + 1)? Why is the one added?

You can find an explanation for this in the Supplementary File on our article on quantitative genetics interpretation of GLMMs:
http://www.genetics.org/content/204/3/1281.supplemental

It is fairly technical. Trying to put it simply: there is an equivalence between a binomial/probit model and the threshold model. This is because a probit link is actually the CDF of a normal distribution (same goes for a logit and a logistic distribution): using a probit link and a binomial is equivalent to using a threshold model after adding a random noise normally distributed (this is Fig. S1 in our Suppl. File above).

And the variance of this random noise is thus the variance of a standard normal distribution which is 1 (or the variance of a standard logistic distribution, which is (pi^2)/3, for a logit link).

Computing the ICC on the liability scale "as if" a threshold model was used thus only requires to add that extra-variance on the denominator.

2) I have tried mixed ordinal modelling using "ordinal" or MCMCglmm, and noticed the variance of the residuals of the model is not produced. I have also read that the residual is assumed to be as you mentioned (pi^2)/3. Is there a reason (maybe a not so technical one?)

It all depends what you call "residual". In MCMCglmm, there is for example a "residual" variance called "units" which value should be fixed to e.g. 1 when running families such as ordinal. This variance is required to be added in the denominator of the ICC. If you have only a random effect, you would have:
Vrand / (Vrand + Vunits + 1)

Note that there is also the "threshold" family in MCMCglmm, which is special in the sense that (to say it quickly) the "units" variance is the variance of the probit link and should be fixed to 1. In that case, you would have:
Vrand / (Vrand + Vunits)

Sometimes this "extra-variance" (1 or (pi^2)/3) is indeed called "residual variance". This is because, as per my explanation above, probit and logit links can be seen as "adding a random noise then taking a threshold", this random noise is sometimes seen as a "residual error" in the model.

I hope this will clarify things for you. I'm afraid it's difficult to explain clearly the whys without going too much into the technicality of the models.

Cheers,
Pierre.

Kind regards,
Bernard
-----Original Message-----
Sent: Friday, June 15, 2018 4:05 PM
Subject: Re: [R-sig-ME] [FORGED] Re: Using variance components of lmer for ICC computation in reliability study
Hi,

Post by Doran, Harold
However, I'm not sure how one would go about computing an ICC from ordinal data

Post by Doran, Harold
More generally, the best way to fit this kind of model is to use an
*ordinal* model, which assumes the responses are in increasing
sequence but does not assume the distance between levels (e.g. 1 vs 2,
2 vs 3 ...) is uniform. However, I'm not sure how one would go about
computing an ICC from ordinal data ... (the 'ordinal' package is the
place to look for the model-fitting procedures). Googling it finds
some stuff, but it seems that it doesn't necessarily apply to complex
designs ...
https://stats.stackexchange.com/questions/3539/inter-rater-reliability
-for-ordinal-or-interval-data
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/

Post by Doran, Harold
That’s a helpful clarification, Rolf. However, with gaussian normal
errors in the linear model, we can’t *really* assume they would
asymptote at 1 or 10. My suspicion is that these are likert-style
ordered counts of some form, although the OP should clarify. In
which case, the 1 or 10 are limits with censoring, as true values
for some measured trait could exist outside those boundaries (and I
suspect the model is forming predicted values outside of 1 or 10).

Post by Doran, Harold
Well no, you¹re specification is not right because your variable
is not continuous as you note. Continuous means it is a real
number between -Inf/Inf and you have boundaries between 1 and 10.
So, you should not be using a linear model assuming the outcome is continuous.

I think that the foregoing is a bit misleading. For a variable to
be continuous it is not necessary for it to have a range from
-infinity to infinity.
The OP says that dv "is a continuous variable (scale 1-10)". It is
not clear to me what this means. The "obvious"/usual meaning or
interpretation would be that dv can take (only) the (positive
integer) values 1, 2, ..., 10. If this is so, then a continuous
model is not appropriate. (It should be noted however that people
in the social sciences do this sort of thing --- i.e. treat discrete
variables as continuous --- all the time.)
It is *possible* that dv can take values in the real interval
[1,10], in which case it *is* continuous, and a "continuous model"
is indeed appropriate.
The OP should clarify what the situation actually is.
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Post by Bernard Liew
Dear Community,
I am doing a reliability study, using the methods of
https://www.ncbi.nlm.nih.gov/pubmed/28505546. I have a question
on the lmer formulation and the use of the variance components.
Background: I have 20 subjects, 2 fixed raters, 2 testing sessions, and
10 trials per sessions. my dependent variable is a continuous
variable (scale 1-10). Sessions are nested within each
subject-assessor combination. I desire a ICC (3) formulation of
inter-rater and inter-session reliability from the variance components.
lmer (dv ~ rater + (1|subj) + (1|subj:session), data = df)
1. is the model formulation right? and is my interpretation of
the variance components for ICC below right?
2. inter-rater ICC = var (subj) / (var(subj) + var (residual))
# I read that the variation of raters will be lumped with the residual
3. inter-session ICC =( var (subj) + var (residual)) /( var
(subj) + var (subj:session) + var (residual)) some simulated
df = expand.grid(subj = c(1:20), rater = c(1:2), session =
c(1:2), trial = c(1:10)) df$vas = rnorm (nrow (df_sim), mean =
3, sd = 1.5)
I appreciate the kind response.

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Pierre de Villemereuil

2018-06-18 07:45:57 UTC

Hi,

When I refer to residuals, as I understand from lmer modelling, it is the level 1 variance. Back to my design, I have many subjects, two fixed raters, sessions nested within a subject, and trials nested within sessions. Hence, level 1 variance would typically reflect inter-trial variance.

Thing is: there is nothing like this for GLMMs. The lowest level of "residual variance" is basically the distribution variance.

You could think there would be such a thing with a threshold model, but it turns out that the total variance of the liability scale is non identifiable, so nothing there either.

If I were to use lmer modelling, I do not have to specify the level 1 random effects (i.e. ~1| SUBJ:SESSION:TRIAL), as it is in the residuals.

Indeed.

However, using mcmcglmm, there appears to be an even lower level of random effects which is fixed (as you mentioned). So my question is, should I specify also the level 1 random effects?

No, you shouldn't. There is already the "residual" variance in MCMCglmm (the R part of the variance) which should account for that, but that you need to fix to 1 (because of the non identifiability issue I'm mentioning above). If you had such kind of random effects, it means you re-model such a non identifiable variance.

mcglmm_mod = MCMCglmm(vas ~ RATER,
random = ~ SUBJ +
SUBJ:SESSION +
SUBJ:SESSION:TRIAL,
data = as.data.frame (df %>% filter (SIDE == "R")),
family = "ordinal",
prior=list(R=list(V=1, fix=1), G=list(G1=list(V=1, nu=0),
G2=list(V=1, nu=0),
G3=list(V=1, nu=0))))
Many thanks again for your (and everyone's) help thus far.

As I'm saying above, I suggest you remove the SUBJ:SESSION:TRIAL effect, if it corresponds, as you state to a "level-1 variance".

Another thing is that I don't think it is wise to use nu = 0 for such a model prior. I'd recommend using the parameter expansion of MCMCglmm's prior and especially a chi-square like prior that has been shown to perform quite well for such models:
G1 = list(V = 1, nu = 1000, alpha.mu = 0, alpha.V = 1)

And a last thing is that the "threshold" family is supposed to have better mixing and be faster than the "ordinal" family according to Jarrod Hadfield. You might want to give it a try (and beware that the + 1 should be removed from the ICC denominator when using "threshold", as per my previous email).

Best,
Pierre.

Kind regards,
Bernard
-----Original Message-----
Sent: Saturday, June 16, 2018 7:28 PM
Subject: Re: [R-sig-ME] [FORGED] Re: Using variance components of lmer for ICC computation in reliability study
Dear Bernard,

Thanks Pierre for the suggestion to use MCMCglmm. Very useful.

Glad this was helpful.

1) Can I ask why when taking the ratio of the variance components, the denominator is ICC = Vcomp / (sum(variance components of the model) + 1)? Why is the one added?

http://www.genetics.org/content/204/3/1281.supplemental
It is fairly technical. Trying to put it simply: there is an equivalence between a binomial/probit model and the threshold model. This is because a probit link is actually the CDF of a normal distribution (same goes for a logit and a logistic distribution): using a probit link and a binomial is equivalent to using a threshold model after adding a random noise normally distributed (this is Fig. S1 in our Suppl. File above).
And the variance of this random noise is thus the variance of a standard normal distribution which is 1 (or the variance of a standard logistic distribution, which is (pi^2)/3, for a logit link).
Computing the ICC on the liability scale "as if" a threshold model was used thus only requires to add that extra-variance on the denominator.

2) I have tried mixed ordinal modelling using "ordinal" or MCMCglmm,
and noticed the variance of the residuals of the model is not
produced. I have also read that the residual is assumed to be as you
mentioned (pi^2)/3. Is there a reason (maybe a not so technical one?)

Vrand / (Vrand + Vunits + 1)
Vrand / (Vrand + Vunits)
Sometimes this "extra-variance" (1 or (pi^2)/3) is indeed called "residual variance". This is because, as per my explanation above, probit and logit links can be seen as "adding a random noise then taking a threshold", this random noise is sometimes seen as a "residual error" in the model.
I hope this will clarify things for you. I'm afraid it's difficult to explain clearly the whys without going too much into the technicality of the models.
Cheers,
Pierre.

Post by Doran, Harold
However, I'm not sure how one would go about computing an ICC from ordinal data

Post by Doran, Harold
More generally, the best way to fit this kind of model is to use an
*ordinal* model, which assumes the responses are in increasing
sequence but does not assume the distance between levels (e.g. 1 vs 2,
2 vs 3 ...) is uniform. However, I'm not sure how one would go
about computing an ICC from ordinal data ... (the 'ordinal' package
is the place to look for the model-fitting procedures). Googling it
finds some stuff, but it seems that it doesn't necessarily apply to
complex designs ...
https://stats.stackexchange.com/questions/3539/inter-rater-reliabili
ty
-for-ordinal-or-interval-data
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3402032/

Post by Doran, Harold
That’s a helpful clarification, Rolf. However, with gaussian
normal errors in the linear model, we can’t *really* assume they
would asymptote at 1 or 10. My suspicion is that these are
likert-style ordered counts of some form, although the OP should
clarify. In which case, the 1 or 10 are limits with censoring, as
true values for some measured trait could exist outside those
boundaries (and I suspect the model is forming predicted values outside of 1 or 10).

Post by Doran, Harold
Well no, you¹re specification is not right because your variable
is not continuous as you note. Continuous means it is a real
number between -Inf/Inf and you have boundaries between 1 and 10.
So, you should not be using a linear model assuming the outcome is continuous.

I think that the foregoing is a bit misleading. For a variable to
be continuous it is not necessary for it to have a range from
-infinity to infinity.
The OP says that dv "is a continuous variable (scale 1-10)". It
is not clear to me what this means. The "obvious"/usual meaning
or interpretation would be that dv can take (only) the (positive
integer) values 1, 2, ..., 10. If this is so, then a continuous
model is not appropriate. (It should be noted however that people
in the social sciences do this sort of thing --- i.e. treat
discrete variables as continuous --- all the time.)
It is *possible* that dv can take values in the real interval
[1,10], in which case it *is* continuous, and a "continuous model"
is indeed appropriate.
The OP should clarify what the situation actually is.
cheers,
Rolf Turner
--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

Post by Bernard Liew
Dear Community,
I am doing a reliability study, using the methods of
https://www.ncbi.nlm.nih.gov/pubmed/28505546. I have a question
on the lmer formulation and the use of the variance components.
Background: I have 20 subjects, 2 fixed raters, 2 testing sessions, and
10 trials per sessions. my dependent variable is a continuous
variable (scale 1-10). Sessions are nested within each
subject-assessor combination. I desire a ICC (3) formulation of
inter-rater and inter-session reliability from the variance components.
lmer (dv ~ rater + (1|subj) + (1|subj:session), data = df)
1. is the model formulation right? and is my interpretation
of the variance components for ICC below right?
2. inter-rater ICC = var (subj) / (var(subj) + var
(residual)) # I read that the variation of raters will be lumped with the residual
3. inter-session ICC =( var (subj) + var (residual)) /( var
(subj) + var (subj:session) + var (residual)) some simulated
df = expand.grid(subj = c(1:20), rater = c(1:2), session =
c(1:2), trial = c(1:10)) df$vas = rnorm (nrow (df_sim), mean =
3, sd = 1.5)
I appreciate the kind response.

_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

David Duffy

2018-06-18 09:36:21 UTC

Post by Pierre de Villemereuil
Thing is: there is nothing like this for GLMMs. The lowest level of "residual variance" is basically the
distribution variance.
You could think there would be such a thing with a threshold model, but it turns out that the total variance of
the liability scale is non identifiable, so nothing there either.

There is some information when there are multiple thresholds.

Pierre de Villemereuil

2018-06-18 10:11:06 UTC

Hi David,

Post by David Duffy

There is some information when there are multiple thresholds.

You mean information to measure e.g. additive over-dispersion?

I agree but wouldn't you need quite a lot of thresholds (categories) for this to be measurable and not poses numerical issues? I have no practical experience in trying to account for that, so I'm curious if you have any experience in this.

Best,
Pierre.

Jarrod Hadfield

2018-06-18 10:22:38 UTC

Hi,

I think the residual variance is still non-identifiable with multiple
thresholds. In fact, this paper:

https://gsejournal.biomedcentral.com/track/pdf/10.1186/1297-9686-27-3-229

uses the non-identifiability in a 3 category problem to fix the
thresholds but estimate the residual variance because this is the same
as fixing the residual variance and estimating the free threshold.

Cheers,

Jarrod

Post by Pierre de Villemereuil
Hi David,

Post by David Duffy

There is some information when there are multiple thresholds.

You mean information to measure e.g. additive over-dispersion?
I agree but wouldn't you need quite a lot of thresholds (categories) for this to be measurable and not poses numerical issues? I have no practical experience in trying to account for that, so I'm curious if you have any experience in this.
Best,
Pierre.
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

David Duffy

2018-06-18 23:15:36 UTC

The situations I have most experience with is where there are fixed effects/multiple groups and the thresholds vary across groups - eg "spreading" of the thresholds in one group compared to the others may be interpretable as variance difference etc. In multidimensional setups, one tests the goodness of fit of the single threshold model by
fitting one-factor models to triads of variables at a time eg

Muthen B, Hofacker C (1988): Testing the assumptions underlying tetrachoric correlations. Psychometrika 53:563-578.

Maybe there is information about the residual variances in the 2-threshold model in such a setup?

Cheers, David.
________________________________________
From: R-sig-mixed-models [r-sig-mixed-models-***@r-project.org] on behalf of Jarrod Hadfield [***@ed.ac.uk]
Sent: Monday, 18 June 2018 8:22 PM
To: Pierre de Villemereuil; r-sig-mixed-***@r-project.org
Subject: Re: [R-sig-ME] Using variance components of lmer for ICC computation in reliability study

Hi,

I think the residual variance is still non-identifiable with multiple
thresholds. In fact, this paper:

https://gsejournal.biomedcentral.com/track/pdf/10.1186/1297-9686-27-3-229

uses the non-identifiability in a 3 category problem to fix the
thresholds but estimate the residual variance because this is the same
as fixing the residual variance and estimating the free threshold.

Cheers,

Jarrod

Post by Pierre de Villemereuil
Hi David,

Post by David Duffy

There is some information when there are multiple thresholds.

You mean information to measure e.g. additive over-dispersion?
I agree but wouldn't you need quite a lot of thresholds (categories) for this to be measurable and not poses numerical issues? I have no practical experience in trying to account for that, so I'm curious if you have any experience in this.
Best,
Pierre.
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
R-sig-mixed-***@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Pierre de Villemereuil

2018-06-19 07:38:04 UTC

Hi,

Post by David Duffy
Maybe there is information about the residual variances in the 2-threshold model in such a setup?

Maybe, but it would be quite hard (if possible at all) to distinguish from changes in the mean, wouldn't it? Unless the changes in the "spreading" are quite dramatic and located in the extreme categories with a strong "depletion" from the central one.

It's interesting to think about it though.

Cheers,
Pierre.

Jarrod Hadfield

2018-06-19 08:04:22 UTC

Hi,

No - the residual variance is non-identifiable in a threshold model
irrespective of the number of thresholds unless the thresholds are
constrained in some way (e.g. fully constrained as in the paper I
previously referenced). Strong depletion from the central category would
simply mean the two thresholds are close together.

Cheers,

Jarrod

Post by Pierre de Villemereuil
Hi,

Post by David Duffy
Maybe there is information about the residual variances in the 2-threshold model in such a setup?

Maybe, but it would be quite hard (if possible at all) to distinguish from changes in the mean, wouldn't it? Unless the changes in the "spreading" are quite dramatic and located in the extreme categories with a strong "depletion" from the central one.
It's interesting to think about it though.
Cheers,
Pierre.
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

Pierre de Villemereuil

2018-06-19 08:11:02 UTC