Discussion:
[R-sig-ME] ordinal mixed model - which one to use?
Diana Michl
2018-05-29 18:30:47 UTC
Dear List,

I'm fitting ordinal mixed models with package {ordinal}. I have a clmm
with 1 predictor (fixed effect, factor with 2 levels "woe" and "meta"),
2 random effects, and an ordinal outcome, ratings from 1-4. Items=82,
n=26. My question: Do I use

link="logit" or link="cloglog"? Or something else altogether?

As far as I know, cloglog tends to be used when higher outcomes are more
likely, but the choice also depends on the model fit. I thought cloglog
made sense here because I have 53 "woe" items and 29 "meta" items, and
"woe" items are conceptually more likely to be rated 3 or 4 (the higher
categories). If this is incorrect, please correct me.
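
As a small illustration of what distinguishes the two links, the sketch below compares their inverse functions in base R. It only shows a general property (the logit is symmetric, the cloglog is not, so under cloglog the direction in which the rating scale is coded matters); it does not use the data from this thread, and the helper names inv_logit and inv_cloglog are made up for the example.

```r
# Inverse link functions: map a linear predictor eta to a cumulative probability.
inv_logit   <- function(eta) plogis(eta)          # symmetric around eta = 0
inv_cloglog <- function(eta) 1 - exp(-exp(eta))   # asymmetric

eta <- c(-3, -1, 0, 1, 3)
links <- data.frame(
  eta     = eta,
  logit   = round(inv_logit(eta), 3),
  cloglog = round(inv_cloglog(eta), 3)
)
print(links)

# Symmetry check: for a symmetric link, F(-eta) equals 1 - F(eta).
sym_logit   <- isTRUE(all.equal(inv_logit(-eta),   1 - inv_logit(eta)))
sym_cloglog <- isTRUE(all.equal(inv_cloglog(-eta), 1 - inv_cloglog(eta)))
print(c(logit = sym_logit, cloglog = sym_cloglog))
```

Because of this asymmetry, a cloglog fit and a fit with the rating scale reversed are generally different models, whereas reversing the scale under the logit link only flips the sign of the coefficients.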

In my logit model, I get a ridiculously huge odds ratio - but a much
better fit.
In my cloglog model, the odds ratio is still worryingly large, though less
than a tenth of that, while the fit is much worse. I post the outputs below.
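
For reference, this is a minimal sketch of how such ratios follow from the coefficients, using the typwoe estimates and the standard error printed in the two summaries in this message. Two caveats, stated as assumptions: exp() of a cloglog coefficient is not an odds ratio (it is usually read as a hazard-ratio-like quantity), and exp() of the printed estimates gives roughly 446 and 42.5, not exactly the posted 429.57 and 40.69, which were presumably computed from slightly different fits.

```r
# Estimates for typwoe copied from the summaries in this message.
beta_logit   <- 6.0994   # summary(m), logit link
se_logit     <- 0.2846
beta_cloglog <- 3.7495   # summary(mcloglog), cloglog link

# Under the logit link, exp(beta) is a cumulative odds ratio.
or_logit <- exp(beta_logit)

# Under the cloglog link, exp(beta) is NOT an odds ratio.
exp_cloglog <- exp(beta_cloglog)

# Approximate 95% Wald interval for the logit model on the odds-ratio scale.
ci_logit <- exp(beta_logit + c(-1, 1) * qnorm(0.975) * se_logit)

print(round(c(OR = or_logit, lower = ci_logit[1], upper = ci_logit[2]), 1))
print(round(exp_cloglog, 1))
```

Since the two quantities live on different scales, comparing them directly says little about which link is preferable.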

A few remarks: Overall, I don't understand the huge OR. I have an
extremely similar dataset (items=80, n=28) where the OR with the logit
model is just 4.7 and the cloglog OR is only 2.73. So that seems fine.
The difference between dataset 2 and the problematic one is the means:
Their difference is much bigger in the problematic dataset:

#mean of typ meta = 1.27

#mean of typ woe = 3.42

as opposed to dataset 2:

#mean of typ meta = 2.35

#mean of typ woe = 3.02

summary(m)
Cumulative Link Mixed Model fitted with the Laplace approximation

formula: rat ~ typ + (1 | itemid) + (1 | Vp)
data:    nwmeta

 link  threshold   nobs logLik   AIC     niter     max.grad cond.H
 logit equidistant 2132 -1682.63 3375.25 215(1094) 2.68e-04 3.6e+01

Random effects:
 Groups Name        Variance Std.Dev.
 itemid (Intercept) 0.8829   0.9396
 Vp     (Intercept) 0.7831   0.8849
Number of groups:  itemid 82,  Vp 26

Coefficients:
       Estimate Std. Error z value Pr(>|z|)
typwoe   6.0994     0.2846   21.43   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
            Estimate Std. Error z value
threshold.1  1.73903    0.26937   6.456
spacing      1.96709    0.07206  27.299

OR(typwoe) = 429.57

summary(mcloglog)
Cumulative Link Mixed Model fitted with the Laplace approximation

formula: rat ~ typ + (1 | itemid) + (1 | Vp)
data:    nwmeta

 link    threshold nobs logLik   AIC     niter     max.grad cond.H
 cloglog flexible  2132 -1735.62 3483.24 352(2061) 1.48e-05 7.1e+01

Random effects:
 Groups Name        Variance Std.Dev.
 itemid (Intercept) 0.3774   0.6143
 Vp     (Intercept) 0.3413   0.5842
Number of groups:  itemid 82,  Vp 26

Coefficients:
       Estimate Std. Error z value Pr(>|z|)
typwoe   3.7495     0.1763   21.27   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
1|2   0.4984     0.1704   2.926
2|3   1.6293     0.1780   9.153
3|4   3.0036     0.1864  16.113

OR(typwoe) = 40.69

comparison:

         formula:                            link:   threshold:
mcloglog rat ~ typ + (1 | itemid) + (1 | Vp) cloglog flexible
m        rat ~ typ + (1 | itemid) + (1 | Vp) logit   flexible

         no.par    AIC  logLik LR.stat df Pr(>Chisq)
mcloglog      6 3483.2 -1735.6
m             6 3376.6 -1682.3  106.67  0
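
To see what a coefficient of about 6 means on the rating scale, the following sketch turns the logit summary above into category probabilities for a typical item and participant (both random effects set to zero). It assumes the usual clmm parameterization, logit(P(Y <= j)) = theta_j - beta * x, with equidistant thresholds built as threshold.1 + (j - 1) * spacing; the helper cat_probs is made up for the example. The resulting means are conditional on zero random effects, so they need not match the raw group means quoted earlier.

```r
# Estimates copied from summary(m) above (logit link, equidistant thresholds).
theta1  <- 1.73903
spacing <- 1.96709
beta    <- 6.0994

# Equidistant thresholds for the cut-points 1|2, 2|3, 3|4.
theta <- theta1 + spacing * 0:2

# Category probabilities for a given linear predictor eta,
# with both random effects fixed at zero.
cat_probs <- function(eta) {
  cum <- c(plogis(theta - eta), 1)   # P(Y <= j) for j = 1..4
  diff(c(0, cum))                    # P(Y = j)
}

probs <- rbind(meta = cat_probs(0), woe = cat_probs(beta))
colnames(probs) <- 1:4
print(round(probs, 3))

# Implied mean rating per type, treating the categories as the numbers 1 to 4.
implied_means <- drop(probs %*% 1:4)
print(round(implied_means, 2))
```

Under these assumptions nearly all of the probability mass for "meta" sits on rating 1 and most of the mass for "woe" sits on ratings 3 and 4, which is the kind of pattern that produces a very large odds ratio.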


My sd seems fine at 1.26. Checking for outliers and several model
assumptions isn't possible for a clmm.

Thanks very much in advance for any input
--
Diana Michl


[[alternative HTML version deleted]]
Thierry Onkelinx
2018-05-30 14:52:33 UTC
Dear Diana,

Posting in HTML makes the R output very hard to read.

The first thing I do when confronted with such large coefficients is to
check for quasi-complete separation.
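
A simple way to run such a check is to cross-tabulate the predictor against the ratings and look for empty or nearly empty cells. The sketch below uses a small made-up data frame (toy), not the poster's data; with the real data the equivalent call would presumably be table(nwmeta$typ, nwmeta$rat).

```r
# Hypothetical toy data, NOT the data from this thread:
# 'typ' is the item type, 'rat' the rating from 1 to 4.
toy <- data.frame(
  typ = factor(rep(c("meta", "woe"), times = c(10, 10))),
  rat = factor(c(1, 1, 1, 1, 1, 1, 1, 2, 2, 1,
                 3, 4, 4, 3, 4, 4, 3, 4, 4, 3),
               levels = 1:4, ordered = TRUE)
)

# Cross-tabulation of type by rating.
counts <- table(toy$typ, toy$rat)
print(counts)

# Number of empty cells in the table.
empty_cells <- sum(counts == 0)

# Complete separation in this two-group case: every rating in one group
# lies below every rating in the other group.
max_meta <- max(as.integer(toy$rat[toy$typ == "meta"]))
min_woe  <- min(as.integer(toy$rat[toy$typ == "woe"]))
complete_sep <- max_meta < min_woe

print(c(empty_cells = empty_cells, complete_separation = complete_sep))
```

Quasi-complete separation is the milder case where the groups overlap in only one or two sparsely populated cells, so small but non-zero counts deserve the same attention as zeros.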

Best regards,

Thierry

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
***@inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no
more than asking him to perform a post-mortem examination: he may be
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does
not ensure that a reasonable answer can be extracted from a given body
of data. ~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////
Diana Michl
2018-05-30 15:29:50 UTC
Dear List, dear Thierry,

thank you for pointing out that my formatting got messed up, and for still
fighting your way through! I'm resending my email below. Complete
separation: Well, not quite, but I do have a few cells with few cases:



Conceptually, this is wanted and makes perfect sense. If this is the
reason, I'm not sure what to do. It still seems strange to me that
having pretty straightforward cases, and equally straightforward results,
should make modelling so difficult or impossible...

Thank you and kind regards
-------- Forwarded Message --------
Subject: ordinal mixed model - which one to use?
Date: Tue, 29 May 2018 20:30:47 +0200
From: Diana Michl <***@uni-potsdam.de>
To: r-sig-mixed-***@r-project.org <r-sig-mixed-***@r-project.org>



Dear List,

I'm fitting ordinal mixed models with package {ordinal}. I have a clmm
with 1 predictor (fixed effect, factor with 2 levels "woe" and "meta"),
2 random effects, and an ordinal outcome, ratings from 1-4. Items=82,
n=26. My question: Do I use

link="logit" or link="cloglog"? Or something else altogether?

As far as I know, cloglog tends to be used when higher outcomes are more
likely, but the choice also depends on the model fit. I thought cloglog
made sense here because I have 53 "woe" items and 29 "meta" items, and
"woe" items are conceptually more likely to be rated 3 or 4 (the higher
categories). If this is incorrect, please correct me.

In my logit model, I get a ridiculously huge odds ratio - but a much
better fit.
In my cloglog model, the odds ratio is still worryingly large, though less
than a tenth of that, while the fit is much worse. I post the outputs below.

A few remarks: Overall, I don't understand the huge OR. I have an
extremely similar dataset (items=80, n=28) where the OR with the logit
model is just 4.7 and the cloglog OR is only 2.73. So that seems fine.
The difference between dataset 2 and the problematic one is the means:
Their difference is much bigger in the problematic dataset:

#mean of typ meta = 1.27

#mean of typ woe = 3.42

as opposed to dataset 2:

#mean of typ meta = 2.35

#mean of typ woe = 3.02

cloglog model:


comparison:


My sd seems fine at 1.26. Checking for outliers and several model
assumptions isn't possible for a clmm.

Thanks very much in advance for any input
--
Diana Michl
Paul Johnson
2018-10-10 15:21:50 UTC
Hey, everybody

On Diana's link question: would a comparison of the AIC or BIC be
informative for the choice of link? We've been exploring that in our group,
and the majority says yes.
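
As a rough sketch of what such a comparison looks like for the two fits in Diana's comparison table, the code below recomputes AIC and BIC by hand from the posted log-likelihoods and parameter counts. Using the 2132 observations as the sample size for BIC is an assumption; the appropriate effective sample size for a mixed model is debatable.

```r
# Values copied from the comparison table in the original post.
fits <- data.frame(
  model  = c("mcloglog", "m"),
  link   = c("cloglog", "logit"),
  logLik = c(-1735.6, -1682.3),
  no.par = c(6, 6)
)
n_obs <- 2132   # nobs from the summaries; assumed sample size for BIC

# Information criteria from their definitions.
fits$AIC  <- -2 * fits$logLik + 2 * fits$no.par
fits$BIC  <- -2 * fits$logLik + log(n_obs) * fits$no.par
fits$dAIC <- fits$AIC - min(fits$AIC)

print(fits)
```

Both models have the same number of parameters, so AIC and BIC differ between them only through the log-likelihood and necessarily agree on the ranking. The models are not nested, which is why the likelihood ratio test in the table reports 0 degrees of freedom and no p-value.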

I don't know the reasons people prefer cloglog. Except when using a binary
model as a proxy for a hazard/survival model, I wonder what other reasons
there are for cloglog.

Paul Johnson
University of Kansas
