roee maor
2018-11-20 18:08:10 UTC
Dear Jarrod (and list),
Following your previous comment I added "random = ~ Binomial" to my model
to allow for a phylogenetic analysis.
This causes convergence problems: the trace plots show increasing
oscillations along each chain (although no directional trends, so it's not
a burn-in issue). Also, the posterior samples are highly correlated,
residual variance estimates are >10^3 and threshold estimates are high (>20
on the latent scale).
Surprisingly (to me), predictors that are strongly significant in the
non-phylogenetic model lose their effect in the phylogenetic model (I tried
several alternative parameter configurations).
It seems that this model attributes the explained variance to phylogeny
alone.
Can anyone explain what is going on here? Am I specifying the model poorly
or just asking my data more than it can answer?
I tried to overcome this issue by using a fully resolved variant of the
phylogeny, which only improved things slightly.
I also changed the random effect to "random=~Family" or "random=~Order",
which reduced the variance and threshold estimates to more acceptable
levels (<10), but still no significant predictors (and I'm not sure how the
algorithm calculates covariance between higher taxa in the phylogeny).
Separately I tried parameter expanded prior: "prior = list(R = list(V=1,
fix=1), G = list(G1 = list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))". That
didn't help, and messing with priors for this reason feels like poor
practice.
This is the model:
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ Binomial,
prior = list(R = list(fix=1, V=1), G =
list(G1 = list(V=1, nu=0.002))),
ginverse = list(Binomial=INphylo$Ainv),
burnin = 150000, nitt = 2650001, thin =
2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ =
TRUE,
verbose = FALSE)
The data I use looks like this (not all variables appear in each model):
str(Tdata)
'data.frame': 1389 obs. of 10 variables:
$ Binomial : Factor w/ 1421 levels "Abrocoma_bennettii",..: 1 2
3 4 5 6 7 8 9 10 ...
$ Order : Factor w/ 27 levels "Afrosoricida",..: 24 24 24
24 24 3 24 24 2 2 ...
$ Family : Factor w/ 126 levels "Abrocomidae",..: 1 26 26 26
26 46 74 87 10 10 ...
$ Activity : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3
1 3 ...
$ Habitat : Factor w/ 6 levels "Aqua","Arbo",..: 5 5 5 5 5 5
5 5 5 5 ...
$ Diet : Factor w/ 3 levels "Faun","Herb",..: 2 3 3 3 3
1 3 2 2 2 ...
$ Mass : num 250.5 24.9 34.5 38.9 24.5 ...
$ Max.Temp : num 22 16.6 19.1 19.8 17.2 ...
$ Annual.Precip : num 166 645 558 903 1665 ...
Any advice would be much appreciated!
Many thanks,
Following your previous comment I added "random = ~ Binomial" to my model
to allow for a phylogenetic analysis.
This causes convergence problems: the trace plots show increasing
oscillations along each chain (although no directional trends, so it's not
a burn-in issue). Also, the posterior samples are highly correlated,
residual variance estimates are >10^3 and threshold estimates are high (>20
on the latent scale).
Surprisingly (to me), predictors that are strongly significant in the
non-phylogenetic model lose their effect in the phylogenetic model (I tried
several alternative parameter configurations).
It seems that this model attributes the explained variance to phylogeny
alone.
Can anyone explain what is going on here? Am I specifying the model poorly
or just asking my data more than it can answer?
I tried to overcome this issue by using a fully resolved variant of the
phylogeny, which only improved things slightly.
I also changed the random effect to "random=~Family" or "random=~Order",
which reduced the variance and threshold estimates to more acceptable
levels (<10), but still no significant predictors (and I'm not sure how the
algorithm calculates covariance between higher taxa in the phylogeny).
Separately I tried parameter expanded prior: "prior = list(R = list(V=1,
fix=1), G = list(G1 = list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))". That
didn't help, and messing with priors for this reason feels like poor
practice.
This is the model:
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ Binomial,
prior = list(R = list(fix=1, V=1), G =
list(G1 = list(V=1, nu=0.002))),
ginverse = list(Binomial=INphylo$Ainv),
burnin = 150000, nitt = 2650001, thin =
2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ =
TRUE,
verbose = FALSE)
The data I use looks like this (not all variables appear in each model):
str(Tdata)
'data.frame': 1389 obs. of 10 variables:
$ Binomial : Factor w/ 1421 levels "Abrocoma_bennettii",..: 1 2
3 4 5 6 7 8 9 10 ...
$ Order : Factor w/ 27 levels "Afrosoricida",..: 24 24 24
24 24 3 24 24 2 2 ...
$ Family : Factor w/ 126 levels "Abrocomidae",..: 1 26 26 26
26 46 74 87 10 10 ...
$ Activity : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3
1 3 ...
$ Habitat : Factor w/ 6 levels "Aqua","Arbo",..: 5 5 5 5 5 5
5 5 5 5 ...
$ Diet : Factor w/ 3 levels "Faun","Herb",..: 2 3 3 3 3
1 3 2 2 2 ...
$ Mass : num 250.5 24.9 34.5 38.9 24.5 ...
$ Max.Temp : num 22 16.6 19.1 19.8 17.2 ...
$ Annual.Precip : num 166 645 558 903 1665 ...
Any advice would be much appreciated!
Many thanks,
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL
[[alternative HTML version deleted]]
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL
[[alternative HTML version deleted]]