Discussion:
[R-sig-ME] Unclear output from MCMCglmm with categorical predictors
roee maor
2018-11-20 18:08:10 UTC
Permalink
Dear Jarrod (and list),

Following your previous comment I added "random = ~ Binomial" to my model
to allow for a phylogenetic analysis.
This causes convergence problems: the trace plots show increasing
oscillations along each chain (although no directional trends, so it's not
a burn-in issue). Also, the posterior samples are highly correlated,
residual variance estimates are >10^3 and threshold estimates are high (>20
on the latent scale).
Surprisingly (to me), predictors that are strongly significant in the
non-phylogenetic model lose their effect in the phylogenetic model (I tried
several alternative parameter configurations).

It seems that this model attributes the explained variance to phylogeny
alone.
Can anyone explain what is going on here? Am I specifying the model poorly
or just asking my data more than it can answer?

I tried to overcome this issue by using a fully resolved variant of the
phylogeny, which only improved things slightly.
I also changed the random effect to "random=~Family" or "random=~Order",
which reduced the variance and threshold estimates to more acceptable
levels (<10), but still no significant predictors (and I'm not sure how the
algorithm calculates covariance between higher taxa in the phylogeny).
Separately I tried parameter expanded prior: "prior = list(R = list(V=1,
fix=1), G = list(G1 = list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))". That
didn't help, and messing with priors for this reason feels like poor
practice.

This is the model:
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ Binomial,
prior = list(R = list(fix=1, V=1), G =
list(G1 = list(V=1, nu=0.002))),
ginverse = list(Binomial=INphylo$Ainv),
burnin = 150000, nitt = 2650001, thin =
2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ =
TRUE,
verbose = FALSE)

The data I use looks like this (not all variables appear in each model):

str(Tdata)
'data.frame': 1389 obs. of 10 variables:
$ Binomial : Factor w/ 1421 levels "Abrocoma_bennettii",..: 1 2
3 4 5 6 7 8 9 10 ...
$ Order : Factor w/ 27 levels "Afrosoricida",..: 24 24 24
24 24 3 24 24 2 2 ...
$ Family : Factor w/ 126 levels "Abrocomidae",..: 1 26 26 26
26 46 74 87 10 10 ...
$ Activity : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3
1 3 ...
$ Habitat : Factor w/ 6 levels "Aqua","Arbo",..: 5 5 5 5 5 5
5 5 5 5 ...
$ Diet : Factor w/ 3 levels "Faun","Herb",..: 2 3 3 3 3
1 3 2 2 2 ...
$ Mass : num 250.5 24.9 34.5 38.9 24.5 ...
$ Max.Temp : num 22 16.6 19.1 19.8 17.2 ...
$ Annual.Precip : num 166 645 558 903 1665 ...

Any advice would be much appreciated!
Many thanks,
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL

[[alternative HTML version deleted]]
HADFIELD Jarrod
2018-11-20 20:01:44 UTC
Permalink
Hi,

Most likely the phylogenetic heritability (the phylogenetic variance / the phylogenetic +residual variance) is approaching one resulting in numerical difficulties. Probably the best thing is to assume that the phylogenetic heritability equals 1 and use the reduced phylogenetic mixed model implementation. This allows the phylogenetic heritability to be equal to 1 without causing numerical issues. At some point I will integrate these models into the main MCMCglmm package, but for now you can download it from here: http://jarrod.bio.ed.ac.uk/MCMCglmmRAM_2.24.tar.gz.

Change the name of the ‘’Binomial’ column to ‘animal’ and fit:

T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ animal
prior = list(R = list(fix=1, V=1e-15), G = list(G1 = list(V=1, nu=0.002))),
pedigree = tree,
reduced=TRUE,
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)

You should need fewer iterations.

Cheers,

Jarrod


On 20 Nov 2018, at 18:08, roee maor <***@gmail.com<mailto:***@gmail.com>> wrote:

Dear Jarrod (and list),

Following your previous comment I added "random = ~ Binomial" to my model to allow for a phylogenetic analysis.
This causes convergence problems: the trace plots show increasing oscillations along each chain (although no directional trends, so it's not a burn-in issue). Also, the posterior samples are highly correlated, residual variance estimates are >10^3 and threshold estimates are high (>20 on the latent scale).
Surprisingly (to me), predictors that are strongly significant in the non-phylogenetic model lose their effect in the phylogenetic model (I tried several alternative parameter configurations).

It seems that this model attributes the explained variance to phylogeny alone.
Can anyone explain what is going on here? Am I specifying the model poorly or just asking my data more than it can answer?

I tried to overcome this issue by using a fully resolved variant of the phylogeny, which only improved things slightly.
I also changed the random effect to "random=~Family" or "random=~Order", which reduced the variance and threshold estimates to more acceptable levels (<10), but still no significant predictors (and I'm not sure how the algorithm calculates covariance between higher taxa in the phylogeny).
Separately I tried parameter expanded prior: "prior = list(R = list(V=1, fix=1), G = list(G1 = list(V=1, nu=1, alpha.mu<http://alpha.mu/>=0, alpha.V=1000)))". That didn't help, and messing with priors for this reason feels like poor practice.

This is the model:
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ Binomial,
prior = list(R = list(fix=1, V=1), G = list(G1 = list(V=1, nu=0.002))),
ginverse = list(Binomial=INphylo$Ainv),
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)

The data I use looks like this (not all variables appear in each model):

str(Tdata)
'data.frame': 1389 obs. of 10 variables:
$ Binomial : Factor w/ 1421 levels "Abrocoma_bennettii",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Order : Factor w/ 27 levels "Afrosoricida",..: 24 24 24 24 24 3 24 24 2 2 ...
$ Family : Factor w/ 126 levels "Abrocomidae",..: 1 26 26 26 26 46 74 87 10 10 ...
$ Activity : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3 1 3 ...
$ Habitat : Factor w/ 6 levels "Aqua","Arbo",..: 5 5 5 5 5 5 5 5 5 5 ...
$ Diet : Factor w/ 3 levels "Faun","Herb",..: 2 3 3 3 3 1 3 2 2 2 ...
$ Mass : num 250.5 24.9 34.5 38.9 24.5 ...
$ Max.Temp : num 22 16.6 19.1 19.8 17.2 ...
$ Annual.Precip : num 166 645 558 903 1665 ...

Any advice would be much appreciated!
Many thanks,
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL
roee maor
2018-11-21 14:09:20 UTC
Permalink
Hi Jarrod,
Many thanks for your reply.

I couldn't install the tarball on R v3.4.3 or v3.5.1 so sourced the files
directly to the workspace.
Post by roee maor
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ animal,
prior = list(R = list(fix=1, V=1e-15), G = list(G1 =
list(V=1, nu=0.002))),
pedigree = datatree,
reduced = TRUE,
burnin = 50000, nitt = 750001, thin = 700,
family = "threshold",
data = Rdata,
pl = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = TRUE)

It returns some errors about missing functions "is.positive.definite" and
Post by roee maor
library("corpcor", lib.loc="~/R/win-library/3.5")
library("MatrixModels", lib.loc="~/R/win-library/3.5")
but I can't figure this one out:
'Error in .C("MCMCglmm", as.double(data$MCMC_y),
as.double(data$MCMC_y.additional), :
C symbol name "MCMCglmm" not in load table'
Detaching these packages doesn't necessarily cause that same error to
appear although I execute the exact same code.
Also, several attempts (same code again) caused a fatal error and automatic
session termination (info for a similar session below if interesting).

I tried to use the fully bifurcating tree as an experiment but that made no
difference.

Any ideas what this last error means?

Thanks!
Roi
Post by roee maor
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United
Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United
Kingdom.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] corpcor_1.6.9 Matrix_1.2-14 phytools_0.6-60 maps_3.3.0
ape_5.2

loaded via a namespace (and not attached):
[1] igraph_1.2.2 Rcpp_0.12.19 magrittr_1.5
MASS_7.3-50 mnormt_1.5-5
[6] scatterplot3d_0.3-41 lattice_0.20-35 quadprog_1.5-5
fastmatch_1.1-0 tools_3.5.1
[11] parallel_3.5.1 grid_3.5.1 nlme_3.1-137
clusterGeneration_1.3.4 phangorn_2.4.0
[16] plotrix_3.7-4 coda_0.19-2 yaml_2.2.0
numDeriv_2016.8-1 animation_2.5
[21] compiler_3.5.1 combinat_0.0-8 expm_0.999-3
pkgconfig_2.0.2
Post by roee maor
Hi,
Most likely the phylogenetic heritability (the phylogenetic variance / the
phylogenetic +residual variance) is approaching one resulting in numerical
difficulties. Probably the best thing is to assume that the phylogenetic
heritability equals 1 and use the reduced phylogenetic mixed model
implementation. This allows the phylogenetic heritability to be equal to 1
without causing numerical issues. At some point I will integrate these
models into the main MCMCglmm package, but for now you can download it from
here: http://jarrod.bio.ed.ac.uk/MCMCglmmRAM_2.24.tar.gz.
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ animal
prior = list(R = list(fix=1, V=1e-15), G =
list(G1 = list(V=1, nu=0.002))),
pedigree = tree,
reduced=TRUE,
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)
You should need fewer iterations.
Cheers,
Jarrod
Dear Jarrod (and list),
Following your previous comment I added "random = ~ Binomial" to my model
to allow for a phylogenetic analysis.
This causes convergence problems: the trace plots show increasing
oscillations along each chain (although no directional trends, so it's not
a burn-in issue). Also, the posterior samples are highly correlated,
residual variance estimates are >10^3 and threshold estimates are high (>20
on the latent scale).
Surprisingly (to me), predictors that are strongly significant in the
non-phylogenetic model lose their effect in the phylogenetic model (I tried
several alternative parameter configurations).
It seems that this model attributes the explained variance to phylogeny alone.
Can anyone explain what is going on here? Am I specifying the model
poorly or just asking my data more than it can answer?
I tried to overcome this issue by using a fully resolved variant of the
phylogeny, which only improved things slightly.
I also changed the random effect to "random=~Family" or "random=~Order",
which reduced the variance and threshold estimates to more acceptable
levels (<10), but still no significant predictors (and I'm not sure how the
algorithm calculates covariance between higher taxa in the phylogeny).
Separately I tried parameter expanded prior: "prior = list(R = list(V=1,
fix=1), G = list(G1 = list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))". That
didn't help, and messing with priors for this reason feels like poor
practice.
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ Binomial,
prior = list(R = list(fix=1, V=1), G =
list(G1 = list(V=1, nu=0.002))),
ginverse = list(Binomial=INphylo$Ainv),
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)
str(Tdata)
$ Binomial : Factor w/ 1421 levels "Abrocoma_bennettii",..: 1 2
3 4 5 6 7 8 9 10 ...
$ Order : Factor w/ 27 levels "Afrosoricida",..: 24 24 24
24 24 3 24 24 2 2 ...
$ Family : Factor w/ 126 levels "Abrocomidae",..: 1 26 26
26 26 46 74 87 10 10 ...
$ Activity : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3 1 3 ...
$ Habitat : Factor w/ 6 levels "Aqua","Arbo",..: 5 5 5 5 5 5 5 5 5 5 ...
$ Diet : Factor w/ 3 levels "Faun","Herb",..: 2 3 3 3 3 1 3 2 2 2 ...
$ Mass : num 250.5 24.9 34.5 38.9 24.5 ...
$ Max.Temp : num 22 16.6 19.1 19.8 17.2 ...
$ Annual.Precip : num 166 645 558 903 1665 ...
Any advice would be much appreciated!
Many thanks,
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL

[[alternative HTML version deleted]]
HADFIELD Jarrod
2018-11-21 15:05:14 UTC
Permalink
Hi,

You could upload the tarball to winbuilder (https://win-builder.r-project.org/) and build a Windows source package.

Cheers,

Jarrod


On 21/11/2018 14:09, roee maor wrote:
Hi Jarrod,
Many thanks for your reply.

I couldn't install the tarball on R v3.4.3 or v3.5.1 so sourced the files directly to the workspace.
Post by roee maor
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ animal,
prior = list(R = list(fix=1, V=1e-15), G = list(G1 = list(V=1, nu=0.002))),
pedigree = datatree,
reduced = TRUE,
burnin = 50000, nitt = 750001, thin = 700,
family = "threshold",
data = Rdata,
pl = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = TRUE)
Post by roee maor
library("corpcor", lib.loc="~/R/win-library/3.5")
library("MatrixModels", lib.loc="~/R/win-library/3.5")
but I can't figure this one out:
'Error in .C("MCMCglmm", as.double(data$MCMC_y), as.double(data$MCMC_y.additional), :
C symbol name "MCMCglmm" not in load table'
Detaching these packages doesn't necessarily cause that same error to appear although I execute the exact same code.
Also, several attempts (same code again) caused a fatal error and automatic session termination (info for a similar session below if interesting).

I tried to use the fully bifurcating tree as an experiment but that made no difference.

Any ideas what this last error means?

Thanks!
Roi
Post by roee maor
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] corpcor_1.6.9 Matrix_1.2-14 phytools_0.6-60 maps_3.3.0 ape_5.2

loaded via a namespace (and not attached):
[1] igraph_1.2.2 Rcpp_0.12.19 magrittr_1.5 MASS_7.3-50 mnormt_1.5-5
[6] scatterplot3d_0.3-41 lattice_0.20-35 quadprog_1.5-5 fastmatch_1.1-0 tools_3.5.1
[11] parallel_3.5.1 grid_3.5.1 nlme_3.1-137 clusterGeneration_1.3.4 phangorn_2.4.0
[16] plotrix_3.7-4 coda_0.19-2 yaml_2.2.0 numDeriv_2016.8-1 animation_2.5
[21] compiler_3.5.1 combinat_0.0-8 expm_0.999-3 pkgconfig_2.0.2

On Tue, 20 Nov 2018 at 20:01, HADFIELD Jarrod <***@ed.ac.uk<mailto:***@ed.ac.uk>> wrote:
Hi,

Most likely the phylogenetic heritability (the phylogenetic variance / the phylogenetic +residual variance) is approaching one resulting in numerical difficulties. Probably the best thing is to assume that the phylogenetic heritability equals 1 and use the reduced phylogenetic mixed model implementation. This allows the phylogenetic heritability to be equal to 1 without causing numerical issues. At some point I will integrate these models into the main MCMCglmm package, but for now you can download it from here: http://jarrod.bio.ed.ac.uk/MCMCglmmRAM_2.24.tar.gz.

Change the name of the ‘’Binomial’ column to ‘animal’ and fit:

T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ animal
prior = list(R = list(fix=1, V=1e-15), G = list(G1 = list(V=1, nu=0.002))),
pedigree = tree,
reduced=TRUE,
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)

You should need fewer iterations.

Cheers,

Jarrod


On 20 Nov 2018, at 18:08, roee maor <***@gmail.com<mailto:***@gmail.com>> wrote:

Dear Jarrod (and list),

Following your previous comment I added "random = ~ Binomial" to my model to allow for a phylogenetic analysis.
This causes convergence problems: the trace plots show increasing oscillations along each chain (although no directional trends, so it's not a burn-in issue). Also, the posterior samples are highly correlated, residual variance estimates are >10^3 and threshold estimates are high (>20 on the latent scale).
Surprisingly (to me), predictors that are strongly significant in the non-phylogenetic model lose their effect in the phylogenetic model (I tried several alternative parameter configurations).

It seems that this model attributes the explained variance to phylogeny alone.
Can anyone explain what is going on here? Am I specifying the model poorly or just asking my data more than it can answer?

I tried to overcome this issue by using a fully resolved variant of the phylogeny, which only improved things slightly.
I also changed the random effect to "random=~Family" or "random=~Order", which reduced the variance and threshold estimates to more acceptable levels (<10), but still no significant predictors (and I'm not sure how the algorithm calculates covariance between higher taxa in the phylogeny).
Separately I tried parameter expanded prior: "prior = list(R = list(V=1, fix=1), G = list(G1 = list(V=1, nu=1, alpha.mu<http://alpha.mu/>=0, alpha.V=1000)))". That didn't help, and messing with priors for this reason feels like poor practice.

This is the model:
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ Binomial,
prior = list(R = list(fix=1, V=1), G = list(G1 = list(V=1, nu=0.002))),
ginverse = list(Binomial=INphylo$Ainv),
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)

The data I use looks like this (not all variables appear in each model):

str(Tdata)
'data.frame': 1389 obs. of 10 variables:
$ Binomial : Factor w/ 1421 levels "Abrocoma_bennettii",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Order : Factor w/ 27 levels "Afrosoricida",..: 24 24 24 24 24 3 24 24 2 2 ...
$ Family : Factor w/ 126 levels "Abrocomidae",..: 1 26 26 26 26 46 74 87 10 10 ...
$ Activity : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3 1 3 ...
$ Habitat : Factor w/ 6 levels "Aqua","Arbo",..: 5 5 5 5 5 5 5 5 5 5 ...
$ Diet : Factor w/ 3 levels "Faun","Herb",..: 2 3 3 3 3 1 3 2 2 2 ...
$ Mass : num 250.5 24.9 34.5 38.9 24.5 ...
$ Max.Temp : num 22 16.6 19.1 19.8 17.2 ...
$ Annual.Precip : num 166 645 558 903 1665 ...

Any advice would be much appreciated!
Many thanks,
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL
Ben Bolker
2018-11-21 15:20:43 UTC
Permalink
If you need to build a windows binary on win-builder, you probably
need to (1) unpack the tarball (2) change the DESCRIPTION file so that
your e-mail is listed as the maintainer's e-mail address (3) repack the
tarball (4) upload to winbuilder.

Otherwise the original maintainer will be sent an e-mail telling them
the binary is built and giving a link where they can download it ...

cheers
Ben Bolker
Post by HADFIELD Jarrod
Hi,
You could upload the tarball to winbuilder (https://win-builder.r-project.org/) and build a Windows source package.
Cheers,
Jarrod
Hi Jarrod,
Many thanks for your reply.
I couldn't install the tarball on R v3.4.3 or v3.5.1 so sourced the files directly to the workspace.
Post by roee maor
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ animal,
prior = list(R = list(fix=1, V=1e-15), G = list(G1 = list(V=1, nu=0.002))),
pedigree = datatree,
reduced = TRUE,
burnin = 50000, nitt = 750001, thin = 700,
family = "threshold",
data = Rdata,
pl = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = TRUE)
Post by roee maor
library("corpcor", lib.loc="~/R/win-library/3.5")
library("MatrixModels", lib.loc="~/R/win-library/3.5")
C symbol name "MCMCglmm" not in load table'
Detaching these packages doesn't necessarily cause that same error to appear although I execute the exact same code.
Also, several attempts (same code again) caused a fatal error and automatic session termination (info for a similar session below if interesting).
I tried to use the fully bifurcating tree as an experiment but that made no difference.
Any ideas what this last error means?
Thanks!
Roi
Post by roee maor
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
[1] stats graphics grDevices utils datasets methods base
[1] corpcor_1.6.9 Matrix_1.2-14 phytools_0.6-60 maps_3.3.0 ape_5.2
[1] igraph_1.2.2 Rcpp_0.12.19 magrittr_1.5 MASS_7.3-50 mnormt_1.5-5
[6] scatterplot3d_0.3-41 lattice_0.20-35 quadprog_1.5-5 fastmatch_1.1-0 tools_3.5.1
[11] parallel_3.5.1 grid_3.5.1 nlme_3.1-137 clusterGeneration_1.3.4 phangorn_2.4.0
[16] plotrix_3.7-4 coda_0.19-2 yaml_2.2.0 numDeriv_2016.8-1 animation_2.5
[21] compiler_3.5.1 combinat_0.0-8 expm_0.999-3 pkgconfig_2.0.2
Hi,
Most likely the phylogenetic heritability (the phylogenetic variance / the phylogenetic +residual variance) is approaching one resulting in numerical difficulties. Probably the best thing is to assume that the phylogenetic heritability equals 1 and use the reduced phylogenetic mixed model implementation. This allows the phylogenetic heritability to be equal to 1 without causing numerical issues. At some point I will integrate these models into the main MCMCglmm package, but for now you can download it from here: http://jarrod.bio.ed.ac.uk/MCMCglmmRAM_2.24.tar.gz.
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ animal
prior = list(R = list(fix=1, V=1e-15), G = list(G1 = list(V=1, nu=0.002))),
pedigree = tree,
reduced=TRUE,
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)
You should need fewer iterations.
Cheers,
Jarrod
Dear Jarrod (and list),
Following your previous comment I added "random = ~ Binomial" to my model to allow for a phylogenetic analysis.
This causes convergence problems: the trace plots show increasing oscillations along each chain (although no directional trends, so it's not a burn-in issue). Also, the posterior samples are highly correlated, residual variance estimates are >10^3 and threshold estimates are high (>20 on the latent scale).
Surprisingly (to me), predictors that are strongly significant in the non-phylogenetic model lose their effect in the phylogenetic model (I tried several alternative parameter configurations).
It seems that this model attributes the explained variance to phylogeny alone.
Can anyone explain what is going on here? Am I specifying the model poorly or just asking my data more than it can answer?
I tried to overcome this issue by using a fully resolved variant of the phylogeny, which only improved things slightly.
I also changed the random effect to "random=~Family" or "random=~Order", which reduced the variance and threshold estimates to more acceptable levels (<10), but still no significant predictors (and I'm not sure how the algorithm calculates covariance between higher taxa in the phylogeny).
Separately I tried parameter expanded prior: "prior = list(R = list(V=1, fix=1), G = list(G1 = list(V=1, nu=1, alpha.mu<http://alpha.mu/>=0, alpha.V=1000)))". That didn't help, and messing with priors for this reason feels like poor practice.
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ Binomial,
prior = list(R = list(fix=1, V=1), G = list(G1 = list(V=1, nu=0.002))),
ginverse = list(Binomial=INphylo$Ainv),
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)
str(Tdata)
$ Binomial : Factor w/ 1421 levels "Abrocoma_bennettii",..: 1 2 3 4 5 6 7 8 9 10 ...
$ Order : Factor w/ 27 levels "Afrosoricida",..: 24 24 24 24 24 3 24 24 2 2 ...
$ Family : Factor w/ 126 levels "Abrocomidae",..: 1 26 26 26 26 46 74 87 10 10 ...
$ Activity : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3 1 3 ...
$ Habitat : Factor w/ 6 levels "Aqua","Arbo",..: 5 5 5 5 5 5 5 5 5 5 ...
$ Diet : Factor w/ 3 levels "Faun","Herb",..: 2 3 3 3 3 1 3 2 2 2 ...
$ Mass : num 250.5 24.9 34.5 38.9 24.5 ...
$ Max.Temp : num 22 16.6 19.1 19.8 17.2 ...
$ Annual.Precip : num 166 645 558 903 1665 ...
Any advice would be much appreciated!
Many thanks,
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
roee maor
2018-11-22 22:59:33 UTC
Permalink
Dear Jarrod and Ben,
Thank you both for the advice.
Apologies if I accidentally caused anyone to receive an unexpected email.

Running the model specified above returns the error: "non-reduced nodes do
not appear first". The error persists when using a fully-bifurcating tree
so the issue is not tree polytomies.
I tried to go through the source code line-by-line to locate the problem
but got lost in the many loops and conditional statements. This error
message is conditioned on a match() output that also returns a warning that
the two arguments matched are unequal in length: one corresponds to the
tips while the other includes the tree's internal nodes as well.

As always, any help is much appreciated.
Best,
Roi
Post by HADFIELD Jarrod
Hi,
You could upload the tarball to winbuilder (
https://win-builder.r-project.org/) and build a Windows source package.
Cheers,
Jarrod
Hi Jarrod,
Many thanks for your reply.
I couldn't install the tarball on R v3.4.3 or v3.5.1 so sourced the files
directly to the workspace.
Post by roee maor
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ animal,
prior = list(R = list(fix=1, V=1e-15), G = list(G1 = list(V=1, nu=0.002))),
pedigree = datatree,
reduced = TRUE,
burnin = 50000, nitt = 750001, thin = 700,
family = "threshold",
data = Rdata,
pl = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = TRUE)
It returns some errors about missing functions "is.positive.definite" and
Post by roee maor
library("corpcor", lib.loc="~/R/win-library/3.5")
library("MatrixModels", lib.loc="~/R/win-library/3.5")
'Error in .C("MCMCglmm", as.double(data$MCMC_y),
C symbol name "MCMCglmm" not in load table'
Detaching these packages doesn't necessarily cause that same error to
appear although I execute the exact same code.
Also, several attempts (same code again) caused a fatal error and
automatic session termination (info for a similar session below if
interesting).
I tried to use the fully bifurcating tree as an experiment but that made no difference.
Any ideas what this last error means?
Thanks!
Roi
Post by roee maor
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United
Kingdom.1252 LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C LC_TIME=English_United Kingdom.1252
[1] stats graphics grDevices utils datasets methods base
[1] corpcor_1.6.9 Matrix_1.2-14 phytools_0.6-60 maps_3.3.0
ape_5.2
[1] igraph_1.2.2 Rcpp_0.12.19 magrittr_1.5
MASS_7.3-50 mnormt_1.5-5
[6] scatterplot3d_0.3-41 lattice_0.20-35 quadprog_1.5-5
fastmatch_1.1-0 tools_3.5.1
[11] parallel_3.5.1 grid_3.5.1 nlme_3.1-137
clusterGeneration_1.3.4 phangorn_2.4.0
[16] plotrix_3.7-4 coda_0.19-2 yaml_2.2.0
numDeriv_2016.8-1 animation_2.5
[21] compiler_3.5.1 combinat_0.0-8 expm_0.999-3
pkgconfig_2.0.2
Post by roee maor
Hi,
Most likely the phylogenetic heritability (the phylogenetic variance /
the phylogenetic +residual variance) is approaching one resulting in
numerical difficulties. Probably the best thing is to assume that the
phylogenetic heritability equals 1 and use the reduced phylogenetic mixed
model implementation. This allows the phylogenetic heritability to be equal
to 1 without causing numerical issues. At some point I will integrate these
models into the main MCMCglmm package, but for now you can download it from
here: http://jarrod.bio.ed.ac.uk/MCMCglmmRAM_2.24.tar.gz.
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ animal
prior = list(R = list(fix=1, V=1e-15), G =
list(G1 = list(V=1, nu=0.002))),
pedigree = tree,
reduced=TRUE,
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)
You should need fewer iterations.
Cheers,
Jarrod
Dear Jarrod (and list),
Following your previous comment I added "random = ~ Binomial" to my model
to allow for a phylogenetic analysis.
This causes convergence problems: the trace plots show increasing
oscillations along each chain (although no directional trends, so it's not
a burn-in issue). Also, the posterior samples are highly correlated,
residual variance estimates are >10^3 and threshold estimates are high (>20
on the latent scale).
Surprisingly (to me), predictors that are strongly significant in the
non-phylogenetic model lose their effect in the phylogenetic model (I tried
several alternative parameter configurations).
It seems that this model attributes the explained variance to phylogeny alone.
Can anyone explain what is going on here? Am I specifying the model
poorly or just asking my data more than it can answer?
I tried to overcome this issue by using a fully resolved variant of the
phylogeny, which only improved things slightly.
I also changed the random effect to "random=~Family" or "random=~Order",
which reduced the variance and threshold estimates to more acceptable
levels (<10), but still no significant predictors (and I'm not sure how the
algorithm calculates covariance between higher taxa in the phylogeny).
Separately I tried parameter expanded prior: "prior = list(R = list(V=1,
fix=1), G = list(G1 = list(V=1, nu=1, alpha.mu=0, alpha.V=1000)))". That
didn't help, and messing with priors for this reason feels like poor
practice.
T1 <- MCMCglmm(Activity ~ -1 + log(Mass) + Max.Temp * Annual.Precip,
random = ~ Binomial,
prior = list(R = list(fix=1, V=1), G =
list(G1 = list(V=1, nu=0.002))),
ginverse = list(Binomial=INphylo$Ainv),
burnin = 150000, nitt = 2650001, thin = 2500,
family = "threshold", data = Tdata,
pl = TRUE, pr = TRUE, saveX = TRUE, saveZ = TRUE,
verbose = FALSE)
str(Tdata)
$ Binomial : Factor w/ 1421 levels "Abrocoma_bennettii",..: 1
2 3 4 5 6 7 8 9 10 ...
$ Order : Factor w/ 27 levels "Afrosoricida",..: 24 24 24
24 24 3 24 24 2 2 ...
$ Family : Factor w/ 126 levels "Abrocomidae",..: 1 26 26
26 26 46 74 87 10 10 ...
$ Activity : Factor w/ 3 levels "1","2","3": 1 3 2 2 2 3 2 3 1 3 ...
$ Habitat : Factor w/ 6 levels "Aqua","Arbo",..: 5 5 5 5 5 5 5 5 5 5 ...
$ Diet : Factor w/ 3 levels "Faun","Herb",..: 2 3 3 3 3 1 3 2 2 2 ...
$ Mass : num 250.5 24.9 34.5 38.9 24.5 ...
$ Max.Temp : num 22 16.6 19.1 19.8 17.2 ...
$ Annual.Precip : num 166 645 558 903 1665 ...
Any advice would be much appreciated!
Many thanks,
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
--
Roi Maor
PhD candidate
School of Zoology, Tel Aviv University
Centre for Biodiversity and Environment Research, UCL

[[alternative HTML version deleted]]
Loading...