"ape"
(Paradis et al 2004)
"nlme"
(Pinheiro et al 2012)
brain size and body size on primates ("primate_spec.txt"
, a tab separated text file), species-specific data obtained from the exercises above, includes information on within-species sampling (sample sizes, variances and standard errors
phylogeny ("primate_tree.phy"
, in phylip format) is taken from 10kTrees (Arnold et al 2010) and tailored to the data at hand
data.spec = read.table("primate_spec.txt", sep = "\t", header = TRUE) w.ols = lm(lg.brain ~ lg.body, data.spec, weights = brain_N)
or
library(nlme, quietly = T) w.ols = gls(lg.brain ~ lg.body, weights = ~1/brain_N, data.spec, method = "ML") summary(w.ols)
## Generalized least squares fit by maximum likelihood ## Model: lg.brain ~ lg.body ## Data: data.spec ## AIC BIC logLik ## 173.7 181 -83.83 ## ## Variance function: ## Structure: fixed weights ## Formula: ~1/brain_N ## ## Coefficients: ## Value Std.Error t-value p-value ## (Intercept) -2.0730 0.27967 -7.412 0 ## lg.body 0.7868 0.03262 24.119 0 ## ## Correlation: ## (Intr) ## lg.body -0.985 ## ## Standardized residuals: ## Min Q1 Med Q3 Max ## -2.5473 -0.6364 -0.2264 0.3108 3.3693 ## ## Residual standard error: 1.429 ## Degrees of freedom: 86 total; 84 residual
library(ape) tree = read.tree("primate_tree.phy") bm.tree <- corBrownian(phy = tree) # need species as row names: row.names(data.spec) = data.spec$species # pgls without weight: pgls = gls(lg.brain ~ lg.body, correlation = bm.tree, data.spec) # pgls with weight: w.pgls = gls(lg.brain ~ lg.body, weights = ~1/brain_N, correlation = bm.tree, data.spec) summary(w.pgls)
## Generalized least squares fit by REML ## Model: lg.brain ~ lg.body ## Data: data.spec ## AIC BIC logLik ## 184.1 191.4 -89.05 ## ## Correlation Structure: corBrownian ## Formula: ~1 ## Parameter estimate(s): ## numeric(0) ## Variance function: ## Structure: fixed weights ## Formula: ~1/brain_N ## ## Coefficients: ## Value Std.Error t-value p-value ## (Intercept) -1.3387 0.3839 -3.487 8e-04 ## lg.body 0.7095 0.0482 14.734 0e+00 ## ## Correlation: ## (Intr) ## lg.body -0.983 ## ## Standardized residuals: ## Min Q1 Med Q3 Max ## -0.94340 -0.35254 -0.13766 0.05078 1.53642 ## ## Residual standard error: 3.727 ## Degrees of freedom: 86 total; 84 residual
plot(data.spec$lg.body, data.spec$lg.brain, pch = 1, cex = 0.6 * sqrt(data.spec$brain_N), xlab = "log(body size)", ylab = "log(brain size)") abline(w.ols, col = "red") abline(pgls, col = "blue") abline(w.pgls, col = "green") legend(x = "bottomright", legend = c(paste("w.ols, AIC = ", round(AIC(w.ols), 2)), paste("pgls, AIC = ", round(AIC(pgls), 2)), paste("w.pgls, AIC = ", round(AIC(w.pgls), 2))), lty = 1, col = c("red", "blue", "green"))
The best model is the phylogenetic model that does not consider weights. The weighted model overemphasizes the influence of the difference in sample sizes. Note the high repeatability of both traits, which already indicate that adjusting for unequal sampling is not important.