Performs 2B Partial Least Squares allowing for leave-one-out cross-validation and removal of phylogenetic covariation (experimental).
pls2b(x, y, tree = NULL, LOOCV = FALSE, recompute = FALSE)
A matrix with one or more variables as columns and observations as rows, representing the first block.
A matrix with one or more variables as columns and observations as rows, representing the second block.
A "phylo"
object containing a phylogenetic tree. Tip
labels should match the row number and names from x
and y
.
Logical; whether to apply leave-one-out cross-validation.
Logical; whether to re-compute rotation matrix using the scores resulting from LOOCV.
A "pls2b"
or "phy_pls2b"
object, containing:
$values:
vector of singular values accounting for the
covariation among blocks explained by each par of axes.
$xscores:
a matrix with the scores of observations in the
ordination axes of the first block.
$yscores:
a matrix with the scores of observations in the
ordination axes of the Second block.
$xrotation:
matrix of vector coefficients for the first
block.
$yrotation:
matrix of vector coefficients for the second
block.
$xcenter:
the mean values of the original variables from
the first block (or their phylogenetic mean, if tree
is
provided).
$ycenter:
the mean values of the original variables from
the second block (or their phylogenetic mean, if tree
is
provided).
$xtotvar:
the sum of the variances of the variables from the
second block.
$ytotvar:
the sum of the variances of the variables from the
first block.
$sdev:
the standard deviations of the PLS-axes (i.e., the
square roots of the eigenvalues of the covariance/correlation matrix).
Starting with two blocks of variables measured for the same
cases, two-blocks PLS finds the linear combination of variables on each
block maximizing covariation with the variables on the other. In the
context of morphospace
, one or both of these blocks will usually
be a series of shapes arranged as a 2-margins matrix.
It has been reported that PLS (as an algebraic equivalent of bgPCA) produces spurious covariation between blocks when the number of variables exceeds the number of observations (which is a common situation in geometric morphometrics analyses). This problem can be alleviated by carrying out a leave-one-out cross-validation (LOOCV; i.e., each observation is excluded from the calculation of PLS axes before its projection in the resulting ordination as a way to calculate its score).
If tree
is supplied, rows of x
and y
are assumed to
be linked by those phylogenetic relationships. In this case, the resulting
axes maximize the residual covariation among blocks left after removing
covariation among blocks accounted by phylogenetic history (assuming a
Brownian model of evolution and 100% phylogenetic signal, which is
equivalent to setting method = "BM"
in phytools::phyl.pca()
).
The phylogenetic version of pls2b()
displays the same variational
properties than phylogenetic PCA (i.e., centering on the phylogenetic mean;
orientation of scores reflect non-phylogenetic covariation but their
variance is not scaled and thus contain phylogenetic information; for the
latter reason, variance of scores and eigenvalues differ; scores are
correlated; etc; see Polly et al. 2013).
The number of axes in both blocks will be equal to the number of variables in the smallest block (i.e., the one with less variables), unless the number of observations is lower (in which case this will be equal to the number PLS axes).
Rohlf, F. J., & Corti, M. (2000). Use of two-block partial least-squares to study covariation in shape. Systematic Biology, 49(4), 740-753.
Zelditch, M. L., Swiderski, D. L., & Sheets, H. D. (2012). Geometric morphometrics for biologists: A primer. 2nd ed. Academic Press.
Polly, P. D., Lawing, A. M., Fabre, A. C., & Goswami, A. (2013). Phylogenetic principal components analysis and geometric morphometrics. Hystrix, 24(1), 33.
Monteiro, L. R. (2013). Morphometrics and the comparative method: studying the evolution of biological shape. Hystrix, the Italian Journal of Mammalogy, 24(1), 25-32.
Bookstein, F. L. (2019). Pathologies of between-groups principal components analysis in geometric morphometrics. Evolutionary Biology, 46(4), 271-302.
#load data and packages
library(geomorph)
data("tails")
#extract mean log sizes and shapes for all species, as well as the
#phylogenetic tree
shapes <- tails$shapes
sizes <- log(tails$sizes)
sp_shapes <- expected_shapes(shapes = shapes, x = tails$data$species)
sp_sizes <- tapply(X = sizes, INDEX = tails$data$species, FUN = mean)
tree <- tails$tree
#perform PLS between shape and size
pls <- pls2b(y = two.d.array(shapes), x = sizes)
#inspect results
names(pls) #the contents of the resulting object
#> [1] "yscores" "yrotation" "ycenter" "ytotvar" "xscores" "xrotation"
#> [7] "xcenter" "xtotvar" "sdev"
plot(pls$xscores, pls$yscores) #ordination
#pls_shapes achieves identical results, but it is formatted to be used more
#easily when analyzing shape variables and its output is compatible with
#other functions from morphospace
pls2 <- pls_shapes(shapes = shapes, X = sizes)
names(pls2) #the contents of the resulting object
#> [1] "sdev" "rotation" "x" "x2" "center" "totvar"
exp_var(pls2) #variance explained by each axis
#> variance cummulative
#> PLS-1 74.05123 74.05123
#> PLS-2 NA NA
#> PLS-3 NA NA
#> PLS-4 NA NA
#> PLS-5 NA NA
#> PLS-6 NA NA
#> PLS-7 NA NA
#> PLS-8 NA NA
#> PLS-9 NA NA
#> PLS-10 NA NA
plot(pls2$x2, pls2$x) #ordination
#perform phylogenetic PLS
ppls <- pls2b(y = two.d.array(sp_shapes), x = sp_sizes, tree = tree)
#inspect results
names(ppls) #the contents of the resulting object
#> [1] "yscores" "yrotation" "ycenter" "ytotvar" "xscores" "xrotation"
#> [7] "xcenter" "xtotvar" "sdev"
plot(ppls$xscores, ppls$yscores) #ordination
#pls_shapes achieves identical results, but it is formatted to be used more
#easily when analyzing shape variables and its output is compatible with
#other functions from morphospace
ppls2 <- pls_shapes(shapes = sp_shapes, X = sp_sizes, tree = tree)
names(ppls2) #the contents of the resulting object
#> [1] "sdev" "rotation" "x" "x2" "center" "totvar"
exp_var(ppls2) #variance explained by each axis
#> variance cummulative
#> phyPLS-1 75.04955 75.04955
#> phyPLS-2 NA NA
#> phyPLS-3 NA NA
#> phyPLS-4 NA NA
#> phyPLS-5 NA NA
#> phyPLS-6 NA NA
#> phyPLS-7 NA NA
#> phyPLS-8 NA NA
#> phyPLS-9 NA NA
#> phyPLS-10 NA NA
plot(ppls2$x2, ppls2$x) #ordination