Performs 2B Partial Least Squares allowing for leave-one-out cross-validation and removal of phylogenetic covariation (experimental).

pls2b(x, y, tree = NULL, LOOCV = FALSE, recompute = FALSE)

Arguments

x

A matrix with one or more variables as columns and observations as rows, representing the first block.

y

A matrix with one or more variables as columns and observations as rows, representing the second block.

tree

A "phylo" object containing a phylogenetic tree. Tip labels should match the row number and names from x and y.

LOOCV

Logical; whether to apply leave-one-out cross-validation.

recompute

Logical; whether to re-compute rotation matrix using the scores resulting from LOOCV.

Value

A "pls2b" or "phy_pls2b" object, containing:

  • $values: vector of singular values accounting for the covariation among blocks explained by each par of axes.

  • $xscores: a matrix with the scores of observations in the ordination axes of the first block.

  • $yscores: a matrix with the scores of observations in the ordination axes of the Second block.

  • $xrotation: matrix of vector coefficients for the first block.

  • $yrotation: matrix of vector coefficients for the second block.

  • $xcenter: the mean values of the original variables from the first block (or their phylogenetic mean, if tree is provided).

  • $ycenter: the mean values of the original variables from the second block (or their phylogenetic mean, if tree is provided).

  • $xtotvar: the sum of the variances of the variables from the second block.

  • $ytotvar: the sum of the variances of the variables from the first block.

  • $sdev: the standard deviations of the PLS-axes (i.e., the square roots of the eigenvalues of the covariance/correlation matrix).

Details

Starting with two blocks of variables measured for the same cases, two-blocks PLS finds the linear combination of variables on each block maximizing covariation with the variables on the other. In the context of morphospace, one or both of these blocks will usually be a series of shapes arranged as a 2-margins matrix.

It has been reported that PLS (as an algebraic equivalent of bgPCA) produces spurious covariation between blocks when the number of variables exceeds the number of observations (which is a common situation in geometric morphometrics analyses). This problem can be alleviated by carrying out a leave-one-out cross-validation (LOOCV; i.e., each observation is excluded from the calculation of PLS axes before its projection in the resulting ordination as a way to calculate its score).

If tree is supplied, rows of x and y are assumed to be linked by those phylogenetic relationships. In this case, the resulting axes maximize the residual covariation among blocks left after removing covariation among blocks accounted by phylogenetic history (assuming a Brownian model of evolution and 100% phylogenetic signal, which is equivalent to setting method = "BM" in phytools::phyl.pca()).

The phylogenetic version of pls2b() displays the same variational properties than phylogenetic PCA (i.e., centering on the phylogenetic mean; orientation of scores reflect non-phylogenetic covariation but their variance is not scaled and thus contain phylogenetic information; for the latter reason, variance of scores and eigenvalues differ; scores are correlated; etc; see Polly et al. 2013).

The number of axes in both blocks will be equal to the number of variables in the smallest block (i.e., the one with less variables), unless the number of observations is lower (in which case this will be equal to the number PLS axes).

References

Rohlf, F. J., & Corti, M. (2000). Use of two-block partial least-squares to study covariation in shape. Systematic Biology, 49(4), 740-753.

Zelditch, M. L., Swiderski, D. L., & Sheets, H. D. (2012). Geometric morphometrics for biologists: A primer. 2nd ed. Academic Press.

Polly, P. D., Lawing, A. M., Fabre, A. C., & Goswami, A. (2013). Phylogenetic principal components analysis and geometric morphometrics. Hystrix, 24(1), 33.

Monteiro, L. R. (2013). Morphometrics and the comparative method: studying the evolution of biological shape. Hystrix, the Italian Journal of Mammalogy, 24(1), 25-32.

Bookstein, F. L. (2019). Pathologies of between-groups principal components analysis in geometric morphometrics. Evolutionary Biology, 46(4), 271-302.

Examples

#load data and packages
library(geomorph)
data("tails")

#extract mean log sizes and shapes for all species, as well as the
#phylogenetic tree
shapes <- tails$shapes
sizes <- log(tails$sizes)
sp_shapes <- expected_shapes(shapes = shapes, x = tails$data$species)
sp_sizes <- tapply(X = sizes, INDEX = tails$data$species, FUN = mean)
tree <- tails$tree

#perform PLS between shape and size
pls <- pls2b(y = two.d.array(shapes), x = sizes)

#inspect results
names(pls) #the contents of the resulting object
#> [1] "yscores"   "yrotation" "ycenter"   "ytotvar"   "xscores"   "xrotation"
#> [7] "xcenter"   "xtotvar"   "sdev"     
plot(pls$xscores, pls$yscores) #ordination


#pls_shapes achieves identical results, but it is formatted to be used more
#easily when analyzing shape variables and its output is compatible with
#other functions from morphospace
pls2 <- pls_shapes(shapes = shapes, X = sizes)

names(pls2) #the contents of the resulting object
#> [1] "sdev"     "rotation" "x"        "x2"       "center"   "totvar"  
exp_var(pls2) #variance explained by each axis
#>        variance cummulative
#> PLS-1  74.05123    74.05123
#> PLS-2        NA          NA
#> PLS-3        NA          NA
#> PLS-4        NA          NA
#> PLS-5        NA          NA
#> PLS-6        NA          NA
#> PLS-7        NA          NA
#> PLS-8        NA          NA
#> PLS-9        NA          NA
#> PLS-10       NA          NA
plot(pls2$x2, pls2$x) #ordination



#perform phylogenetic PLS
ppls <- pls2b(y = two.d.array(sp_shapes), x = sp_sizes, tree = tree)

#inspect results
names(ppls) #the contents of the resulting object
#> [1] "yscores"   "yrotation" "ycenter"   "ytotvar"   "xscores"   "xrotation"
#> [7] "xcenter"   "xtotvar"   "sdev"     
plot(ppls$xscores, ppls$yscores) #ordination



#pls_shapes achieves identical results, but it is formatted to be used more
#easily when analyzing shape variables and its output is compatible with
#other functions from morphospace
ppls2 <- pls_shapes(shapes = sp_shapes, X = sp_sizes, tree = tree)

names(ppls2) #the contents of the resulting object
#> [1] "sdev"     "rotation" "x"        "x2"       "center"   "totvar"  
exp_var(ppls2) #variance explained by each axis
#>           variance cummulative
#> phyPLS-1  75.04955    75.04955
#> phyPLS-2        NA          NA
#> phyPLS-3        NA          NA
#> phyPLS-4        NA          NA
#> phyPLS-5        NA          NA
#> phyPLS-6        NA          NA
#> phyPLS-7        NA          NA
#> phyPLS-8        NA          NA
#> phyPLS-9        NA          NA
#> phyPLS-10       NA          NA
plot(ppls2$x2, ppls2$x) #ordination