This function creates a distance matrix between species by combining functional distances (based on functional trait values) with niche overlap (based on species co-occurrence).

PRE_FATE.speciesDistance(
  mat.traits,
  mat.overlap.option,
  mat.overlap.object,
  opt.weights = NULL,
  opt.maxPercent.NA = 0,
  opt.maxPercent.similarSpecies = 0.25,
  opt.min.sd = 0.3
)

Arguments

mat.traits

a data.frame with at least 3 columns :

species

the ID of each studied species

GROUP

a factor variable containing grouping information to divide the species into data subsets (see Details)

...

one column for each functional trait

mat.overlap.option

a string specifying how the niche-overlap distance between species is calculated (either PCA, raster or dist, see Details)

mat.overlap.object

three options, depending on the value of mat.overlap.option :

  • (PCA option) a list with 2 elements :

    tab.dom.PA

    a matrix or data.frame with sites in rows and species in columns, containing either NA, 0 or 1 (see PRE_FATE.selectDominant)

    tab.env

    a matrix or data.frame with sites in rows and environmental variables in columns

  • (raster option) a data.frame with 2 columns :

    species

    the ID of each studied species

    raster

    path to raster file with species distribution

  • (dist option) a similarity structure representing the niche overlap between each pair of species. It can be a dist object, a niolap object, or simply a matrix.
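The three accepted shapes of mat.overlap.object can be illustrated with toy data. This is only a sketch: the species names, site names, environmental variables and raster paths below are made up, and it is an assumption that the list elements of the PCA option must be named as shown.

```r
## Toy data only: species, sites and variables are hypothetical
set.seed(1)
species <- c("sp1", "sp2", "sp3")
sites <- paste0("site", 1:5)

## (PCA option) list with a presence/absence table and an environmental table
tab.dom.PA <- matrix(sample(c(NA, 0, 1), 5 * 3, replace = TRUE),
                     nrow = 5, dimnames = list(sites, species))
tab.env <- data.frame(temperature = rnorm(5), slope = runif(5),
                      row.names = sites)
overlap.PCA <- list(tab.dom.PA = tab.dom.PA, tab.env = tab.env)

## (raster option) one raster path per species (paths are placeholders)
overlap.raster <- data.frame(species = species,
                             raster = file.path("rasters", paste0(species, ".tif")),
                             stringsAsFactors = FALSE)

## (dist option) symmetric similarity matrix with species as row and column names
overlap.dist <- matrix(c(1.0, 0.4, 0.1,
                         0.4, 1.0, 0.6,
                         0.1, 0.6, 1.0),
                       nrow = 3, dimnames = list(species, species))
```

Whichever object is built, the value of mat.overlap.option ('PCA', 'raster' or 'dist') must match it.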

opt.weights

(optional) default NULL.
A vector of two doubles (each between 0 and 1) giving the weights for the traits and overlap distances, respectively. They must sum to 1.

opt.maxPercent.NA

(optional) default 0.
Maximum proportion of missing values (NA) allowed for each trait (between 0 and 1)

opt.maxPercent.similarSpecies

(optional) default 0.25.
Maximum proportion of similar species (same value) allowed for each trait (between 0 and 1)

opt.min.sd

(optional) default 0.3.
Minimum standard deviation allowed for each trait (trait unit)

Value

A list of 3 elements: functional distances, overlap distances, and the combination of both according to the weights given (or not) by the opt.weights parameter. Each element is either a dist object containing the distance between each pair of species, or a list of dist objects, one for each GROUP value.


The combination of both distances is written to the PRE_FATE_DOMINANT_speciesDistance.csv file (or, if necessary, to one file per group).

Details

This function produces a distance matrix between species, based on two types of distance information :

  1. Functional traits :

    • The GROUP column is required if species must be separated to have one final distance matrix per GROUP value.
      If the column is missing, all species will be considered as part of a unique dataset.

    • The traits can be qualitative or quantitative, but must be declared as such beforehand
      (e.g. with functions such as as.numeric, as.factor and ordered).

    • The functional distance matrix is calculated with Gower dissimilarity, using the gowdis function.

    • This function allows NA values.
      However, too many missing values lead to misleading results. Hence, 3 parameters let the user control how much missing or uninformative data is tolerated, and therefore which traits are used for the distance computation :

      opt.maxPercent.NA

      traits with too many missing values are removed

      opt.maxPercent.similarSpecies

      traits with too many similar values are removed

      opt.min.sd

      traits with too little variability are removed

  2. Niche overlap :

    • If the PCA option is selected, the degree of niche overlap is computed using the ecospat.niche.overlap function.

    • If the raster option is selected, the degree of niche overlap is computed using the niche.overlap function.
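The three trait filters described under item 1 can be sketched in base R. This is not the package's implementation: keep_trait is a hypothetical helper, the trait values are made up, and two points are assumptions, namely that "similar species" means the share of species holding the most common value, and that the standard deviation test only applies to numeric traits.

```r
## Toy trait table (made-up values)
traits <- data.frame(
  HEIGHT    = c(10, 25, 40, 80, 150, 300),  # quantitative, well spread
  LIGHT     = c(3, 3, 3, 3, 3, 4),          # almost all species share one value
  DISPERSAL = c(1, NA, NA, 4, NA, 6)        # many missing values
)

## Hypothetical helper: TRUE when a trait passes all three filters
keep_trait <- function(x, max.NA = 0, max.similar = 0.25, min.sd = 0.3) {
  prop.NA <- mean(is.na(x))
  vals <- x[!is.na(x)]
  if (length(vals) < 2) return(FALSE)
  ## assumption: share of species holding the most common value
  prop.similar <- max(table(vals)) / length(vals)
  ## assumption: standard deviation is only tested for numeric traits
  sd.ok <- if (is.numeric(vals)) sd(vals) >= min.sd else TRUE
  prop.NA <= max.NA && prop.similar <= max.similar && sd.ok
}

## Named logical vector, one entry per trait
kept <- sapply(traits, keep_trait)
traits.used <- traits[, kept, drop = FALSE]
```

Raising max.NA in this sketch, like raising opt.maxPercent.NA in the function, lets traits with some missing values back into the distance computation.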


Functional distances and niche overlap information are then combined according to the following formula :

$$\text{mat.DIST}_{\text{sub-group}} = \frac{\text{wei.FUNC} \times \text{mat.FUNCTIONAL}_{\text{sub-group}} + \text{wei.OVER} \times \text{mat.OVERLAP}_{\text{sub-group}}}{\text{wei.FUNC} + \text{wei.OVER}}$$

with :

$$\text{wei.FUNC} = \text{opt.weights}[1]$$ $$\text{wei.OVER} = \text{opt.weights}[2]$$

if opt.weights is given, otherwise :

$$\text{wei.FUNC} = n_{traits}$$ $$\text{wei.OVER} = 1$$

meaning that the distance matrix obtained from functional information is weighted by the number of traits used.
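The weighting rule above can be checked by hand on a small case. The following is a minimal sketch, not the package code: combine_dist is a hypothetical helper, the distance values are made up, and both inputs are assumed to already be dissimilarities on the same set of species.

```r
## Toy dissimilarities between three hypothetical species
species <- c("sp1", "sp2", "sp3")
mat.FUNCTIONAL <- as.dist(matrix(c(0.0, 0.2, 0.8,
                                   0.2, 0.0, 0.5,
                                   0.8, 0.5, 0.0),
                                 nrow = 3, dimnames = list(species, species)))
mat.OVERLAP <- as.dist(matrix(c(0.0, 0.6, 0.9,
                                0.6, 0.0, 0.3,
                                0.9, 0.3, 0.0),
                              nrow = 3, dimnames = list(species, species)))

## Hypothetical helper applying the weighted mean of the two matrices
combine_dist <- function(mat.func, mat.over, n.traits, opt.weights = NULL) {
  if (is.null(opt.weights)) {
    ## default: functional distances weighted by the number of traits used
    wei.FUNC <- n.traits
    wei.OVER <- 1
  } else {
    stopifnot(length(opt.weights) == 2, abs(sum(opt.weights) - 1) < 1e-8)
    wei.FUNC <- opt.weights[1]
    wei.OVER <- opt.weights[2]
  }
  m <- (wei.FUNC * as.matrix(mat.func) + wei.OVER * as.matrix(mat.over)) /
    (wei.FUNC + wei.OVER)
  as.dist(m)
}

## Default weights with 3 traits, then equal user-supplied weights
mat.default <- combine_dist(mat.FUNCTIONAL, mat.OVERLAP, n.traits = 3)
mat.custom <- combine_dist(mat.FUNCTIONAL, mat.OVERLAP, n.traits = 3,
                           opt.weights = c(0.5, 0.5))
```

With 3 traits and no opt.weights, the sp1/sp2 distance is (3 × 0.2 + 1 × 0.6) / 4 = 0.3, so the functional component dominates; with equal weights it becomes 0.4.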

Author

Maya Guéguen

Examples


## Load example data
Champsaur_PFG = .loadData('Champsaur_PFG', 'RData')

## Species traits
tab.traits = Champsaur_PFG$sp.traits
tab.traits = tab.traits[, c('species', 'GROUP', 'MATURITY', 'LONGEVITY'
                            , 'HEIGHT', 'DISPERSAL', 'LIGHT', 'NITROGEN')]
str(tab.traits)

## Species niche overlap (stored as dissimilarities)
tab.overlap = 1 - Champsaur_PFG$mat.overlap ## transform into similarities
tab.overlap[1:5, 1:5]

## Give warnings -------------------------------------------------------------
sp.DIST = PRE_FATE.speciesDistance(mat.traits = tab.traits
                                   , mat.overlap.option = 'dist'
                                   , mat.overlap.object = tab.overlap)
str(sp.DIST)

## Change parameters to allow more NAs (and change traits used) --------------
sp.DIST = PRE_FATE.speciesDistance(mat.traits = tab.traits
                                   , mat.overlap.option = 'dist'
                                   , mat.overlap.object = tab.overlap
                                   , opt.maxPercent.NA = 0.05
                                   , opt.maxPercent.similarSpecies = 0.3
                                   , opt.min.sd = 0.3)
str(sp.DIST)

if (FALSE) {
require(foreach); require(ggplot2); require(ggdendro)
pp = foreach(x = names(sp.DIST$mat.ALL)) %do%
  {
    hc = hclust(sp.DIST$mat.ALL[[x]])
    pp = ggdendrogram(hc, rotate = TRUE) +
      labs(title = paste0('Hierarchical clustering based on species distance '
                          , ifelse(length(names(sp.DIST$mat.ALL)) > 1
                                   , paste0('(group ', x, ')')
                                   , '')))
    return(pp)
  }
plot(pp[[1]])
plot(pp[[2]])
plot(pp[[3]])
}