R/PRE_FATE.speciesDistance.R
PRE_FATE.speciesDistance.Rd
This script is designed to create a distance matrix between species, combining functional distances (based on functional trait values) and niche overlap (based on co-occurrence of species).
PRE_FATE.speciesDistance(
mat.traits,
mat.overlap.option,
mat.overlap.object,
opt.weights = NULL,
opt.maxPercent.NA = 0,
opt.maxPercent.similarSpecies = 0.25,
opt.min.sd = 0.3
)
mat.traits
a data.frame with at least 3 columns :
species
the ID of each studied species
GROUP
a factor variable containing grouping information to divide the species into data subsets (see Details)
...
one column for each functional trait
mat.overlap.option
a string corresponding to the way to calculate the distance between species based on niche overlap (either PCA, raster or dist, see Details)
mat.overlap.object
three options, depending on the value of mat.overlap.option :
(PCA option) a list with 2 elements :
tab.dom.PA
a matrix or data.frame with sites in rows and species in columns, containing either NA, 0 or 1 (see PRE_FATE.selectDominant)
tab.env
a matrix or data.frame with sites in rows and environmental variables in columns
(raster option) a data.frame with 2 columns :
species
the ID of each studied species
raster
path to raster file with species distribution
(dist option) a similarity structure representing the niche overlap between each pair of species. It can be a dist object, a niolap object, or simply a matrix.
opt.weights
(optional) default NULL.
A vector of two double values (between 0 and 1) corresponding to the weights for traits and overlap distances respectively. They must sum up to 1.
opt.maxPercent.NA
(optional) default 0.
Maximum percentage of missing values (NA) allowed for each trait (between 0 and 1)
opt.maxPercent.similarSpecies
(optional) default 0.25.
Maximum percentage of similar species (same value) allowed for each trait (between 0 and 1)
opt.min.sd
(optional) default 0.3.
Minimum standard deviation allowed for each trait (trait unit)
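As an illustration of the inputs described above, the sketch below builds a toy mat.traits table and a toy similarity matrix for the dist option. The species IDs, group labels, trait columns and values are all invented for the example, and naming the matrix rows and columns with the species IDs is an assumption rather than a documented requirement.

```r
## Toy trait table: 'species' and 'GROUP' columns, then one column per trait.
## All names and values below are invented for illustration.
tab.traits <- data.frame(
  species = c("sp1", "sp2", "sp3", "sp4"),
  GROUP = factor(c("A", "A", "B", "B")),
  HEIGHT = as.numeric(c(20, 35, 800, NA)),                  # quantitative trait
  DISPERSAL = factor(c("short", "long", "long", "short")),  # qualitative trait
  LIGHT = ordered(c(2, 3, 1, 2), levels = 1:3),             # ordinal trait
  stringsAsFactors = FALSE
)
str(tab.traits)

## Toy niche overlap for the 'dist' option: a symmetric similarity matrix.
## Using species IDs as dimnames is an assumption made for this sketch.
sp <- tab.traits$species
tab.overlap <- matrix(0.5, nrow = length(sp), ncol = length(sp),
                      dimnames = list(sp, sp))
diag(tab.overlap) <- 1
tab.overlap["sp1", "sp2"] <- tab.overlap["sp2", "sp1"] <- 0.8
tab.overlap
```

Declaring each trait with as.numeric, factor or ordered up front makes its type explicit before any distance is computed.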
A list of 3 elements (functional distances, overlap distances, and the combination of both according to the weights given (or not) by the opt.weights parameter). Each element is either a dist object containing the distance between each pair of species, or a list of dist objects, one for each GROUP value.
The information for the combination of both distances is written to the PRE_FATE_DOMINANT_speciesDistance.csv file (or, if necessary, one file is created for each group).
This function produces a distance matrix between species, based on two types of distance information :
Functional traits :
The GROUP column is required if species must be separated to have one final distance matrix per GROUP value. If the column is missing, all species will be considered as part of a unique dataset.
The traits can be qualitative or quantitative, but they must have been identified as such beforehand (i.e. with the use of functions such as as.numeric, as.factor and ordered).
The functional distance matrix is calculated with Gower dissimilarity, using the gowdis function. This function allows NA values. However, too many missing values lead to misleading results. Hence, 3 parameters let the user control how missing values are handled, and therefore which traits are used for the distance computation :
traits with too many missing values are removed (opt.maxPercent.NA)
traits with too many similar values are removed (opt.maxPercent.similarSpecies)
traits with too little variability are removed (opt.min.sd)
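The three selection rules above can be sketched in base R as follows. This is not the package's implementation: select_traits is a hypothetical helper, the toy data are invented, and reading "similar species" as "species sharing the most frequent value" is an assumption, as is the use of non-strict thresholds.

```r
## Hypothetical helper (NOT the package's code): returns the names of the
## traits that pass three thresholds mirroring opt.maxPercent.NA,
## opt.maxPercent.similarSpecies and opt.min.sd.
select_traits <- function(traits, max.NA = 0, max.similar = 0.25, min.sd = 0.3) {
  keep <- vapply(traits, function(x) {
    n <- length(x)
    ## 1. Proportion of missing values
    prop.NA <- sum(is.na(x)) / n
    ## 2. Proportion of species sharing the most frequent value
    ##    (assumed reading of "similar species")
    counts <- table(x)
    prop.similar <- if (length(counts) > 0) max(counts) / n else 0
    ## 3. Standard deviation, only checked for numeric traits
    sd.ok <- if (is.numeric(x)) isTRUE(sd(x, na.rm = TRUE) >= min.sd) else TRUE
    prop.NA <= max.NA && prop.similar <= max.similar && sd.ok
  }, logical(1))
  names(keep)[keep]
}

## Invented toy data: 8 species, 3 numeric traits
toy <- data.frame(
  HEIGHT = c(10, 25, 40, 80, 150, 300, 600, 1200),  # complete and varied
  LIGHT  = c(1, 1, 1, 1, 1, 1, 2, 3),               # 6 of 8 species share a value
  NITRO  = c(2, NA, NA, 4, 1, 3, 5, 6)              # 25% missing values
)

select_traits(toy)                # only HEIGHT passes with the default thresholds
select_traits(toy, max.NA = 0.3)  # NITRO is kept once 30% of NA is tolerated
```

Relaxing one threshold at a time, as in the second call, shows which rule is responsible for dropping a given trait.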
Niche overlap :
If the PCA option is selected, the degree of niche overlap will be computed using the ecospat.niche.overlap function.
If the raster option is selected, the degree of niche overlap will be computed using the niche.overlap function.
Functional distances and niche overlap information are then combined according to the following formula :
$$\text{mat.DIST}_{\text{sub-group}} = \frac{[\text{wei.FUNC} \times \text{mat.FUNCTIONAL}_{\text{sub-group}} + \text{wei.OVER} \times \text{mat.OVERLAP}_{\text{sub-group}}]}{[ \text{wei.FUNC} + \text{wei.OVER} ]}$$
with :
$$\text{wei.FUNC} = \text{opt.weights}[1]$$ $$\text{wei.OVER} = \text{opt.weights}[2]$$
if opt.weights
is given, otherwise :
$$\text{wei.FUNC} = n_{traits}$$ $$\text{wei.OVER} = 1$$
meaning that the distance matrix obtained from functional information is weighted by the number of traits used.
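The weighted combination can be checked by hand on a small case. The sketch below is a minimal illustration, not the package's code: combine_dist and its n.traits argument are hypothetical, the matrix values are invented, and both matrices are assumed to already be dissimilarities on the same scale.

```r
## Toy dissimilarities between three species (values invented)
sp <- c("sp1", "sp2", "sp3")
mat.FUNCTIONAL <- matrix(c(0,   0.2, 0.6,
                           0.2, 0,   0.4,
                           0.6, 0.4, 0), nrow = 3, dimnames = list(sp, sp))
mat.OVERLAP <- matrix(c(0,   0.8, 0.2,
                        0.8, 0,   0.6,
                        0.2, 0.6, 0), nrow = 3, dimnames = list(sp, sp))

## Hypothetical helper applying the weighted mean given in the formula
combine_dist <- function(mat.func, mat.over, opt.weights = NULL, n.traits = 1) {
  if (is.null(opt.weights)) {
    ## No weights given: functional distance is weighted by the number of traits
    wei.FUNC <- n.traits
    wei.OVER <- 1
  } else {
    stopifnot(length(opt.weights) == 2, isTRUE(all.equal(sum(opt.weights), 1)))
    wei.FUNC <- opt.weights[1]
    wei.OVER <- opt.weights[2]
  }
  as.dist((wei.FUNC * mat.func + wei.OVER * mat.over) / (wei.FUNC + wei.OVER))
}

## Equal weights: sp1-sp2 = (0.5 * 0.2 + 0.5 * 0.8) / 1 = 0.5
d.equal <- combine_dist(mat.FUNCTIONAL, mat.OVERLAP, opt.weights = c(0.5, 0.5))
## Default weights with 3 traits: sp1-sp2 = (3 * 0.2 + 1 * 0.8) / 4 = 0.35
d.default <- combine_dist(mat.FUNCTIONAL, mat.OVERLAP, n.traits = 3)
d.equal
d.default
```

With the default weights, the functional distance dominates more and more as the number of retained traits grows, which is worth keeping in mind when traits are filtered out.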
## Load example data
Champsaur_PFG = .loadData('Champsaur_PFG', 'RData')
## Species traits
tab.traits = Champsaur_PFG$sp.traits
tab.traits = tab.traits[, c('species', 'GROUP', 'MATURITY', 'LONGEVITY'
                            , 'HEIGHT', 'DISPERSAL', 'LIGHT', 'NITROGEN')]
str(tab.traits)
## Species niche overlap (dissimilarity distances)
tab.overlap = 1 - Champsaur_PFG$mat.overlap ## transform into similarity
tab.overlap[1:5, 1:5]
## Give warnings -------------------------------------------------------------
sp.DIST = PRE_FATE.speciesDistance(mat.traits = tab.traits
                                   , mat.overlap.option = 'dist'
                                   , mat.overlap.object = tab.overlap)
str(sp.DIST)
## Change parameters to allow more NAs (and change traits used) --------------
sp.DIST = PRE_FATE.speciesDistance(mat.traits = tab.traits
                                   , mat.overlap.option = 'dist'
                                   , mat.overlap.object = tab.overlap
                                   , opt.maxPercent.NA = 0.05
                                   , opt.maxPercent.similarSpecies = 0.3
                                   , opt.min.sd = 0.3)
str(sp.DIST)
if (FALSE) {
  require(foreach); require(ggplot2); require(ggdendro)
  pp = foreach(x = names(sp.DIST$mat.ALL)) %do%
    {
      hc = hclust(sp.DIST$mat.ALL[[x]])
      pp = ggdendrogram(hc, rotate = TRUE) +
        labs(title = paste0('Hierarchical clustering based on species distance '
                            , ifelse(length(names(sp.DIST$mat.ALL)) > 1
                                     , paste0('(group ', x, ')')
                                     , '')))
      return(pp)
    }
  plot(pp[[1]])
  plot(pp[[2]])
  plot(pp[[3]])
}