Selection of dominant species from abundance releves

This function selects dominant species from abundance records, optionally taking habitat into account when that information is available.

PRE_FATE.selectDominant(
  mat.observations,
  doRuleA = TRUE,
  rule.A1 = 10,
  rule.A2_quantile = 0.9,
  doRuleB = TRUE,
  rule.B1_percentage = 0.25,
  rule.B1_number = 5,
  rule.B2 = 0.5,
  doRuleC = FALSE,
  opt.doRobustness = FALSE,
  opt.robustness_percent = seq(0.1, 0.9, 0.1),
  opt.robustness_rep = 10,
  opt.doSitesSpecies = TRUE,
  opt.doPlot = TRUE
)

Arguments

mat.observations: a data.frame with at least 3 columns :
sites, species, abund
(and optionally, habitat)
(see Details)
doRuleA: default TRUE.
If TRUE, selection is done including constraints on number of occurrences
rule.A1: default 10.
If doRuleA = TRUE or doRuleC = TRUE, minimum number of releves required for each species
rule.A2_quantile: default 0.9.
If doRuleA = TRUE or doRuleC = TRUE, quantile corresponding to the minimum number of total occurrences required for each species (between 0 and 1)
doRuleB: default TRUE.
If TRUE, selection is done including constraints on relative abundances
rule.B1_percentage: default 0.25.
If doRuleB = TRUE, minimum relative abundance required for each species in at least rule.B1_number sites (between 0 and 1)
rule.B1_number: default 5.
If doRuleB = TRUE, minimum number of sites in which each species has relative abundance >= rule.B1_percentage
rule.B2: default 0.5.
If doRuleB = TRUE, minimum average relative abundance required for each species (between 0 and 1)
doRuleC: default FALSE.
If TRUE, selection is done including constraints on number of occurrences at the habitat level (with the values of rule.A1 and rule.A2_quantile)
opt.doRobustness: (optional) default FALSE.
If TRUE, selection is also done on subsets of mat.observations, keeping only a percentage of releves or sites, to visualize the robustness of the selection
opt.robustness_percent: (optional) default c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9).
If opt.doRobustness = TRUE, vector containing values between 0 and 1 corresponding to the percentages with which to build subsets to evaluate robustness
opt.robustness_rep: (optional) default 10.
If opt.doRobustness = TRUE, number of repetitions for each percentage value defined by opt.robustness_percent to evaluate robustness
opt.doSitesSpecies: (optional) default TRUE.
If TRUE, abundance and occurrence tables for the selected species are built, saved and returned.
opt.doPlot: (optional) default TRUE.
If TRUE, plot(s) are produced; otherwise only the calculation and reorganization of outputs take place, and the results are saved and returned.
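
As an illustration of the expected input, here is a minimal sketch of a mat.observations table in long format. The site names, species names and values are invented, and abund is assumed here to be numeric; only the column names come from the argument description above.

```r
# Toy input: one row per (site, species) record.
# All values are invented for illustration.
mat.observations <- data.frame(
  sites   = c("S1", "S1", "S2", "S2", "S3"),
  species = c("sp1", "sp2", "sp1", "sp3", "sp2"),
  abund   = c(10, 5, 3, 8, 2),
  habitat = c("forest", "forest", "grassland", "grassland", "forest"),
  stringsAsFactors = FALSE
)

# The three mandatory columns must be present; habitat is optional
required <- c("sites", "species", "abund")
has_required <- all(required %in% colnames(mat.observations))
has_habitat <- "habitat" %in% colnames(mat.observations)
```

The habitat column is only needed when doRuleC = TRUE.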

Value

A list containing one vector, four or five data.frame objects with the following columns, and up to five ggplot2 objects :

species.selected

the names of the selected species

tab.rules

A1, A2, B1, B2, hab: if the rule has been used, whether or not the species fulfills this condition
species: the concerned species
SELECTION: the summary of rules with which the species was selected, or not
SELECTED: TRUE if the species fulfills A1 and at least one other condition, FALSE otherwise

tab.robustness

...: same as tab.rules
type: the type of subset (either releves or sites)
percent: the concerned percentage of values extraction
rep: the repetition ID

tab.dom.AB

table containing sums of abundances for all selected species (sites in rows, species in columns)

tab.dom.PA

table containing counts of presences for all selected species (sites in rows, species in columns)

plot.A

ggplot2 object, representing the selection of species according to rules A1 and A2

plot.B

ggplot2 object, representing the selection of species according to rules B

plot.C

ggplot2 object, representing the selection of species according to rules C (A1 and A2 per habitat)

plot.pco

ggplot2 object, representing selected species with Principal Coordinates Analysis (see dudi.pco)

plot.robustness

ggplot2 object, representing the robustness of the selection of species for each rule

The information is written in PRE_FATE_DOMINANT_[...].csv files :

TABLE_complete: the complete table of all species and the selection rules described above (tab.rules)
TABLE_species: only the names / ID of the species selected
TABLE_sitesXspecies_AB: abundances table of selected species
TABLE_sitesXspecies_PA: presence/absence table of selected species

Up to six PRE_FATE_DOMINANT_[...].pdf files are also created :

STEP_1_rule_A
STEP_1_rule_B
STEP_1_rule_C
STEP_2_selectedSpecies_PHYLO
STEP_2_selectedSpecies_PCO
STEP_2_selectedSpecies_robustness

Details

This function provides a way to select dominant species based on presence/abundance sampling information.

Three rules can be applied to make the species selection :

A. Presence releves

both conditions must be fulfilled

on number of releves: the species should be found a minimum number of times (rule.A1)
This should ensure that the species has received a sufficient minimum sampling effort. This criterion MUST ALWAYS be fulfilled.
on number of sites: the species should be found in a minimum number of sites, defined as the quantile rule.A2_quantile of the total number of records per species
This should ensure that the species covers the whole study area (or at least a significant part of it, assuming that the releves are well distributed throughout the area).
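
The two conditions of rule A can be sketched in base R as follows. This is one reading of the description above, not the package's internal code: the toy data, the small rule.A1 value and the object names (obs, n_records, n_sites, tab_A) are assumptions made for illustration.

```r
# Toy presence records (invented): one row per releve
obs <- data.frame(
  sites   = c("S1", "S2", "S3", "S4", "S1", "S2", "S1"),
  species = c("sp1", "sp1", "sp1", "sp1", "sp2", "sp2", "sp3"),
  stringsAsFactors = FALSE
)
rule.A1 <- 2            # deliberately small for the toy data
rule.A2_quantile <- 0.9

species <- sort(unique(obs$species))

# Number of records (releves) per species
n_records <- as.integer(table(obs$species)[species])

# Number of distinct sites per species
n_sites <- as.integer(
  tapply(obs$sites, obs$species, function(x) length(unique(x)))[species]
)

# A2 threshold: quantile of the per-species record counts
threshold_A2 <- unname(quantile(n_records, probs = rule.A2_quantile))

tab_A <- data.frame(
  species = species,
  A1 = n_records >= rule.A1,      # enough releves
  A2 = n_sites >= threshold_A2,   # widespread enough
  stringsAsFactors = FALSE
)
```

In this toy example each species is recorded at most once per site, so the record and site counts coincide; with repeated releves per site they would differ.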

B. Abundance releves :

at least one of the two conditions is required

on dominancy: the species should be dominant (i.e. represent at least a proportion rule.B1_percentage of the coverage of the site) in at least rule.B1_number sites
This should ensure the selection of frequently abundant species.
on average abundance: the species should have a mean relative abundance greater than or equal to rule.B2
This should ensure the selection of species that are not frequent but are representative of the sites in which they are found.
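
A minimal base R sketch of the two B conditions is given below. It assumes that relative abundance is the abundance of a species divided by the total abundance of its site, and that the mean is taken over the sites where the species occurs; the toy data and object names are hypothetical, and this is not the package's internal code.

```r
# Toy abundance records (invented for illustration)
obs <- data.frame(
  sites   = c("S1", "S1", "S2", "S2", "S3", "S3"),
  species = c("sp1", "sp2", "sp1", "sp2", "sp1", "sp3"),
  abund   = c(8, 2, 6, 4, 4, 6),
  stringsAsFactors = FALSE
)
rule.B1_percentage <- 0.25
rule.B1_number <- 2     # deliberately small for the toy data
rule.B2 <- 0.5

# Relative abundance of each record within its site
obs$rel <- obs$abund / ave(obs$abund, obs$sites, FUN = sum)

# B1: number of sites where the species reaches the dominance proportion
n_dominant <- tapply(obs$rel >= rule.B1_percentage, obs$species, sum)

# B2: mean relative abundance over the sites where the species occurs
mean_rel <- tapply(obs$rel, obs$species, mean)

tab_B <- data.frame(
  species = names(mean_rel),
  B1 = as.vector(n_dominant >= rule.B1_number),
  B2 = as.vector(mean_rel >= rule.B2),
  stringsAsFactors = FALSE
)
```

Here sp3 occurs in a single site but dominates it, so it passes B2 without passing B1, which is the case the second condition is meant to capture.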

C. Presence releves per habitat :

If habitat information is available (e.g. type of environment : urban, desert, grassland... ; type of vegetation : shrubs, forest, alpine grasslands... ; etc), the same rules as A can be applied, but within each habitat.
This should help to keep species that are not dominant at the large scale but could be representative of a specific habitat.
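
The per-habitat variant can be sketched by repeating the rule A checks inside each habitat. The sketch below is an assumption-laden illustration: it counts records only, computes the quantile over species present in the habitat, and flags a species as soon as one habitat passes both checks. None of these choices is confirmed by the description above.

```r
# Toy presence records with habitat (invented for illustration)
obs <- data.frame(
  species = c("sp1", "sp1", "sp1", "sp1", "sp2", "sp2", "sp2", "sp3"),
  habitat = c("forest", "forest", "grassland", "grassland",
              "forest", "forest", "forest", "grassland"),
  stringsAsFactors = FALSE
)
rule.A1 <- 2            # deliberately small for the toy data
rule.A2_quantile <- 0.9

# Records per species (rows) and habitat (columns)
counts <- table(obs$species, obs$habitat)

# For each habitat, apply both rule A checks to the habitat-level counts
meets_C <- sapply(colnames(counts), function(h) {
  n <- counts[, h]
  # Assumption: the quantile ignores species absent from the habitat
  threshold <- quantile(n[n > 0], probs = rule.A2_quantile)
  n >= rule.A1 & n >= threshold
})

# A species passes C if it passes in at least one habitat
C <- apply(meets_C, 1, any)
```

In this toy example sp1 passes only in grassland and sp2 only in forest, while sp3 passes nowhere.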

A table is created indicating, for each species, whether or not it fulfills each of the selected conditions.

This table is transformed into a Euclidean distance matrix (with gowdis and quasieuclid functions)
to cluster and represent species (see .pdf output files) :

through phylogenetic tree (with hclust and as.phylo functions)
through Principal Coordinates Analysis (with dudi.pco)

according to their selection rules :

A2 : spatial dominancy (widespread but poorly abundant)
B1 : local dominancy (relatively abundant or dominant in a certain number of sites)
B2 : local dominancy (not widespread but dominant in few sites)
C : habitat dominancy (not widespread but dominant in a specific habitat)
A2 & B1 : (widespread and relatively abundant)
A2 & B2 : (widespread and dominant in few sites)
A2 & B1 & B2 : (widespread and dominant)
B1 & B2 : (relatively widespread but dominant)
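
The clustering and ordination step can be approximated with base R only. The sketch below deliberately swaps in stand-ins: dist (plain Euclidean distance) replaces gowdis and quasieuclid, and cmdscale (classical multidimensional scaling, i.e. a principal coordinates analysis) replaces dudi.pco. The rule table is invented, and the result will not match the package's own plots.

```r
# Toy rule table (invented): 1 = rule fulfilled, 0 = not fulfilled
rules <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    0, 1, 1,
    1, 1, 1,
    0, 0, 1),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("sp", 1:5), c("A2", "B1", "B2"))
)

# Stand-in for gowdis + quasieuclid: Euclidean distance between species
d <- dist(rules, method = "euclidean")

# Hierarchical clustering of species by their rule profile
tree <- hclust(d, method = "average")

# Stand-in for dudi.pco: principal coordinates on two axes
pco <- cmdscale(d, k = 2)
```

Calling plot(tree) and plot(pco) then gives a dendrogram and a two-axis ordination in which species with similar rule profiles sit close together.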

NB :
Species not meeting any criteria or only A1 are considered as "Not selected".
Priority is given to rules A2, B1 and B2 over rule C. Hence, species selected according to A2, B1 and/or B2 may also meet criterion C, whereas species selected according to C meet none of these three criteria.
Species selected according to one (or more) criterion but not meeting criterion A1 are also considered as "Not selected".
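
The notes above amount to a simple logical combination, sketched here on an invented table. The column names follow tab.rules (with hab standing for rule C); the group column and its labels other than "Not selected" are hypothetical and do not reproduce the package's SELECTION values.

```r
# Toy rule table (invented for illustration)
tab <- data.frame(
  species = c("sp1", "sp2", "sp3", "sp4"),
  A1  = c(TRUE, TRUE, FALSE, TRUE),
  A2  = c(TRUE, FALSE, TRUE, FALSE),
  B1  = c(FALSE, FALSE, TRUE, FALSE),
  B2  = c(FALSE, FALSE, FALSE, FALSE),
  hab = c(FALSE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# A1 is mandatory; at least one other rule must also be fulfilled
tab$SELECTED <- tab$A1 & (tab$A2 | tab$B1 | tab$B2 | tab$hab)

# Priority to A2 / B1 / B2: a species is attributed to C only when
# none of these three rules is fulfilled (labels are hypothetical)
main <- tab$A2 | tab$B1 | tab$B2
tab$group <- ifelse(!tab$SELECTED, "Not selected",
                    ifelse(main, "A2/B1/B2", "C"))
```

Here sp2 meets only A1 and sp3 fails A1, so both end up as "Not selected", while sp4 is kept through the habitat rule alone.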

Author

Isabelle Boulangeat, Maya Guéguen

Examples


## Load example data
Champsaur_PFG = .loadData('Champsaur_PFG', 'RData')

## Species observations
tab = Champsaur_PFG$sp.observations

## No habitat, no robustness -------------------------------------------------
tab.occ = tab[, c('sites', 'species', 'abund')]
sp.SELECT = PRE_FATE.selectDominant(mat.observations = tab.occ)
names(sp.SELECT)
str(sp.SELECT$tab.rules)
str(sp.SELECT$tab.dom.PA)
plot(sp.SELECT$plot.A)
plot(sp.SELECT$plot.B$abs)
plot(sp.SELECT$plot.B$rel)

## Habitat, change parameters, no robustness (!quite long!) --------------------
if (FALSE) { # \dontrun{
tab.occ = tab[, c('sites', 'species', 'abund', 'habitat')]
sp.SELECT = PRE_FATE.selectDominant(mat.observations = tab.occ
                                    , doRuleA = TRUE
                                    , rule.A1 = 10
                                    , rule.A2_quantile = 0.9
                                    , doRuleB = TRUE
                                    , rule.B1_percentage = 0.2
                                    , rule.B1_number = 10
                                    , rule.B2 = 0.4
                                    , doRuleC = TRUE)
names(sp.SELECT)
str(sp.SELECT$tab.rules)
plot(sp.SELECT$plot.C)
plot(sp.SELECT$plot.pco$Axis1_Axis2)
plot(sp.SELECT$plot.pco$Axis1_Axis3)
} # }

## No habitat, robustness (!quite long!) --------------------
if (FALSE) { # \dontrun{
tab.occ = tab[, c('sites', 'species', 'abund')]
sp.SELECT = PRE_FATE.selectDominant(mat.observations = tab.occ
                                    , opt.doSitesSpecies = FALSE
                                    , opt.doRobustness = TRUE
                                    , opt.robustness_percent = seq(0.1,0.9,0.1)
                                    , opt.robustness_rep = 10)
names(sp.SELECT)
str(sp.SELECT$tab.robustness)
names(sp.SELECT$plot.robustness)
plot(sp.SELECT$plot.robustness$`All dataset`)
} # }
