R/PRE_FATE.selectDominant.R
PRE_FATE.selectDominant.Rd
This function selects dominant species from abundance records, optionally taking habitat into account when that information is available.
PRE_FATE.selectDominant(
  mat.observations,
  doRuleA = TRUE,
  rule.A1 = 10,
  rule.A2_quantile = 0.9,
  doRuleB = TRUE,
  rule.B1_percentage = 0.25,
  rule.B1_number = 5,
  rule.B2 = 0.5,
  doRuleC = FALSE,
  opt.doRobustness = FALSE,
  opt.robustness_percent = seq(0.1, 0.9, 0.1),
  opt.robustness_rep = 10,
  opt.doSitesSpecies = TRUE,
  opt.doPlot = TRUE
)
mat.observations
a data.frame with at least 3 columns: sites, species, abund (and optionally, habitat) (see Details)
doRuleA
default TRUE. If TRUE, selection is done including constraints on number of occurrences
rule.A1
default 10. If doRuleA = TRUE or doRuleC = TRUE, minimum number of releves required for each species
rule.A2_quantile
default 0.9. If doRuleA = TRUE or doRuleC = TRUE, quantile corresponding to the minimum number of total occurrences required for each species (between 0 and 1)
doRuleB
default TRUE. If TRUE, selection is done including constraints on relative abundances
rule.B1_percentage
default 0.25. If doRuleB = TRUE, minimum relative abundance required for each species in at least rule.B1_number sites (between 0 and 1)
rule.B1_number
default 5. If doRuleB = TRUE, minimum number of sites in which each species has relative abundance >= rule.B1_percentage
rule.B2
default 0.5. If doRuleB = TRUE, minimum average relative abundance required for each species (between 0 and 1)
doRuleC
default FALSE. If TRUE, selection is done including constraints on number of occurrences at the habitat level (with the values of rule.A1 and rule.A2_quantile)
opt.doRobustness
(optional) default FALSE. If TRUE, selection is also done on subsets of mat.observations, keeping only a percentage of releves or sites, to visualize the robustness of the selection
opt.robustness_percent
(optional) default c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9). If opt.doRobustness = TRUE, vector containing values between 0 and 1 corresponding to the percentages with which to build subsets to evaluate robustness
opt.robustness_rep
(optional) default 10. If opt.doRobustness = TRUE, number of repetitions for each percentage value defined by opt.robustness_percent to evaluate robustness
opt.doSitesSpecies
(optional) default TRUE. If TRUE, abundance / occurrence tables for the selected species are built, saved and returned.
opt.doPlot
(optional) default TRUE. If TRUE, plot(s) are produced; otherwise only the outputs are computed, reorganized, saved and returned.
A list containing one vector, four or five data.frame objects with the following columns, and up to five ggplot2 objects:
the names of the selected species
tab.rules
A1, A2, B1, B2, hab
for each rule that has been used, whether or not the species fulfills the condition
species
the concerned species
SELECTION
the summary of rules with which the species was selected, or not
SELECTED
TRUE if the species fulfills A1 and at least one other condition, FALSE otherwise
tab.robustness
...
same as tab.rules
type
the type of subset (either releves or sites)
percent
the percentage of values extracted to build the subset
rep
the repetition ID
table containing sums of abundances for all selected species (sites in rows, species in columns)
table containing counts of presences for all selected species (sites in rows, species in columns)
plot.A
ggplot2 object, representing the selection of species according to rules A1 and A2
plot.B
ggplot2 object, representing the selection of species according to rules B
plot.C
ggplot2 object, representing the selection of species according to rules C (A1 and A2 per habitat)
plot.pco
ggplot2 object, representing selected species with Principal Coordinates Analysis (see dudi.pco)
plot.robustness
ggplot2 object, representing the robustness of the selection of species for each rule
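The robustness evaluation controlled by opt.doRobustness, opt.robustness_percent and opt.robustness_rep can be pictured with the following base R sketch. It is not the package's implementation: the toy data, the stand-in select_species() helper (which only applies an A1-like threshold) and the n.selected column are assumptions made for illustration, and a releve is assumed to be one row of the observation table.

```r
set.seed(42)  # subsets are drawn at random, so fix the seed for reproducibility

# Toy observations (hypothetical): species a and b in 10 sites, species c in 3
obs <- expand.grid(sites = paste0("s", 1:10), species = c("a", "b", "c"),
                   stringsAsFactors = FALSE)
obs <- obs[!(obs$species == "c" & obs$sites %in% paste0("s", 4:10)), ]

# Stand-in for the full selection: keep species with at least rule.A1 records
select_species <- function(x, rule.A1 = 3) {
  n.occ <- table(x$species)
  names(n.occ)[n.occ >= rule.A1]
}

opt.robustness_percent <- c(0.5, 0.9)
opt.robustness_rep <- 3

# One row per (type of subset, percentage, repetition)
tab.rob <- expand.grid(type = c("releves", "sites"),
                       percent = opt.robustness_percent,
                       rep = seq_len(opt.robustness_rep),
                       stringsAsFactors = FALSE)

# Rerun the stand-in selection on each random subset
tab.rob$n.selected <- mapply(function(type, percent) {
  if (type == "releves") {
    # keep a percentage of the rows
    keep <- sample(nrow(obs), size = ceiling(percent * nrow(obs)))
    sub <- obs[keep, ]
  } else {
    # keep all rows belonging to a percentage of the sites
    all.sites <- unique(obs$sites)
    keep <- sample(all.sites, size = ceiling(percent * length(all.sites)))
    sub <- obs[obs$sites %in% keep, ]
  }
  length(select_species(sub))
}, tab.rob$type, tab.rob$percent, USE.NAMES = FALSE)

print(tab.rob)
```

Comparing the selection obtained on each subset with the one obtained on the full dataset gives an idea of how sensitive each rule is to sampling effort.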
The information is written in PRE_FATE_DOMINANT_[...].csv files:
TABLE_complete
the complete table of all species and the selection rules described above (tab.rules)
TABLE_species
only the names / ID of the species selected
TABLE_sitesXspecies_AB
abundances table of selected species
TABLE_sitesXspecies_PA
presence/absence table of selected species
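As an illustration of what the two sites x species tables contain, here is a minimal base R sketch. The toy data and the object names (obs, selected, tab.AB, tab.PA) are hypothetical, abundances are assumed to be numeric, and the way absences are encoded here (zeros) is an assumption that may differ from the files actually written.

```r
# Toy observations (hypothetical), one row per (site, species) record
obs <- data.frame(
  sites   = c("s1", "s1", "s2", "s2", "s3", "s3", "s4"),
  species = c("a",  "b",  "a",  "c",  "a",  "b",  "a"),
  abund   = c(80,   20,   50,   50,   10,   90,   100)
)

# Suppose species a and b were selected as dominant
selected <- c("a", "b")
sel <- obs[obs$species %in% selected, ]

# Sums of abundances, sites in rows and species in columns
tab.AB <- xtabs(abund ~ sites + species, data = sel)

# Counts of presences, same layout
tab.PA <- table(sites = sel$sites, species = sel$species)

print(tab.AB)
print(tab.PA)
```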
Up to six PRE_FATE_DOMINANT_[...].pdf files are also created:
STEP_1_rule_A
STEP_1_rule_B
STEP_1_rule_C
STEP_2_selectedSpecies_PHYLO
STEP_2_selectedSpecies_PCO
STEP_2_selectedSpecies_robustness
This function provides a way to select dominant species based on
presence/abundance sampling information.
Three rules can be applied to make the species selection :
Rule A: both conditions must be fulfilled
A1: the species should be found a minimum number of times (rule.A1).
This should ensure that the species has received at least a minimum sampling effort. This criterion MUST ALWAYS be fulfilled.
A2: the species should be found in a certain number of sites, which corresponds to the quantile rule.A2_quantile of the total number of records per species.
This should ensure that the species covers the whole studied area (or at least a substantial part of it, assuming that the releves are well distributed throughout the area).
Rule B: at least one of the two conditions is required
B1: the species should be dominant (i.e. represent at least rule.B1_percentage, a proportion between 0 and 1, of the coverage of the site) in at least rule.B1_number sites.
This should ensure the selection of species that are frequently abundant.
B2: the species should have a mean relative abundance greater than or equal to rule.B2.
This should ensure the selection of species that are not frequent but are representative of the sites in which they are found.
Rule C: if habitat information is available (e.g. type of environment: urban, desert, grassland...; type of vegetation: shrubs, forest, alpine grasslands...; etc.), the same rules as A can be applied, but for each habitat.
This should help to keep species that are not dominant at the large scale but could be representative of a specific habitat.
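The occurrence and abundance rules described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the toy data and thresholds are hypothetical (far smaller than the defaults), abundances are assumed to be numeric, the A2 threshold is read as the rule.A2_quantile quantile of the number of records per species, and the mean relative abundance for B2 is taken over the sites where the species is present.

```r
# Toy observations (hypothetical), one row per (site, species) record
obs <- data.frame(
  sites   = c("s1", "s1", "s2", "s2", "s3", "s3", "s4"),
  species = c("a",  "b",  "a",  "c",  "a",  "b",  "a"),
  abund   = c(80,   20,   50,   50,   10,   90,   100)
)

# Illustrative thresholds
rule.A1 <- 2
rule.A2_quantile <- 0.9
rule.B1_percentage <- 0.6
rule.B1_number <- 2
rule.B2 <- 0.5

# Rule A: number of records per species
n.occ <- vapply(split(obs$abund, obs$species), length, integer(1))
A1 <- n.occ >= rule.A1
A2 <- n.occ >= quantile(n.occ, probs = rule.A2_quantile)

# Rule B: relative abundance of each record within its site
obs$rel <- obs$abund / ave(obs$abund, obs$sites, FUN = sum)
# B1: number of sites where the species reaches rule.B1_percentage
B1 <- tapply(obs$rel >= rule.B1_percentage, obs$species, sum) >= rule.B1_number
# B2: mean relative abundance over the sites where the species occurs
B2 <- tapply(obs$rel, obs$species, mean) >= rule.B2

rules <- data.frame(species = names(n.occ),
                    A1 = as.vector(A1), A2 = as.vector(A2),
                    B1 = as.vector(B1), B2 = as.vector(B2))
print(rules)
```

With these toy values, species a meets all four conditions, species b meets A1 and B2 only, and species c meets B2 only.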
A table is created containing, for each species, whether or not it fulfills the selected conditions, for example:
          |    A1    A2    B1    B2 grass lands |
          ---------------------------------------
species a |  TRUE FALSE FALSE  TRUE  TRUE FALSE |
species b |  TRUE  TRUE  TRUE FALSE FALSE FALSE |
species c | FALSE FALSE FALSE FALSE FALSE  TRUE |
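The habitat columns of such a table could be obtained by repeating the rule A checks within each habitat, as in the sketch below. It assumes that both A1 and A2 must hold inside a habitat and that species absent from a habitat are simply left out of that habitat's counts; the toy data, the thresholds and the helper rule_A_in_habitat() are hypothetical.

```r
# Toy observations with a habitat column (hypothetical)
obs <- data.frame(
  sites   = c("s1", "s1", "s2", "s2", "s3", "s3", "s4"),
  species = c("a",  "b",  "a",  "c",  "a",  "b",  "b"),
  abund   = c(80,   20,   50,   50,   10,   90,   100),
  habitat = c("grass", "grass", "grass", "grass", "lands", "lands", "lands")
)

# Illustrative thresholds
rule.A1 <- 2
rule.A2_quantile <- 0.5

# Rule A (A1 and A2 together) applied to the species records of one habitat;
# returns the names of the species passing both conditions
rule_A_in_habitat <- function(species) {
  n.occ <- vapply(split(species, species), length, integer(1))
  passed <- n.occ >= rule.A1 &
    n.occ >= quantile(n.occ, probs = rule.A2_quantile)
  names(n.occ)[passed]
}

# One logical column per habitat, one row per species
all.species <- sort(unique(obs$species))
tab.hab <- sapply(split(obs$species, obs$habitat), function(sp) {
  all.species %in% rule_A_in_habitat(sp)
})
rownames(tab.hab) <- all.species
print(tab.hab)
```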
This table is transformed into a Euclidean distance matrix (with the gowdis and quasieuclid functions) to cluster and represent species (see .pdf output files):
through a phylogenetic tree (with the hclust and as.phylo functions)
through Principal Coordinates Analysis (with dudi.pco)
according to their selection rules:
A2 : spatial dominance (widespread but poorly abundant)
B1 : local dominance (relatively abundant or dominant in a certain number of sites)
B2 : local dominance (not widespread but dominant in few sites)
C : habitat dominance (not widespread but dominant in a specific habitat)
A2 & B1 : (widespread and relatively abundant)
A2 & B2 : (widespread and dominant in few sites)
A2 & B1 & B2 : (widespread and dominant)
B1 & B2 : (relatively widespread but dominant)
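The clustering and ordination of species from the rules table can be approximated with base R alone. The sketch below deliberately swaps in base functions: dist() on the 0/1 table replaces gowdis and quasieuclid, a plain hclust tree replaces the as.phylo conversion, and cmdscale() (classical multidimensional scaling, i.e. principal coordinates analysis) replaces dudi.pco. The fourth species is hypothetical and added only so that the ordination has enough points.

```r
# Rules table as a logical matrix (species d is a hypothetical addition)
rules <- matrix(
  c(TRUE,  FALSE, FALSE, TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  TRUE,  FALSE, FALSE, FALSE,
    FALSE, FALSE, FALSE, FALSE, FALSE, TRUE,
    TRUE,  FALSE, FALSE, TRUE,  FALSE, FALSE),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste("species", letters[1:4]),
                  c("A1", "A2", "B1", "B2", "grass", "lands"))
)

# Euclidean distance between species on the 0/1 coded rules
d <- dist(rules * 1)

# Hierarchical clustering of species (tree representation)
tree <- hclust(d, method = "average")

# Principal coordinates analysis on two axes
pco <- cmdscale(d, k = 2)

print(round(as.matrix(d), 2))
print(pco)
```

Species that fulfill the same set of rules end up close together both in the tree and in the ordination, which is what the PHYLO and PCO output files are meant to show.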
NB:
Species not meeting any criterion, or meeting only A1, are considered "Not selected".
Priority is given to rules A2, B1 and B2 over rule C. Hence, species selected according to A2, B1 and/or B2 may also meet criterion C, whereas species selected according to C do not meet any of the criteria A2, B1 or B2.
Species meeting one or more criteria but not criterion A1 are also considered "Not selected".
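The way the individual rules combine into a final decision can be sketched as follows, using the example table from above. The helper label_one() and the exact label strings are assumptions for illustration; only the logic (A1 mandatory, A2/B1/B2 taking priority over C) comes from the description.

```r
# Example rules table (same values as the example table above)
tab.rules <- data.frame(
  species = c("species a", "species b", "species c"),
  A1    = c(TRUE,  TRUE,  FALSE),
  A2    = c(FALSE, TRUE,  FALSE),
  B1    = c(FALSE, TRUE,  FALSE),
  B2    = c(TRUE,  FALSE, FALSE),
  grass = c(TRUE,  FALSE, FALSE),
  lands = c(FALSE, FALSE, TRUE)
)

main.rules <- c("A2", "B1", "B2")   # rules with priority
hab.rules  <- c("grass", "lands")   # habitat columns (rule C)

# Label one species (a one-row data.frame) according to the rules it meets
label_one <- function(row) {
  if (!row[["A1"]]) return("Not selected")            # A1 is mandatory
  met <- main.rules[unlist(row[main.rules])]
  if (length(met) > 0) return(paste(met, collapse = " & "))
  if (any(unlist(row[hab.rules]))) return("C")        # C only as a fallback
  "Not selected"
}

tab.rules$SELECTION <- vapply(seq_len(nrow(tab.rules)),
                              function(i) label_one(tab.rules[i, ]),
                              character(1))
tab.rules$SELECTED <- tab.rules$SELECTION != "Not selected"
print(tab.rules[, c("species", "SELECTION", "SELECTED")])
```

Here species c meets a habitat condition but fails A1, so it is labelled "Not selected".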
## Load example data
Champsaur_PFG = .loadData('Champsaur_PFG', 'RData')
## Species observations
tab = Champsaur_PFG$sp.observations
## No habitat, no robustness -------------------------------------------------
tab.occ = tab[, c('sites', 'species', 'abund')]
sp.SELECT = PRE_FATE.selectDominant(mat.observations = tab.occ)
names(sp.SELECT)
str(sp.SELECT$tab.rules)
str(sp.SELECT$tab.dom.PA)
plot(sp.SELECT$plot.A)
plot(sp.SELECT$plot.B$abs)
plot(sp.SELECT$plot.B$rel)
## Habitat, change parameters, no robustness (!quite long!) --------------------
if (FALSE) {
  tab.occ = tab[, c('sites', 'species', 'abund', 'habitat')]
  sp.SELECT = PRE_FATE.selectDominant(mat.observations = tab.occ
                                      , doRuleA = TRUE
                                      , rule.A1 = 10
                                      , rule.A2_quantile = 0.9
                                      , doRuleB = TRUE
                                      , rule.B1_percentage = 0.2
                                      , rule.B1_number = 10
                                      , rule.B2 = 0.4
                                      , doRuleC = TRUE)
  names(sp.SELECT)
  str(sp.SELECT$tab.rules)
  plot(sp.SELECT$plot.C)
  plot(sp.SELECT$plot.pco$Axis1_Axis2)
  plot(sp.SELECT$plot.pco$Axis1_Axis3)
}
## No habitat, robustness (!quite long!) --------------------
if (FALSE) {
  tab.occ = tab[, c('sites', 'species', 'abund')]
  sp.SELECT = PRE_FATE.selectDominant(mat.observations = tab.occ
                                      , opt.doSitesSpecies = FALSE
                                      , opt.doRobustness = TRUE
                                      , opt.robustness_percent = seq(0.1, 0.9, 0.1)
                                      , opt.robustness_rep = 10)
  names(sp.SELECT)
  str(sp.SELECT$tab.robustness)
  names(sp.SELECT$plot.robustness)
  plot(sp.SELECT$plot.robustness$`All dataset`)
}