What we did so far was defining the relationship between the species and its environment. It was the most important part, but what it provides us is a spatial distribution of the species’ environmental suitability, not a spatial distribution of the species’ presences/absences. Hence we now have to convert the environmental suitability into presence-absence.
The simplest approach, which was also the most widely used until 2014, consists in defining a threshold of environmental suitability, below which conditions are unsuitable, so absence is attributed; and above which conditions are suitable, so presence is attributed. However, you should avoid this approach which is inadequate (see Meynard et al. 2019 for a review):
So, how can we convert our environmental suitability into presence-absence without a threshold?
We use a probabilistic approach: the probability of getting a presence of the species in a given pixel is dependent on its suitability in that pixel. With the package, the probability of presence can be defined as a linear transformation of environmental suitability, or as a logistic transformation of environmental suitability.
## Le chargement a nécessité le package : terra
## terra 1.7.46
## The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
## which was just loaded, will retire in October 2023.
## Please refer to R-spatial evolution reports for details, especially
## https://r-spatial.org/r/2023/05/15/evolution4.html.
## It may be desirable to make the sf package available;
## package maintainers should consider adding sf to Suggests:.
## The sp package is now running under evolution status 2
## (status 2 uses the sf package in place of rgdal)
In the example above, two linear conversions are illustrated (left panel), and one logistic conversion (right panel). The simplest method is to use environmental suitability as a probability of occurrence (left panel, plain line). In this case, a pixel with environmental suitability equal to 0.6 has 60% chance of having species presence, and 40% chance of having species absence. A linear conversion with a slope different from 1 may be chosen, e.g. to change the species prevalence (left panel, dashed line). In this example (slope = 0.5, intercept = 0), a pixel with environmental suitability equal to 0.6 has 30% chance of having species presence, and 70% chance of having species absence.
A logistic function on the other hand, will ensure that the simulated probability is within the 0-1 range and allow easycontrol of species prevalence. However, the logistic function will also flatten out the relationship at the extreme suitability values, and narrow or broaden the intermediate probability values depending on the slope of the logistic curve. In the example above (right panel), a pixel with environmental suitability equal to 0.6 has 88% chance of having species presence, and 12% chance of having species absence.
This means that we convert the environmental suitability of each pixel into a probability of occurrence. This probability of occurrence is then used to sample presence or absence in each cell, i.e., we make a random draw of presence or absence weighted by the probability of occurrence. As a consequence, repeated realisation of the presence-absence conversion will produce different occupancy maps, each providing a valid realisation of the true species distribution map.
The conversion is fully customisable, and can range from threshold conversion to logistic to linear conversion, by adjusting parameters (explained in section 4.2):
The logistic & linear conversions look much more realistic than the threshold-like conversion.
In the next section we will see how to perform the conversion, and how to customise it.
At first you may be interested to see the function in action, before we try to customise it. Here it is!
# Let's use the same species we generated before:
my.parameters <- formatFunctions(bio1 = c(fun = 'dnorm',
mean = 25, sd = 5),
bio12 = c(fun = 'dnorm',
mean = 4000, sd = 2000))
my.first.species <- generateSpFromFun(raster.stack = worldclim[[c("bio1", "bio12")]],
parameters = my.parameters,
plot = FALSE)
# Run the conversion to presence-absence
pa1 <- convertToPA(my.first.species, plot = TRUE)
You have probably noticed that the conversion was performed without you choosing any parameter. This is because by default, the function performs a logistic conversion with random parameters. It would probably be better if you defined them yourself!
In the next section I summarise how to choose appropriate values of parameters, and then I illustrate how you can ask for a specific species prevalence (i.e., the number of places occupied by the species, out of the total number of available places) and let the function decide the parameters for you.
This conversion can be selected by defining argument
PA.method = "threshold"
. The only parameter that can be
customised in this kind of conversion is beta
which
determines the threshold of suitability above which the species is
present and below which the species is absent.
pa2 <- convertToPA(my.first.species, PA.method = "threshold", beta = 0.65)
## Threshold conversion finished:
##
## - cutoff = 0.65
## - species prevalence =0.0297987879507904
This conversion can be selected by defining arguments
PA.method = "probability"
(default value) and
prob.method = "linear"
. Two arguments can be customised:
the slope a
and intercept b
of the curve.
Here is a first example where the probability of occurrence is equal to environmental suitability:
pa3 <- convertToPA(my.first.species, PA.method = "probability",
prob.method = "linear", a = 1, b = 0)
## --- Determing species prevalence automatically based
## on a linear transformation of environmental suitability of
## slope a = 1 and intercept b = 0
## Linear conversion finished:
##
## - slope (a) = 1
## - intercept (b) = 0
## - species prevalence =0.0866428315964423
A second example where we change the probability of occurrence, which results in a difference in terms of species prevalence:
pa4 <- convertToPA(my.first.species, PA.method = "probability",
prob.method = "linear", a = 0.5, b = 0)
## --- Determing species prevalence automatically based
## on a linear transformation of environmental suitability of
## slope a = 0.5 and intercept b = 0
## Linear conversion finished:
##
## - slope (a) = 0.5
## - intercept (b) = 0
## - species prevalence =0.0432558260411136
You can see the shape of the conversion with
plotSuitabilityToProba
:
plotSuitabilityToProba(pa3)
plotSuitabilityToProba(pa4, add = TRUE, lty = 2)
This conversion can be selected by defining arguments
PA.method = "probability"
(default value) and
prob.method = "logistic"
(default value). The parameters
\(\alpha\) (alpha
) and
\(\beta\) (beta
)determine
the shape of the logistic curve:
The parameters are fairly simple to customise:
pa5 <- convertToPA(my.first.species,
PA.method = "probability",
prob.method = "logistic",
beta = 0.65, alpha = -0.07,
plot = TRUE)
# You can modify the conversion of your object if you did not like it:
pa5 <- convertToPA(pa5,
beta = 0.3, alpha = -0.02,
plot = TRUE)
plotSuitabilityToProba(pa5)
beta
is very simple to grasp, as it is the inflexion
point of the curve. Hence, looking at the three beta curves above, we
can see that a lower beta
will increase the probability of
finding suitable conditions for the species (wider distribution range).
A higher beta
will decrease the probability of finding
suitable conditions (smaller distribution range).
alpha
may be more difficult to grasp, as it is dependent
on the range of values of the x
axis (in our case, the
environmental suitability, ranging from 0 to 1):
alpha
is approximately equal to the range
ofx
or greater (in absolute value), then the conversion
will have a linear shape. In our case, it meansalpha
below -.3).alpha
is about 5-10% of the range of x
,
then the conversion will be logistic. In our case, you can have nice
logistic curves for values of alpha
between -0.1 and
-0.01.alpha
is much smaller compared to x
(in
absolute value), then the conversion will be threshold-like. In our
case, if means values of alpha
in the range [-0.001,
0[.The species prevalence is the number of places (here, pixels) actually occupied by the species out of the total number of places (pixels) available. Do not confuse the SPECIES PREVALENCE with the SAMPLE PREVALENCE, which in turn is the proportion of samples in which you have found the species.
Numerous authors have shown the importance of the species prevalence in species distribution modelling, and how it can bias models. As a consequence, when generating virtual species distributions to test particular protocols or modelling techniques, you may be interested in testing different values of species prevalence. However, it is important to know that the species prevalence is dependent on the species-environment relationship, the shape of the probabilistic conversion curve AND the spatial distribution of environmental conditions. As a consequence, the function has to try different shapes of conversion curve to find a conversion according to your chosen value of species prevalence. Sometimes, it is not possible to reach a particular species prevalence, in that case the function will choose the conversion curve providing results closest to your species prevalence.
For all three types of conversion (threshold, linear, logistic), the function can automatically find parameters according to your chosen value of species prevalence.
In this case you just specify your value of species prevalence without any other parameter:
# Let's generate a species occupying 20% of the world
sp.0.2 <- convertToPA(my.first.species,
PA.method = "threshold",
species.prevalence = 0.2)
## Threshold conversion finished:
##
## - cutoff = 0.1362433482644
## - species prevalence =0.200000495017035
sp.0.2
## Virtual species generated from 2 variables:
## bio1, bio12
##
## - Approach used: Responses to each variable
## - Response functions:
## .bio1 [min=-54.72435; max=30.98764] : dnorm (mean=25; sd=5)
## .bio12 [min=0; max=11191] : dnorm (mean=4000; sd=2000)
## - Each response function was rescaled between 0 and 1
## - Environmental suitability formula = bio1 * bio12
## - Environmental suitability was rescaled between 0 and 1
##
## - Converted into presence-absence:
## .Method = threshold
## .threshold = 0.1362433482644
## .species prevalence = 0.200000495017035
In this case again, you just specify your value of species prevalence without any other parameter, and the function will try to find a conversion that respects the chosen prevalence, and does not result in probabilities below 0 or above 1.
# Let's generate a species occupying 20% of the world
sp.0.2 <- convertToPA(my.first.species,
PA.method = "probability",
prob.method = "linear",
species.prevalence = 0.2)
## --- Searching for a linear transformation of
## environmental suitability that fits the chosen
## species.prevalence.
## Linear conversion finished:
##
## - slope (a) = 0.876382668991938
## - intercept (b) = 0.123617331008062
## - species prevalence =0.200226965310444
sp.0.2
## Virtual species generated from 2 variables:
## bio1, bio12
##
## - Approach used: Responses to each variable
## - Response functions:
## .bio1 [min=-54.72435; max=30.98764] : dnorm (mean=25; sd=5)
## .bio12 [min=0; max=11191] : dnorm (mean=4000; sd=2000)
## - Each response function was rescaled between 0 and 1
## - Environmental suitability formula = bio1 * bio12
## - Environmental suitability was rescaled between 0 and 1
##
## - Converted into presence-absence:
## .Method = probability
## .probabilistic method = linear
## .a (slope) = 0.876382668991938
## .b (intercept) = 0.123617331008062
## .species prevalence = 0.200226965310444
# And two others covering 5 and 50% of the world
sp.0.05 <- convertToPA(my.first.species,
PA.method = "probability",
prob.method = "linear",
species.prevalence = 0.05)
## --- Searching for a linear transformation of
## environmental suitability that fits the chosen
## species.prevalence.
## Linear conversion finished:
##
## - slope (a) = 0.573679003730832
## - intercept (b) = 0
## - species prevalence =0.0499855826288622
sp.0.5 <- convertToPA(my.first.species,
PA.method = "probability",
prob.method = "linear",
species.prevalence = 0.5)
## --- Searching for a linear transformation of
## environmental suitability that fits the chosen
## species.prevalence.
## Linear conversion finished:
##
## - slope (a) = 0.547739168119961
## - intercept (b) = 0.452260831880039
## - species prevalence =0.499890477481056
plotSuitabilityToProba(sp.0.05)
plotSuitabilityToProba(sp.0.2, add = TRUE, lty = 2)
plotSuitabilityToProba(sp.0.5, add = TRUE, lty = 3)
legend("topleft", lty = 1:3, legend = c(0.05, 0.2, 0.5), title = "Prevalence")
For the logistic conversion, you have to fix either
alpha
or beta
. I strongly advise to fix a
value of alpha
(this is the default parameter, with
alpha = -0.05
), so that the function will try to find an
appropriate conversion by testing different values of beta
.
If you prefer to fix the value of beta
, then the function
will try different values of alpha
, but it is likely that
it will not be able to find a conversion generating a species with the
appropriate prevalence.
Let’s see it in practice:
# Let's generate a species occupying 20% of the world
# Default settings, alpha = -0.05
sp.0.2 <- convertToPA(my.first.species,
species.prevalence = 0.2)
## --- Determing beta automatically according to alpha and species.prevalence
## Logistic conversion finished:
##
## - beta = 0.17578125
## - alpha = -0.05
## - species prevalence =0.200578427405133
sp.0.2
## Virtual species generated from 2 variables:
## bio1, bio12
##
## - Approach used: Responses to each variable
## - Response functions:
## .bio1 [min=-54.72435; max=30.98764] : dnorm (mean=25; sd=5)
## .bio12 [min=0; max=11191] : dnorm (mean=4000; sd=2000)
## - Each response function was rescaled between 0 and 1
## - Environmental suitability formula = bio1 * bio12
## - Environmental suitability was rescaled between 0 and 1
##
## - Converted into presence-absence:
## .Method = probability
## .probabilistic method = logistic
## .alpha (slope) = -0.05
## .beta (inflexion point) = 0.17578125
## .species prevalence = 0.200578427405133
# Now, a species occupying 1.5% of the world
# Change alpha to have a slightly more steep curve
sp.0.015 <- convertToPA(my.first.species,
species.prevalence = 0.015,
alpha = -0.015)
## --- Determing beta automatically according to alpha and species.prevalence
## Logistic conversion finished:
##
## - beta = 0.78125
## - alpha = -0.015
## - species prevalence =0.0145052366614566
sp.0.015
## Virtual species generated from 2 variables:
## bio1, bio12
##
## - Approach used: Responses to each variable
## - Response functions:
## .bio1 [min=-54.72435; max=30.98764] : dnorm (mean=25; sd=5)
## .bio12 [min=0; max=11191] : dnorm (mean=4000; sd=2000)
## - Each response function was rescaled between 0 and 1
## - Environmental suitability formula = bio1 * bio12
## - Environmental suitability was rescaled between 0 and 1
##
## - Converted into presence-absence:
## .Method = probability
## .probabilistic method = logistic
## .alpha (slope) = -0.015
## .beta (inflexion point) = 0.78125
## .species prevalence = 0.0145052366614566
# Let's try by fixing beta rather than alpha
# We want a species occupying 10% of the world, with a high value of beta
sp.10 <- convertToPA(my.first.species,
species.prevalence = 0.1,
alpha = NULL,
beta = 0.9)
## --- Determing alpha automatically according to beta and species.prevalence
## Logistic conversion finished:
##
## - beta = 0.9
## - alpha = -0.35252734375
## - species prevalence =0.100947586358816
sp.10
## Virtual species generated from 2 variables:
## bio1, bio12
##
## - Approach used: Responses to each variable
## - Response functions:
## .bio1 [min=-54.72435; max=30.98764] : dnorm (mean=25; sd=5)
## .bio12 [min=0; max=11191] : dnorm (mean=4000; sd=2000)
## - Each response function was rescaled between 0 and 1
## - Environmental suitability formula = bio1 * bio12
## - Environmental suitability was rescaled between 0 and 1
##
## - Converted into presence-absence:
## .Method = probability
## .probabilistic method = logistic
## .alpha (slope) = -0.35252734375
## .beta (inflexion point) = 0.9
## .species prevalence = 0.100947586358816
It worked, but the resulting species does not look realistic at all: alpha was below -0.3, which means that we had a quasi-linear conversion curve, producing this unrealistic presence-absence map.
# Now an impossible task: a low value of beta for the same requirements:
sp.10bis <- convertToPA(my.first.species,
species.prevalence = 0.1,
alpha = NULL,
beta = 0.3)
## --- Determing alpha automatically according to beta and species.prevalence
## Warning in convertToPA(my.first.species, species.prevalence = 0.1, alpha = NULL, : Warning, the desired species prevalence cannot be obtained, because of the chosen beta and available environmental conditions (see details).
## The closest possible estimate of prevalence was 0.11
## Perhaps you can try a higher beta value.
## Logistic conversion finished:
##
## - beta = 0.3
## - alpha = -0.001
## - species prevalence =0.108688415240089
sp.10bis
## Virtual species generated from 2 variables:
## bio1, bio12
##
## - Approach used: Responses to each variable
## - Response functions:
## .bio1 [min=-54.72435; max=30.98764] : dnorm (mean=25; sd=5)
## .bio12 [min=0; max=11191] : dnorm (mean=4000; sd=2000)
## - Each response function was rescaled between 0 and 1
## - Environmental suitability formula = bio1 * bio12
## - Environmental suitability was rescaled between 0 and 1
##
## - Converted into presence-absence:
## .Method = probability
## .probabilistic method = logistic
## .alpha (slope) = -0.001
## .beta (inflexion point) = 0.3
## .species prevalence = 0.108688415240089
Let’s see how the conversion curves look like:
plotSuitabilityToProba(sp.0.015)
plotSuitabilityToProba(sp.0.2, add = TRUE, lty = 2)
plotSuitabilityToProba(sp.10, add = TRUE, lty = 3)
plotSuitabilityToProba(sp.10bis, add = TRUE, lty = 4)
legend("topleft", lty = 1:4, legend = c(0.015, 0.2, 0.1, "0.1 (failed)"), title = "Prevalence")
-----------------
Do not hesitate if you have a question, find a bug, or would like to add a feature in virtualspecies: mail me!