plotFOP: Plot Frequency of Observed Presence (FOP).

Description

plotFOP produces a Frequency of Observed Presence (FOP) plot for a given explanatory variable. An FOP plot shows the rate of occurrence of the response variable across intervals or levels of the explanatory variable. For continuous variables, a local regression ("loess") of the FOP values is added to the plot as a line. Data density is plotted in the background (grey) to help visualize where FOP values are more or less certain.
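As a rough illustration of what an FOP value is for a continuous variable, the following base R sketch bins a synthetic EV into intervals and computes the mean of a 1/0 response in each interval. The data, the number of intervals, and the use of cut are assumptions made for illustration; this is not the package's internal code.

```r
# Illustrative sketch only; not the package's internal code.
set.seed(1)
ev <- runif(500, min = 0, max = 10)           # synthetic continuous EV
rv <- rbinom(500, size = 1, prob = ev / 20)   # synthetic response: 1 = presence, 0 otherwise

# Divide the EV range into 10 equal-width intervals (assumed binning rule)
int <- cut(ev, breaks = 10)

# Per-interval summaries, named after the columns described for FOPdata
fop <- data.frame(
  int   = levels(int),
  n     = as.vector(table(int)),              # observations per interval
  intEV = as.vector(tapply(ev, int, mean)),   # mean EV value per interval
  intRV = as.vector(tapply(rv, int, mean))    # FOP: mean response per interval
)
fop
```

Each value of intRV is the proportion of observations in that interval for which the response equals 1.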

Usage

plotFOP(
  data,
  EV,
  span = 0.5,
  intervals = NULL,
  ranging = FALSE,
  densitythreshold = NULL,
  ...
)

Value

In addition to the graphical output, a list of two elements:

EVoptimum. The EV value (or level, for categorical EVs) at which FOP is highest.
FOPdata. A data frame containing the plotted data. Columns in this data frame represent the following: EV interval ("int"), number of observations in the interval ("n"), mean EV value of the observations in the interval ("intEV"), mean RV value of the observations in the interval ("intRV"), and local regression predicted intRV ("loess"). For categorical variables, only the level name ("level"), the number of observations in the level ("n"), and the mean RV value of the level ("levelRV") are used.

Arguments

data

Data frame containing the response variable in the first column and explanatory variables in subsequent columns. The response variable should represent either presence and background (coded as 1/NA) or presence and absence (coded as 1/0). See Details for information regarding implications of occurrence data type. See also readData.

EV

Name or column index of the explanatory variable in data for which to calculate FOP.

span

The proportion of FOP points included in the local regression neighborhood. Should be between 0 and 1. Irrelevant for categorical EVs.

intervals

Number of intervals into which the continuous EV is divided. Defaults to the minimum of N/10 and 100, where N is the number of observations in data. Irrelevant for categorical EVs.

ranging

Logical. If TRUE, will range the EV scale to [0,1]. This is equivalent to plotting FOP over the linear transformation produced by deriveVars. Irrelevant for categorical EVs.
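Ranging to [0,1] can be sketched as a min-max rescaling. The helper range01 below is hypothetical and assumes the transformation is the usual (x - min) / (max - min) mapping; it is not a function exported by the package.

```r
# Hypothetical helper, not part of the package: rescale a numeric vector to [0, 1]
range01 <- function(x) {
  lo <- min(x, na.rm = TRUE)
  hi <- max(x, na.rm = TRUE)
  (x - lo) / (hi - lo)
}

ev <- c(12, 15, 20, 32)
range01(ev)
# [1] 0.00 0.15 0.40 1.00
```

The rescaling changes only the EV axis; the ordering of the observations and the shape of the FOP curve are unaffected.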

densitythreshold

Numeric. Intervals containing fewer than this number of observations will be represented with an open symbol in the plot. Irrelevant for categorical EVs.

...

Arguments to be passed to plot or barplot to control the appearance of the plot. For example:

lwd for line width
cex.main for size of plot title
space for space between bars

Details

A list containing the optimum EV value and a data frame of the plotted data is returned invisibly. To keep this output, assign the result of the function call to an object.
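Invisible return is a general R mechanism, which the following sketch demonstrates with a hypothetical stand-in function (summarise_invisibly is not part of the package, and plotting is omitted to keep the example free of side effects).

```r
# Hypothetical stand-in for a function that returns its result invisibly
summarise_invisibly <- function(x) {
  invisible(list(optimum = which.max(x), data = x))
}

summarise_invisibly(c(1, 3, 2))          # nothing is printed at the console
res <- summarise_invisibly(c(1, 3, 2))   # assignment captures the returned list
res$optimum
# [1] 2
```

The same pattern applies to any function that uses invisible(): the value exists, but is only shown when it is assigned and then printed, or wrapped in print().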

In the local regression ("loess"), the plotted FOP values are regressed against their EV values. The points are weighted by the number of observations they represent, such that an FOP value from an interval with many observations is given more weight.
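A weighted local regression of this kind can be sketched with stats::loess, using the interval counts as weights. The synthetic data, the 20 intervals, and the exact loess call are assumptions for illustration and may differ from what the package does internally.

```r
# Illustrative sketch only; the package's internal loess call may differ.
set.seed(42)
ev <- runif(1000, min = 0, max = 10)
# Synthetic response whose probability of presence peaks near EV = 6
rv <- rbinom(1000, size = 1, prob = 2 * dnorm(ev, mean = 6, sd = 2))

# Bin the EV and compute per-interval counts, mean EV, and FOP
int <- cut(ev, breaks = 20)
fop <- data.frame(
  n     = as.vector(table(int)),
  intEV = as.vector(tapply(ev, int, mean)),
  intRV = as.vector(tapply(rv, int, mean))
)

# Regress FOP on EV, weighting each point by the observations it represents
fit <- loess(intRV ~ intEV, data = fop, weights = fop$n, span = 0.5)
fop$loess <- predict(fit)
head(fop)
```

Because of the weights, an interval with few observations pulls the fitted line less than an interval with many.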

For continuous variables, the returned value of 'EVoptimum' is based on the loess-smoothed FOP values, such that a point maximum in FOP may not always be considered the optimal value of EV.
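The difference between a point maximum and a smoothed maximum can be explored with a small hand-made FOP table. All values below are invented for illustration: the third interval has a high FOP value but only two observations, so it receives little weight in the regression.

```r
# Invented FOP table: interval 3 is an isolated spike backed by very few observations
fop <- data.frame(
  intEV = 1:12,
  intRV = c(0.05, 0.10, 0.90, 0.15, 0.25, 0.35, 0.45, 0.50, 0.45, 0.35, 0.20, 0.10),
  n     = c(40, 45, 2, 50, 55, 60, 60, 58, 50, 45, 40, 30)
)

# Weighted local regression of FOP on EV (assumed form of the smoother)
fit <- loess(intRV ~ intEV, data = fop, weights = fop$n, span = 0.5)
fop$loess <- predict(fit)

raw_max    <- fop$intEV[which.max(fop$intRV)]   # EV at the highest raw FOP value
smooth_max <- fop$intEV[which.max(fop$loess)]   # EV at the highest smoothed FOP value
c(raw = raw_max, smoothed = smooth_max)
```

Comparing the two values shows whether the smoother has discounted the sparsely supported spike in favor of the broader peak.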

If the response variable in data represents presence/absence data, the result is an empirical frequency of presence curve, rather than an observed frequency of presence curve (see Støa et al. [2018], Sommerfeltia).

References

Støa, B., R. Halvorsen, S. Mazzoni, and V. I. Gusarov. (2018). Sampling bias in presence-only data used for species distribution modelling: theory and methods for detecting sample bias and its effects on models. Sommerfeltia 38:1–53.

Examples

library(MIAmaxent)  # assumed source of plotFOP and the example data sets

FOPev11 <- plotFOP(toydata_sp1po, 2)
FOPev12 <- plotFOP(toydata_sp1po, "EV12", intervals = 8)
FOPev12$EVoptimum
FOPev12$FOPdata

if (FALSE) {
# From vignette:
teraspifFOP <- plotFOP(grasslandPO, "teraspif")
terslpdgFOP <- plotFOP(grasslandPO, "terslpdg")
terslpdgFOP <- plotFOP(grasslandPO, "terslpdg", span = 0.75, intervals = 20)
terslpdgFOP
geobergFOP <- plotFOP(grasslandPO, 10)
geobergFOP
}
