Perform automated likelihood, expansion, and series selection for a classic distance sampling analysis. Estimate abundance using the best fitting likelihood, expansion, and series.
autoDistSamp(
  data,
  formula,
  likelihoods = c("halfnorm", "hazrate", "negexp"),
  w.lo = units::set_units(0, "m"),
  w.hi = NULL,
  expansions = 0:3,
  series = c("cosine"),
  x.scl = w.lo,
  g.x.scl = 1,
  warn = TRUE,
  outputUnits = NULL,
  area = NULL,
  propUnitSurveyed = 1,
  ci = 0.95,
  R = 500,
  plot.bs = FALSE,
  showProgress = TRUE,
  plot = TRUE,
  criterion = "AICc"
)
An Rdistance 'abundance estimate' object, which is a list of
class c("abund", "dfunc"), containing all the components of a "dfunc"
object (see dfuncEstim), plus the following:
A tibble containing fitted coefficients in the distance function, density in the area(s) surveyed, abundance on the study area, the number of groups seen between w.lo and w.hi, the number of individuals seen between w.lo and w.hi, study area size, surveyed area, average group size, and average effective detection distance.
If confidence intervals were requested, a tibble
containing all bootstrap values of coefficients,
density, abundance, groups seen, individuals seen,
study area size, surveyed area size, average group size,
and average effective detection distance. The number of rows is always
R, the requested number of bootstrap iterations. If an iteration fails, the
corresponding row in B is NA (hence, use na.rm = TRUE
when computing summaries). Columns 1 through length(coef(dfunc))
contain bootstrap realizations of the distance function's coefficients.
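As a minimal sketch of why na.rm = TRUE matters when summarizing bootstrap values, the base R snippet below uses a made-up table in which one iteration failed. The object B_demo and its column names are hypothetical stand-ins, not the actual columns of B.

```r
# Hypothetical bootstrap table: 5 iterations, iteration 3 failed (all NA)
B_demo <- data.frame(
  density   = c(0.81, 0.77, NA, 0.92, 0.85),
  abundance = c(810, 770, NA, 920, 850)
)

# Without na.rm = TRUE the failed iteration propagates NA to the summary
mean_with_na <- mean(B_demo$density)

# With na.rm = TRUE the failed iteration is dropped before averaging
mean_density <- mean(B_demo$density, na.rm = TRUE)

# Count failed iterations (rows containing any NA)
n_failed <- sum(!complete.cases(B_demo))
```

Reporting n_failed alongside any summary makes it clear how many of the R iterations actually contributed.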
Confidence level of the confidence intervals
An RdistDf data frame. RdistDf data frames
contain one line per transect and a list-based column. The list-based
column contains a data frame with detection information.
The detection information data frame on each row contains (at least) distances
and group sizes of all targets detected on the transect.
Function RdistDf creates RdistDf data frames
from separate transect and detection data frames.
is.RdistDf checks whether data frames are RdistDf's.
A standard formula object (for example, dist ~ 1 or
dist ~ covar1 + covar2). The left-hand side (before ~)
is the name of the vector containing off-transect or radial detection distances.
The right-hand side contains the names of covariate
vectors to fit in the detection
function, and potentially group sizes.
Covariates can be either detection level
or transect level and can appear in data or exist in the
global working environment. Regular R scoping
rules apply.
String vector specifying the likelihoods to fit.
See the 'likelihood' parameter of dfuncEstim.
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
w.lo is greater than 0, it must be assigned measurement units
using units(w.lo) <- "<units>" or
w.lo <- units::set_units(w.lo, "<units>").
See examples in the help for set_units.
Upper or right-truncation limit of the distances
in dist. This is the maximum off-transect distance that
could be observed. If unspecified (i.e., NULL),
right-truncation is set to the maximum of the observed
distances. If w.hi is specified, it must have associated
measurement units. Assign measurement units
using units(w.hi) <- "<units>" or
w.hi <- units::set_units(w.hi, "<units>").
See examples in the help for set_units.
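The two ways of assigning measurement units described above can be sketched as follows. The 150 m right-truncation distance is an arbitrary illustrative value, not a recommendation.

```r
library(units)

# Left-truncation limit of 0 m, matching the default of w.lo
w.lo <- set_units(0, "m")

# Right-truncation limit of 150 m (illustrative), using the replacement form
w.hi <- 150
units(w.hi) <- "m"

# Compatible units convert automatically, e.g., meters to feet
w.hi_ft <- set_units(w.hi, "ft")
```

Either form produces an object of class "units"; which one to use is a matter of style.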
An integer vector specifying the numbers of expansion terms
in series to fit. The default, 0:3, fits models with 0 through 3
terms. Depending on the series, each value could be 0 through 5.
A value of 0 equates to no expansion terms of any type.
No expansion terms are allowed (i.e., expansions is forced to 0) if
covariates are present in the detection function
(i.e., the right-hand side of formula includes
something other than 1).
If expansions > 0, this string
specifies the type of expansion to use. Valid values at
present are 'simple', 'hermite', and 'cosine'.
The x coordinate (a distance) at which the
detection function will be scaled. x.scl can be a distance
or the string "max".
When x.scl is specified (i.e., not 0 or "max"), it must have measurement
units assigned using either library(units); units(x.scl) <- "<units>"
or x.scl <- units::set_units(x.scl, "<units>"). See
units::valid_udunits() for valid symbolic units.
Height of the distance function at coordinate x.
The distance function
will be scaled so that g(x.scl) = g.x.scl.
If g.x.scl is not
a data frame, it must be a numeric value (vector of length 1)
between 0 and 1.
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, warn should generally be left at
its default value of TRUE. When computing bootstrap
confidence intervals, setting warn = FALSE
turns off annoying warnings when an iteration does
not converge. Regardless of warn, after
completion all messages about
convergence and boundary conditions are printed
by print.dfunc, print.abund, and plot.dfunc.
A string specifying the symbolic measurement
units for results. Valid units are listed in units::valid_udunits().
The strings for common distance symbolic units are:
"m" - meters, "ft" - feet, "cm" - centimeters, "mm" -
millimeters, "mi" - miles, "nmile" -
nautical miles ("nm" is nanometers), "in" - inches,
"yd" - yards, "km" - kilometers, "fathom" - fathoms,
"chains" - chains, and "furlong" - furlongs.
If outputUnits is unspecified (NULL),
output units will be the same as those on
distances in data.
A scalar containing the total area of inference. Usually, this is
study area size. If area is NULL (the default),
area will be set to 1 square unit of the output units and density estimates
will be produced.
If area is not NULL, it must have measurement units
assigned by the units package.
The units on area must be convertible
to squared output units. Units
on area must be two-dimensional.
For example, if output units are "foo",
units on area must be convertible to "foo^2" by the units
package. Units of "km^2", "cm^2", "ha", "m^2", "acre", "mi^2", and several
others are acceptable.
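The convertibility requirement can be checked directly with the units package before calling the function. This is a small sketch; the 25 ha study area is an invented value, and the output units are assumed to be meters.

```r
library(units)

# Illustrative study area of 25 hectares
area <- set_units(25, "ha")

# Two-dimensional units convert to squared output units (here, m^2)
area_m2 <- set_units(area, "m^2")

# A one-dimensional unit cannot be converted to an area; the attempt errors
bad <- try(set_units(set_units(1, "km"), "m^2"), silent = TRUE)
is_error <- inherits(bad, "try-error")
```

If the conversion to squared output units fails here, the same units on area will not be usable for abundance estimation.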
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
propUnitSurveyed = 1. If only a single side of line transects was observed, set
propUnitSurveyed = 0.5. For point transects, this should be set to
the proportion of each circle that was observed. Length must either be
1 or the total number of transects in data.
A scalar indicating the confidence level of confidence intervals.
Confidence intervals are computed using a bias corrected bootstrap
method. If ci is NULL or NA, confidence intervals
are not computed.
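For readers unfamiliar with the bias corrected bootstrap, the sketch below shows one common textbook formulation of a bias-corrected percentile interval in base R. The function bc_interval and the simulated values are hypothetical; whether Rdistance computes its intervals in exactly this way is an assumption, so treat this as an illustration of the idea rather than the package's implementation.

```r
# Bias-corrected percentile interval (textbook form; hypothetical helper)
bc_interval <- function(boot, estimate, ci = 0.95) {
  boot <- boot[!is.na(boot)]            # drop failed iterations
  z0 <- qnorm(mean(boot < estimate))    # bias-correction constant
  z  <- qnorm(1 - (1 - ci) / 2)         # normal quantile for the level
  probs <- pnorm(2 * z0 + c(-z, z))     # adjusted percentile levels
  quantile(boot, probs = probs, names = FALSE)
}

# Simulated bootstrap values with one failed iteration (NA)
set.seed(42)
boot_demo <- c(rnorm(499, mean = 100, sd = 10), NA)
bc_demo <- bc_interval(boot_demo, estimate = 100)
```

When the bootstrap distribution is centered on the original estimate, z0 is near 0 and the interval reduces to the ordinary percentile interval.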
The number of bootstrap iterations to conduct when ci is not NULL.
A logical scalar indicating whether to plot individual bootstrap iterations.
A logical indicating whether to show a text-based
progress bar during bootstrapping. Default is TRUE.
It is handy to shut off the
progress bar if running this within another function. Otherwise,
it is handy to see progress of the bootstrap iterations.
Logical scalar specifying whether to plot models during model selection.
If TRUE, a histogram with fitted distance function is plotted for every model.
The function pauses between each plot and prompts the user for whether they want to continue.
To suppress user prompts, set plot = FALSE.
A string specifying the criterion to use when assessing model fit.
The best fitting model, as defined by this routine, has the lowest value
of this criterion. This must be one of "AICc" (the default),
"AIC", or "BIC". See AIC.dfunc for formulas.
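As a rough guide to how the three criteria differ, the sketch below computes them from a log-likelihood using the standard textbook definitions. The helper criterion_value and the input values are hypothetical, and it is an assumption that AIC.dfunc uses these same forms; consult its help page for the authoritative formulas.

```r
# Textbook information criteria (hypothetical helper, not part of Rdistance)
#   logLik: maximized log-likelihood
#   p:      number of estimated parameters
#   n:      number of observations
criterion_value <- function(logLik, p, n, criterion = c("AICc", "AIC", "BIC")) {
  criterion <- match.arg(criterion)
  aic <- -2 * logLik + 2 * p
  switch(criterion,
    AIC  = aic,
    AICc = aic + (2 * p * (p + 1)) / (n - p - 1),  # small-sample correction
    BIC  = -2 * logLik + p * log(n)
  )
}

# Made-up example: log-likelihood of -100, 2 parameters, 50 detections
aic_demo  <- criterion_value(-100, p = 2, n = 50, criterion = "AIC")
aicc_demo <- criterion_value(-100, p = 2, n = 50, criterion = "AICc")
bic_demo  <- criterion_value(-100, p = 2, n = 50, criterion = "BIC")
```

In all three cases a lower value indicates a better trade-off between fit and complexity, which is why the routine selects the model with the lowest value.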
During distance function selection, all combinations of likelihoods, series, and
numbers of expansions are fitted. For example, if likelihoods has 3 elements,
series has 2 elements, and expansions has 4 elements,
this routine fits a total of 3 (likelihoods) * 2 (series) * 4 (expansions)
= 24 models. Default parameters fit 3 (likelihoods) * 1 (series) * 4 (expansions) = 12 detection functions, i.e.,
all combinations of "halfnorm", "hazrate", and "negexp" likelihoods
and 0 through 3 expansions. Other combinations are specified through
values of likelihoods, series, and expansions.
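The 24-model example above can be enumerated with base R's expand.grid. This only illustrates the counting; it does not reproduce how autoDistSamp loops over models internally, and the object names are chosen for illustration.

```r
# Candidate settings from the worked example in the text
likelihoods <- c("halfnorm", "hazrate", "negexp")
series      <- c("cosine", "hermite")
expansions  <- 0:3

# One row per likelihood-series-expansions combination
candidates <- expand.grid(
  likelihood = likelihoods,
  series     = series,
  expansions = expansions,
  stringsAsFactors = FALSE
)

n_models <- nrow(candidates)  # 3 * 2 * 4 = 24
```

Printing candidates before a long run is a quick way to confirm that the requested search space is the intended one.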
Suppress all intermediate output using plot.bs=FALSE,
showProgress=FALSE, and plot=FALSE.
The returned abundance estimate object contains
an additional component, the fitting table (a list of models fitted and
criterion values) in component $fitTable.
dfuncEstim, abundEstim.
# Load example sparrow data (line transect survey type)
data(sparrowDf)
autoDistSamp(data = sparrowDf
, formula = dist ~ groupsize(groupsize)
, likelihoods = c("halfnorm","negexp")
, expansions = 0
, plot = FALSE
, ci = NULL
, area = units::set_units(1, "hectare")
)
if (FALSE) {
autoDistSamp(data = sparrowDf
, formula = dist ~ 1 + groupsize(groupsize)
, ci = 0.95
, area = units::set_units(1, "hectare")
)
}