Estimate abundance (or density) given an estimated detection function and supplemental information on observed group sizes, transect lengths, area surveyed, etc. Also computes confidence intervals on abundance (or density) using the bias corrected bootstrap method.
abundEstim(
dfunc,
detectionData,
siteData,
area = NULL,
singleSided = FALSE,
ci = 0.95,
R = 500,
lengthColumn = "length",
plot.bs = FALSE,
showProgress = TRUE,
control = RdistanceControls()
)
An 'abundance estimate' object, which is a list of
class c("abund", "dfunc")
, containing all the components of a "dfunc"
object (see dfuncEstim
), plus the following:
Estimated density on the sampled area with units. The effectively
sampled area is 2*L*ESW (not 2*L*w.hi). Density is expressed per squared
output unit (e.g., 1/m^2 when output units are "m"). Convert density to other units with
units::set_units(x$density, "<units>").
Estimated abundance on the study area (if area > 1) or estimated density on the study area (if area = 1), without units.
The number of detections (not individuals, unless all group sizes = 1) on non-NA length transects used to compute density and abundance.
The total number of individuals seen on transects with non-NA length. Sum of group sizes used to estimate density and abundance.
Total area of inference in squared output units.
The total length of sampled transect with units. This is the sum
of the lengthColumn
column of siteData
.
Average group size on transects with non-NA length.
Minimum and maximum group sizes observed on non-NA length transects.
A vector containing effective sampling distances. If covariates
are not included, length of this vector is 1 because effective sampling distance
is constant over detections. If covariates are included, this vector has length
equal to the number of detections (i.e., x$n
). This vector was produced
by a call to effectiveDistance()
with newdata
set to NULL.
A vector containing the lower and upper limits of the bias corrected bootstrap confidence interval for abundance.
A vector containing the lower and upper limits of the bias corrected bootstrap confidence interval for density, with units.
A vector containing the lower and upper limits of the bias corrected bootstrap confidence interval for average effective sampling distance.
A data frame containing bootstrap values of coefficients,
density, and effective distances. Number of rows is always
R
, the requested number of bootstrap
iterations. If a particular iteration did not converge, the
corresponding row in B
is NA
(hence, use 'na.rm = TRUE'
when computing summaries). Columns 1 through length(coef(dfunc))
contain bootstrap realizations of the distance function's coefficients.
The second to last column contains bootstrap values of
density (with units). The last column of B contains bootstrap
values of effective sampling distance or radius (with units). If the
distance function contains covariates,
the effective sampling distance column is the average
effective distance over detections
used during the associated bootstrap iteration.
The number of bootstrap iterations that converged.
The (scalar) confidence level of the
confidence interval for n.hat
.
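As a rough illustration of the advice to use na.rm = TRUE, the sketch below summarizes a mock bootstrap data frame. The data frame, its column names, and its values are invented for illustration; only the layout (coefficients first, density second to last, effective distance last, all-NA rows for non-convergent iterations) follows the description above.

```r
# Mock stand-in for the bootstrap data frame B. Column names and values are
# hypothetical; real columns for density and effective distance carry units.
B <- data.frame(
  sigma   = c(46.2, 51.0, NA, 48.7, 44.9),  # bootstrap coefficient values
  density = c(0.80, 1.10, NA, 0.95, 0.75),  # bootstrap density values
  effDist = c(58.1, 63.9, NA, 61.0, 56.4)   # bootstrap effective distances
)

# Density is the second-to-last column; effective distance is the last.
densityCol <- B[[ncol(B) - 1]]
effDistCol <- B[[ncol(B)]]

# Row 3 mimics a non-convergent iteration, so summaries need na.rm = TRUE.
meanDensity <- mean(densityCol, na.rm = TRUE)
sdDensity   <- sd(densityCol, na.rm = TRUE)
meanEffDist <- mean(effDistCol, na.rm = TRUE)
```

Without na.rm = TRUE, a single non-convergent iteration turns every summary into NA.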
An estimated 'dfunc' object produced by dfuncEstim
.
A data frame containing detection distances (either perpendicular for line-transect or radial for point-transect designs), with one row per detected object or group. This data frame must contain at least the following information:
Detection Distances: A single column containing
detection distances must be specified on the left-hand
side of formula
. As of Rdistance version 3.0.0,
the detection distances must have measurement units attached.
Attach measurement units to distances using library(units);units()<-
.
For example, library(units)
followed by units(df$dist) <- "m"
or
units(df$dist) <- "ft"
will work. Alternatively,
df$dist <- units::set_units(df$dist, "m")
also works.
Site IDs: The ID of the transect or point
(i.e., the 'site') where each object or group was detected.
The site ID column(s) (see arguments transectID
and
pointID
) must
specify the site (transect or point) so that this
data frame can be merged with siteData
.
In a later release, Rdistance
will allow detection-level
covariates. When that happens, detection-level
covariates will appear in this data frame.
See example data set sparrowDetectionData
.
See also Input data frames below
for information on when detectionData
and
siteData
are required inputs.
A data.frame containing site (transect or point)
IDs and any
site level covariates to include in the detection function.
Every unique surveyed site (transect or point) is represented on
one row of this data set, whether or not targets were sighted
at the site. See arguments transectID
and
pointID
for an explanation of the way in which distance and site
data frames are merged. See
section Relationship between data frames (transect and point ID's)
for additional details.
See Data frame requirements for situations in which
detectionData
only, detectionData
and siteData
, or
neither are required.
A scalar containing the total area of
inference. Commonly, this is study area size.
If area
is NULL (the default),
area
will be set to 1 square unit of the output units, which
makes abundance estimates equal to density estimates.
If area
is not NULL, it must have measurement units
assigned by the units
package.
The units on area
must be convertible
to squared output units. Units
on area
must be two-dimensional.
For example, if output units are "foo",
units on area must be convertible to "foo^2" by the units
package.
Units of "km^2", "cm^2", "ha", "m^2", "acre", "mi^2", and many
others are acceptable.
Logical scalar. If only one side of the transect was
observed, set singleSided
= TRUE. If both sides of line-transects were
observed, singleSided
= FALSE. Some surveys
observe only one side of transect lines for a variety of logistical reasons.
For example, some aerial line-transect surveys place observers on only one
side of the aircraft. This parameter affects only line-transects. When
singleSided
= TRUE, surveyed area is halved and the density
estimator's denominator (see Details)
is \((ESW)(L)\), not \(2(ESW)(L)\).
A scalar indicating the confidence level of confidence intervals.
Confidence intervals are computed using a bias corrected bootstrap
method. If ci = NULL
, confidence intervals are not computed.
The number of bootstrap iterations to conduct when ci
is not
NULL.
Character string specifying the (single) column in
siteData
that contains transect lengths. This is ignored if
pointSurvey
= TRUE. This column must have measurement units.
A logical scalar indicating whether to plot individual bootstrap iterations.
A logical indicating whether to show a text-based
progress bar during bootstrapping. Default is TRUE
.
It is handy to shut off the
progress bar if running this within another function. Otherwise,
it is handy to see progress of the bootstrap iterations.
A list containing optimization control parameters such
as the maximum number of iterations, tolerance, the optimizer to use,
etc. See the
RdistanceControls
function for explanation of each value,
the defaults, and the requirements for this list.
See examples below for how to change controls.
The bootstrap confidence interval for abundance
assumes that the fundamental units of
replication (lines or points, hereafter "sites") are independent.
The bias corrected bootstrap
method used here resamples the units of replication (sites),
refits the distance function, and estimates abundance using
the resampled counts and re-estimated distance function.
The original data frames, detectionData
and siteData
,
are needed here for bootstrapping because they contain the transect
and detection information.
If a double-observer data
frame is included in dfunc
, rows of the double-observer data frame
are re-sampled each bootstrap iteration.
This routine does not re-select the distance model fitted to resampled data. The model in the input object is re-fitted every iteration.
By default, R
= 500 iterations are performed, after which the bias
corrected confidence intervals are computed (Manly, 1997, section 3.4).
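The bias corrected interval can be sketched in base R as follows. This is a minimal illustration of the standard recipe, not the package's internal code: the helper name bcInterval and the toy bootstrap values are assumptions made for the example.

```r
# Sketch of a bias corrected bootstrap interval.
# `bcInterval` is a hypothetical helper, not an Rdistance function.
bcInterval <- function(estimate, bootValues, ci = 0.95) {
  # Drop values from non-convergent iterations
  bootValues <- bootValues[!is.na(bootValues)]

  # Bias correction: z0 is 0 when half the bootstrap values exceed the estimate
  p  <- mean(bootValues > estimate)
  z0 <- qnorm(1 - p)

  # Normal quantile for the requested confidence level
  zAlpha <- qnorm(1 - (1 - ci) / 2)

  # Shifted percentile levels, then the matching bootstrap quantiles
  probs <- pnorm(c(2 * z0 - zAlpha, 2 * z0 + zAlpha))
  unname(quantile(bootValues, probs = probs))
}

# Estimate at the bootstrap median: reduces to the plain percentile interval
symmetricCI <- bcInterval(50.5, c(1:100, NA))

# Estimate below the bootstrap median: both limits shift downward
skewedCI <- bcInterval(30, c(1:100, NA))
```

Note that if every bootstrap value falls on one side of the estimate, z0 is infinite and the interval collapses, so a real implementation would need to guard that case.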
During bootstrap iterations, the distance function can fail
to converge on the resampled data. An iteration can fail
to converge for two reasons:
(1) no detections on the iteration, and (2) bad configuration
of distances on the iteration which pushes parameters to their
bounds. When an iteration fails to produce a valid
distance function, Rdistance
simply skips the iteration, effectively ignoring these
non-convergent iterations.
If the proportion of non-convergent iterations is small
(less than 20%), the resulting confidence interval
on abundance is
probably valid. If the proportion of non-convergent iterations
is not small (exceeds 20%), a warning is issued.
The print method (print.abund
) is the routine that issues this
warning. The warning can be
turned off by setting maxBSFailPropForWarning
in the
print method to 1.0, or by modifying the code in RdistanceControls()
to re-set the default threshold and storing the modified
function in your .GlobalEnv
. Additional iterations may be needed
to achieve an adequate number. Check number of convergent iterations by
counting non-NA rows in output data frame 'B'.
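A minimal sketch of that check, using a mock data frame in place of the real 'B' (the column names and values are invented, and the 0.2 threshold simply mirrors the 20% figure mentioned above):

```r
# Mock bootstrap results for R = 10 iterations. Rows that are entirely NA
# stand in for non-convergent iterations.
R <- 10
coefValues <- c(45, NA, 50, 47, NA, 52, 49, NA, 46, 48)
B <- data.frame(
  coef    = coefValues,
  density = coefValues / 50   # NA propagates to the density column
)

# Count rows with no missing values, i.e., iterations that converged
converged  <- complete.cases(B)
nConverged <- sum(converged)

# Proportion of iterations that failed, compared against the threshold
failProp    <- 1 - nConverged / R
tooManyFail <- failProp > 0.2
```

Here 3 of 10 iterations failed, so the failure proportion exceeds the threshold and more iterations (a larger R) would be advisable.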
Line transects: The transect length column of siteData
can contain missing values.
NA length transects are equivalent
to 0 [m] transects and do not count toward total surveyed units. NA length
transects are handy if some off-transect distance observations should be included
when estimating the distance function, but not when estimating abundance.
To do this, include the "extra" distance observations in the detection data frame, with valid
site IDs, but set the length of those site IDs to NA in the site data frame.
Group sizes associated with NA length transects are dropped and not counted toward density
or abundance. Among other things, this allows estimation of abundance on one
study area using off-transect distance observations from another.
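The bookkeeping for NA length transects can be illustrated with two small mock data frames. The site ID column name, the site labels, and all values are hypothetical, and plain numbers are used for brevity even though real distances and lengths must carry measurement units.

```r
# Three surveyed transects plus one "extra" site (X1) whose length is NA
siteData <- data.frame(
  siteID = c("T1", "T2", "T3", "X1"),
  length = c(500, 500, 500, NA)
)

# Detections, including two off-transect observations assigned to X1
detectionData <- data.frame(
  siteID    = c("T1", "T1", "T2", "X1", "X1"),
  dist      = c(12, 40, 5, 22, 61),
  groupsize = c(1, 3, 2, 4, 1)
)

merged     <- merge(detectionData, siteData, by = "siteID")
onSurveyed <- !is.na(merged$length)

# All five distances are available for fitting the distance function
nDistances <- nrow(merged)

# Only group sizes on non-NA length transects count toward abundance
nSeen <- sum(merged$groupsize[onSurveyed])

# NA length transects contribute nothing to total surveyed length
totalLength <- sum(siteData$length, na.rm = TRUE)
```

In this mock data set, five distances inform the detection function, but only six individuals on 1500 length units of transect enter the density calculation.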
Point transects: Point transects do not have length. The "length" of point transects
is the number of points on the transect. Rdistance
treats individual points as independent
and bootstrap resamples them to estimate variance. To include distance observations
from some points but not the number of targets seen, include a separate "length" column
in the site data frame with NA for the "extra" points. Like NA length line transects,
NA "length" point transects are dropped from the count of points and group sizes on these
transects are dropped from the counts of targets. This allows users to estimate their distance
function on one set of observations while inflating counts from another set of observations.
A transect "length" column is not required for point transects. Values in the lengthColumn
do not matter except for NA (e.g., a column of 1's mixed with NA's is acceptable).
The abundance estimate for line-transect surveys (if no covariates
are included in the detection function and both sides of the transect
were observed) is
$$N =\frac{n(A)}{2(ESW)(L)}$$
where n is total number of sighted individuals
(i.e., sum(dfunc$detections$groupSizes)
), L is the total length of
surveyed transect (i.e., sum(siteData[,lengthColumn])
),
and ESW is effective strip width
computed from the estimated distance function (i.e., ESW(dfunc)
).
If only one side of each transect was observed, the "2" in the denominator
is not present (or, replaced with a "1").
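A worked instance of the line-transect formula, with made-up inputs expressed in consistent units (km and km^2). The helper lineAbundance is written only for this illustration and is not part of Rdistance.

```r
# Hypothetical inputs in consistent units
n   <- 120    # total individuals seen
A   <- 1000   # study area, km^2
ESW <- 0.05   # effective strip width, km
L   <- 40     # total transect length, km

# N = n * A / (sides * ESW * L), with sides = 2 unless single-sided
lineAbundance <- function(n, A, ESW, L, singleSided = FALSE) {
  sides <- if (singleSided) 1 else 2
  n * A / (sides * ESW * L)
}

# Both sides observed: surveyed area is 2 * 0.05 * 40 = 4 km^2
nBothSides <- lineAbundance(n, A, ESW, L)

# One side observed: surveyed area halves, so the estimate doubles
nOneSide <- lineAbundance(n, A, ESW, L, singleSided = TRUE)
```

With these numbers the two-sided estimate is 30000 individuals and the single-sided estimate is 60000.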
The abundance estimate for point transect surveys (if no covariates are
included) is
$$N =\frac{n(A)}{\pi(ESR^2)(P)}$$
where n is total number of sighted individuals,
P is the total number of surveyed points,
and ESR is effective search radius
computed from the estimated distance function (i.e., ESR(dfunc)
).
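The point-transect formula can be checked the same way. The inputs below are invented, use consistent units (km and km^2), and the helper pointAbundance exists only for this illustration.

```r
# N = n * A / (pi * ESR^2 * P)
pointAbundance <- function(n, A, ESR, P) {
  n * A / (pi * ESR^2 * P)
}

# Hypothetical survey: 50 individuals seen at 25 points,
# effective search radius 0.1 km, study area 100 km^2
nPoints <- pointAbundance(n = 50, A = 100, ESR = 0.1, P = 25)

# Doubling the effective radius quadruples the surveyed area,
# so the abundance estimate drops to one quarter
nWideRadius <- pointAbundance(n = 50, A = 100, ESR = 0.2, P = 25)
```

Because ESR enters the denominator squared, point-transect estimates are more sensitive to the estimated detection function than line-transect estimates are to ESW.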
Setting plot.bs=FALSE
and showProgress=FALSE
suppresses all intermediate output.
Manly, B.F.J. (1997) Randomization, bootstrap, and Monte-Carlo methods in biology, London: Chapman and Hall.
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
dfuncEstim, autoDistSamp.
# Load example sparrow data (line transect survey type)
data(sparrowDetectionData)
data(sparrowSiteData)
# Fit half-normal detection function
dfunc <- dfuncEstim(formula=dist ~ groupsize(groupsize)
, detectionData=sparrowDetectionData
, likelihood="halfnorm"
, w.hi=units::set_units(100, "m")
)
# Estimate abundance given a detection function
# No variance on density or abundance estimated here
# due to time constraints. Set ci=0.95 (or another value)
# to estimate bootstrap variances on ESW, density, and abundance.
fit <- abundEstim(dfunc
, detectionData = sparrowDetectionData
, siteData = sparrowSiteData
, area = units::set_units(4105, "km^2")
, ci = NULL
)