unpmle: Unconditional NPML estimator for the SPECIES number

Description

This function calculates the unconditional NPML estimator of the species number proposed by Norris and Pollock (1996, 1998). The estimator is obtained from the full likelihood based on a Poisson mixture model. The confidence interval is calculated by a bootstrap procedure.

Usage

unpmle(n,t=15,C=0,method="W-L",b=200,conf=.95,seed=NULL,dis=1)

Value

The function unpmle returns a list with the following components (CI is included only if “C=1”):

Nhat: point estimate of N
CI: bootstrap confidence interval.

Arguments

n: a matrix or a numerical data frame of two columns. It is also called the “frequency of frequencies” data in the literature. The first column is the frequency \(j = 1, 2, \ldots\); the second column is \(n_j\), the number of species observed with \(j\) individuals in the sample.
t: a positive integer. t specifies the cutoff value that defines the relatively less abundant species to be used in estimation. The default value is t=15. The estimator is fairly insensitive to the choice of t. The recommendation is to use \(t \ge 10\).
C: integer, either 0 or 1. It specifies whether a bootstrap confidence interval should be calculated: “C=1” for YES and “C=0” for NO. The default is C=0.
method: string, either “N-P” or “W-L” (default). If method=“N-P”, the unconditional NPMLE is computed using an algorithm by Böhning and Schön (2005). This method can sometimes be extremely slow. Alternatively, one can use method “W-L”, an approximate method by Wang and Lindsay (2005) that is much faster and has high precision.
b: integer. b specifies the number of bootstrap samples for confidence interval. It is ignored if “C=0”.
conf: a positive number \(\le 1\). conf specifies the confidence level for confidence interval. The default is 0.95.
seed: a single value, interpreted as an integer. Seed for random number generation.
dis: 0 or 1. 1 for on-screen display of the mixture output, and 0 for none.
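
As an illustration of the two-column “frequency of frequencies” layout expected for n, the following minimal sketch builds such a data frame from raw per-species abundances using base R only. The abundance vector, the column names j and n_j, and the variable cutoff are hypothetical and chosen for illustration; the split at the cutoff only shows which rows fall on either side of t, not how unpmle treats them internally.

```r
# Hypothetical raw data: number of individuals observed for each of 10 species.
abundance <- c(1, 1, 1, 2, 2, 3, 5, 5, 8, 20)

# Tabulate how many species were observed exactly j times.
counts <- table(abundance)
n <- data.frame(
  j   = as.integer(names(counts)),  # frequency j
  n_j = as.integer(counts)          # number of species seen j times
)

# Rows with j <= cutoff are the "relatively less abundant" species
# in the sense of the t argument (cutoff plays the role of t here).
cutoff <- 15
less_abundant <- n[n$j <= cutoff, ]
abundant      <- n[n$j > cutoff, ]

print(n)
```

A data frame shaped like n above has the same layout as the butterfly data set used in the Examples section.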

Author

Ji-Ping Wang, Department of Statistics, Northwestern University

Details

Computation is intensive if method=“N-P” is used, particularly when the extrapolation is large; it may take hours to compute the bootstrap confidence interval. If method=“W-L” is used, computation is usually much faster. Estimates from the two methods are often identical.
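
A minimal sketch of a reproducible call with the faster “W-L” method is given below. It assumes the SPECIES package is installed and that the returned list exposes the components Nhat and CI as described under Value; the seed value 1 is arbitrary, and the guard makes the snippet a no-op when the package is unavailable.

```r
# Guarded so the snippet does nothing when SPECIES is not installed.
fit <- NULL
if (requireNamespace("SPECIES", quietly = TRUE)) {
  data(butterfly, package = "SPECIES")
  # W-L method, bootstrap CI requested (C = 1), fixed seed for reproducibility,
  # on-screen display of the mixture output suppressed (dis = 0).
  fit <- SPECIES::unpmle(butterfly, t = 15, C = 1, method = "W-L",
                         b = 200, conf = 0.95, seed = 1, dis = 0)
  print(fit$Nhat)  # point estimate of N
  print(fit$CI)    # bootstrap confidence interval
}
```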

References

Norris, J. L. I. and Pollock, K. H. (1996), Nonparametric MLE Under Two Closed Capture-Recapture Models With Heterogeneity, Biometrics, 52, 639-649.

Norris, J. L. I. and Pollock, K. H. (1998), Non-Parametric MLE for Poisson Species Abundance Models Allowing for Heterogeneity Between Species, Environmental and Ecological Statistics, 5, 391-402.

Böhning, D. and Schön, D. (2005), Nonparametric maximum likelihood estimation of population size based on the counting distribution, Journal of the Royal Statistical Society, Series C: Applied Statistics, 54, 721-737.

Wang, J.-P. Z. and Lindsay, B. G. (2005), A penalized nonparametric maximum likelihood approach to species richness estimation, Journal of the American Statistical Association, 100(471), 942-959.

Examples

library(SPECIES)

## load data from the package;
## "butterfly" is the famous butterfly data of Fisher (1943).

data(butterfly)


##output estimate without confidence interval using cutoff t=15
#unpmle(butterfly,t=15,C=0)

##output estimate with confidence interval using cutoff t=15
#unpmle(butterfly,t=15,C=1,b=200)
