EnvStats (version 2.1.0)

ezmnorm: Estimate Parameters of a Zero-Modified Normal Distribution

Description

Estimate the mean and standard deviation of a zero-modified normal distribution, and optionally construct a confidence interval for the mean.

Usage

ezmnorm(x, method = "mvue", ci = FALSE, ci.type = "two-sided", 
    ci.method = "normal.approx", conf.level = 0.95)

Arguments

x
numeric vector of observations.
method
character string specifying the method of estimation. Currently, the only possible value is "mvue" (minimum variance unbiased; the default). See the DETAILS section for more information.
ci
logical scalar indicating whether to compute a confidence interval for the mean. The default value is FALSE.
ci.type
character string indicating what kind of confidence interval to compute. The possible values are "two-sided" (the default), "lower", and "upper". This argument is ignored if ci=FALSE.
ci.method
character string indicating what method to use to construct the confidence interval for the mean. Currently the only possible value is "normal.approx" (the default). See the DETAILS section for more information.
conf.level
a scalar between 0 and 1 indicating the confidence level of the confidence interval. The default value is conf.level=0.95. This argument is ignored if ci=FALSE.

Value

A list of class "estimate" containing the estimated parameters and other information. See estimate.object for details. The component called parameters is a numeric vector with the following estimated parameters:

| Parameter Name | Explanation |
|----------------|-------------|
| mean | mean of the normal (Gaussian) part of the distribution. |
| sd | standard deviation of the normal (Gaussian) part of the distribution. |
| p.zero | probability that an observation will be 0. |
| mean.zmnorm | mean of the overall zero-modified normal distribution. |
| sd.zmnorm | standard deviation of the overall zero-modified normal distribution. |
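As a minimal sketch of how the parameters component might be read, the snippet below fits the simulated data used in the Examples section and extracts entries by name. It assumes the EnvStats package is installed; the object names dat and est are illustrative only.

```r
library(EnvStats)

# Simulated data, same call as in the Examples section
set.seed(250)
dat <- rzmnorm(100, mean = 4, sd = 2, p.zero = 0.5)

# 'est' is an illustrative name for the returned list of class "estimate"
est <- ezmnorm(dat, ci = TRUE)

# 'parameters' is the named numeric vector described above
est$parameters

# Individual estimates can be picked out by name
est$parameters["mean.zmnorm"]
est$parameters[c("mean", "sd", "p.zero")]
```

Indexing by name rather than by position keeps the code readable and does not depend on the order of the entries.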

Details

If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.

Let $\underline{x} = (x_1, x_2, \ldots, x_n)$ be a vector of $n$ observations from a zero-modified normal distribution with parameters mean=$\mu$, sd=$\sigma$, and p.zero=$p$. Let $r$ denote the number of observations in $\underline{x}$ that are equal to 0, and order the observations so that $x_1, x_2, \ldots, x_r$ denote the $r$ zero observations, and $x_{r+1}, x_{r+2}, \ldots, x_n$ denote the $n-r$ non-zero observations.

Note that $\mu$ is not the mean of the zero-modified normal distribution; it is the mean of the normal part of the distribution. Similarly, $\sigma$ is not the standard deviation of the zero-modified normal distribution; it is the standard deviation of the normal part of the distribution.

Let $\gamma$ and $\delta$ denote the mean and standard deviation of the overall zero-modified normal distribution. Aitchison (1955) shows that:

$$\gamma = (1 - p) \mu \;\;\;\; (1)$$

$$\delta^2 = (1 - p) \sigma^2 + p (1 - p) \mu^2 \;\;\;\; (2)$$

Estimation

Minimum Variance Unbiased Estimation (method="mvue")

Aitchison (1955) shows that the minimum variance unbiased estimators (mvue's) of $\gamma$ and $\delta^2$ are:

$$\hat{\gamma}_{mvue} = \bar{x} \;\;\;\; (3)$$

$$\hat{\delta}^2_{mvue} = \begin{cases} \frac{n-r-1}{n-1} (s^*)^2 + \frac{r}{n} \left( \frac{n-r}{n-1} \right) (\bar{x}^*)^2 & \text{if } r < n - 1, \\ x_n^2 / n & \text{if } r = n - 1, \\ 0 & \text{if } r = n \end{cases} \;\;\;\; (4)$$

where

$$\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i \;\;\;\; (5)$$

$$\bar{x}^* = \frac{1}{n-r} \sum_{i=r+1}^n x_i \;\;\;\; (6)$$

$$(s^*)^2 = \frac{1}{n-r-1} \sum_{i=r+1}^n (x_i - \bar{x}^*)^2 \;\;\;\; (7)$$

Note that the quantity in equation (5) is the sample mean of all observations (including 0 values), the quantity in equation (6) is the sample mean of all non-zero observations, and the quantity in equation (7) is the sample variance of all non-zero observations. Also note that for $r=n-1$ or $r=n$, the estimator of $\delta^2$ is the sample variance for all observations (including 0 values).

Confidence Intervals

Based on Normal Approximation (ci.method="normal.approx")

An approximate $(1-\alpha)100\%$ confidence interval for $\gamma$ is constructed based on the assumption that the estimator of $\gamma$ is approximately normally distributed. Aitchison (1955) shows that

$$Var(\hat{\gamma}_{mvue}) = Var(\bar{x}) = \frac{\delta^2}{n} \;\;\;\; (8)$$

Thus, an approximate two-sided $(1-\alpha)100\%$ confidence interval for $\gamma$ is constructed as:

$$\left[ \hat{\gamma}_{mvue} - t_{n-2, 1-\alpha/2} \frac{\hat{\delta}_{mvue}}{\sqrt{n}}, \; \hat{\gamma}_{mvue} + t_{n-2, 1-\alpha/2} \frac{\hat{\delta}_{mvue}}{\sqrt{n}} \right] \;\;\;\; (9)$$

where $t_{\nu, p}$ is the $p$'th quantile of Student's t-distribution with $\nu$ degrees of freedom. One-sided confidence intervals are computed in a similar fashion.
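The estimators in equations (3) and (4) and the two-sided interval in equation (9) can be sketched in base R as follows. This is an illustration of the formulas, not the EnvStats source code: the function name zmnorm_mvue_sketch is hypothetical, and the requirement of at least three finite observations is an assumption added so that the $n-2$ degrees of freedom stay positive.

```r
# Hypothetical helper (not part of EnvStats): base-R sketch of equations (3)-(9)
zmnorm_mvue_sketch <- function(x, conf.level = 0.95) {
  x <- x[is.finite(x)]              # drop NA, NaN, Inf and -Inf values
  n <- length(x)
  stopifnot(n >= 3)                 # assumption: keeps n - 2 degrees of freedom positive

  nonzero <- x[x != 0]
  r <- n - length(nonzero)          # number of zero observations

  gamma_hat <- mean(x)              # equation (3), using equation (5)

  # equation (4), one branch per case
  if (r < n - 1) {
    xbar_star <- mean(nonzero)      # equation (6)
    s2_star <- var(nonzero)         # equation (7)
    delta2_hat <- (n - r - 1) / (n - 1) * s2_star +
      (r / n) * ((n - r) / (n - 1)) * xbar_star^2
  } else if (r == n - 1) {
    delta2_hat <- nonzero^2 / n     # single non-zero observation
  } else {
    delta2_hat <- 0                 # all observations are zero
  }
  delta_hat <- sqrt(delta2_hat)

  # equation (9): two-sided interval based on the t-distribution with n - 2 df
  alpha <- 1 - conf.level
  half_width <- qt(1 - alpha / 2, df = n - 2) * delta_hat / sqrt(n)

  c(mean.zmnorm = gamma_hat, sd.zmnorm = delta_hat,
    LCL = gamma_hat - half_width, UCL = gamma_hat + half_width)
}

# Usage on a small made-up vector with three zeros and one missing value
zmnorm_mvue_sketch(c(0, 0, 0, 2, 4, 6, NA))
```

A convenient check on the first branch is that, after expanding the sums, it is algebraically equal to var(x) computed over all finite observations, zeros included.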

References

Aitchison, J. (1955). On the Distribution of a Positive Random Variable Having a Discrete Probability Mass at the Origin. Journal of the American Statistical Association 50, 901–908.

Gilliom, R.J., and D.R. Helsel. (1986). Estimation of Distributional Parameters for Censored Trace Level Water Quality Data: 1. Estimation Techniques. Water Resources Research 22, 135–146.

Owen, W., and T. DeRouen. (1980). Estimation of the Mean for Lognormal Data Containing Zeros and Left-Censored Values, with Applications to the Measurement of Worker Exposure to Air Contaminants. Biometrics 36, 707–719.

USEPA (1992c). Statistical Analysis of Ground-Water Monitoring Data at RCRA Facilities: Addendum to Interim Final Guidance. Office of Solid Waste, Permits and State Programs Division, US Environmental Protection Agency, Washington, D.C.

See Also

ZeroModifiedNormal, Normal, ezmlnorm, ZeroModifiedLognormal, estimate.object.

Examples

# Generate 100 observations from a zero-modified normal distribution 
  # with mean=4, sd=2, and p.zero=0.5, then estimate the parameters.  
  # According to equations (1) and (2) above, the overall mean is 
  # mean.zmnorm=2 and the overall standard deviation is sd.zmnorm=sqrt(6).  
  # (Note: the call to set.seed simply allows you to reproduce this example.)

  set.seed(250) 
  dat <- rzmnorm(100, mean = 4, sd = 2, p.zero = 0.5) 
  ezmnorm(dat, ci = TRUE) 

  #Results of Distribution Parameter Estimation
  #--------------------------------------------
  #
  #Assumed Distribution:            Zero-Modified Normal
  #
  #Estimated Parameter(s):          mean        = 4.037732
  #                                 sd          = 1.917004
  #                                 p.zero      = 0.450000
  #                                 mean.zmnorm = 2.220753
  #                                 sd.zmnorm   = 2.465829
  #
  #Estimation Method:               mvue
  #
  #Data:                            dat
  #
  #Sample Size:                     100
  #
  #Confidence Interval for:         mean.zmnorm
  #
  #Confidence Interval Method:      Normal Approximation
  #                                 (t Distribution)
  #
  #Confidence Interval Type:        two-sided
  #
  #Confidence Level:                95%
  #
  #Confidence Interval:             LCL = 1.731417
  #                                 UCL = 2.710088

  #----------

  # Following Example 9 on page 34 of USEPA (1992c), compute an 
  # estimate of the mean of the zinc data, assuming a 
  # zero-modified normal distribution. The data are stored in 
  # EPA.92c.zinc.df.

  head(EPA.92c.zinc.df) 
  #  Zinc.orig  Zinc Censored Sample Well
  #1        <7  7.00     TRUE      1    1
  #2     11.41 11.41    FALSE      2    1
  #3        <7  7.00     TRUE      3    1
  #4        <7  7.00     TRUE      4    1
  #5        <7  7.00     TRUE      5    1
  #6     10.00 10.00    FALSE      6    1

  New.Zinc <- EPA.92c.zinc.df$Zinc 
  New.Zinc[EPA.92c.zinc.df$Censored] <- 0 
  ezmnorm(New.Zinc, ci = TRUE) 

  #Results of Distribution Parameter Estimation
  #--------------------------------------------
  #
  #Assumed Distribution:            Zero-Modified Normal
  #
  #Estimated Parameter(s):          mean        = 11.891000
  #                                 sd          =  1.594523
  #                                 p.zero      =  0.500000
  #                                 mean.zmnorm =  5.945500
  #                                 sd.zmnorm   =  6.123235
  #
  #Estimation Method:               mvue
  #
  #Data:                            New.Zinc
  #
  #Sample Size:                     40
  #
  #Confidence Interval for:         mean.zmnorm
  #
  #Confidence Interval Method:      Normal Approximation
  #                                 (t Distribution)
  #
  #Confidence Interval Type:        two-sided
  #
  #Confidence Level:                95%
  #
  #Confidence Interval:             LCL = 3.985545
  #                                 UCL = 7.905455

  #----------

  # Clean up
  rm(dat, New.Zinc)
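
The population values quoted in the first example (mean.zmnorm=2 and sd.zmnorm=sqrt(6)) follow from equations (1) and (2) in the Details section. A small base R sketch of that arithmetic is shown here; the helper name zmnorm_moments_sketch and its argument names are hypothetical and not part of EnvStats.

```r
# Hypothetical helper (not part of EnvStats): overall mean and standard
# deviation of a zero-modified normal distribution, equations (1) and (2)
zmnorm_moments_sketch <- function(mu, sigma, p) {
  gamma <- (1 - p) * mu                                 # equation (1)
  delta <- sqrt((1 - p) * sigma^2 + p * (1 - p) * mu^2) # equation (2)
  c(mean.zmnorm = gamma, sd.zmnorm = delta)
}

# Parameter values from the first example: mean=4, sd=2, p.zero=0.5
zmnorm_moments_sketch(mu = 4, sigma = 2, p = 0.5)
```

With these inputs the variance is 0.5 * 4 + 0.25 * 16 = 6, so the call returns 2 for mean.zmnorm and sqrt(6), about 2.449, for sd.zmnorm.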
