unitVar: Compute the unit (population) variance for a variable

Description

Compute the unit (population) variance for a variable based on either a full population file or a sample from a finite population.

Usage

unitVar(pop.sw = NULL, w = NULL, p = NULL, y = NULL)

Value

A list with three or four components:

Note: Describes whether output was computed from a full population or estimated from a sample.
Pop size N: Size of the population; included if y is for the full population.
S2: Unit variance of y; if pop.sw = TRUE, S2 is computed from the full population; if pop.sw = FALSE, S2 is estimated from the sample using the w weights.
V1: Population variance of y appropriate for a sample selected with varying probabilities; see Valliant, Dever, and Kreuter (VDK; 2018, sec. 3.4). If pop.sw = TRUE and p is provided, V1 is computed with equation (3.32) in VDK. If pop.sw = FALSE, V1 is estimated with equation (3.41) in VDK.

Arguments

pop.sw: TRUE if the full population is input; FALSE if a sample is input
w: vector of sample weights if y is a sample; used only if pop.sw = FALSE
p: vector of 1-draw selection probabilities; optionally provided if pop.sw = TRUE
y: vector of values of an analysis variable; must be numeric

Author

Richard Valliant

Details

unitVar computes unit (population) variances of an analysis variable \(y\) from either a population or a sample. S2 is the unweighted population variance, \(S^2 = \sum_{i \in U}(y_i - \bar{y}_U)^2/(N-1)\) where \(U\) is the universe of elements, \(N\) is the population size, and \(\bar{y}_U\) is the population mean. If the input is a sample, S2 is estimated as \(\hat{S}^2 = (n/(n-1))\sum_{i \in s} w_i(y_i - \bar{y}_w)^2/(\sum_{i \in s} w_i)\) where \(s\) is the set of sample elements, \(n\) is the sample size, and \(\bar{y}_w\) is the weighted sample mean.
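
The two S2 formulas above can be sketched in base R. This is a minimal illustration, not the PracTools source: the helper names `s2_pop` and `s2_sample` and the toy data are hypothetical and were chosen only to mirror the notation in this section.

```r
# Hypothetical helpers (not part of PracTools) that follow the formulas above.

# Population unit variance: sum over U of (y_i - ybar_U)^2 / (N - 1)
s2_pop <- function(y) {
  N <- length(y)
  sum((y - mean(y))^2) / (N - 1)
}

# Sample estimate: (n / (n - 1)) * sum_s w_i (y_i - ybar_w)^2 / sum_s w_i
s2_sample <- function(y, w) {
  n <- length(y)
  ybar_w <- sum(w * y) / sum(w)   # weighted sample mean
  (n / (n - 1)) * sum(w * (y - ybar_w)^2) / sum(w)
}

y_toy <- c(2, 4, 6, 8, 10)        # toy data, for illustration only
s2_full  <- s2_pop(y_toy)                      # agrees with var(y_toy)
s2_equal <- s2_sample(y_toy, w = rep(3, 5))    # equal weights
```

With equal weights the sample estimate reduces to the ordinary sample variance, which is a quick sanity check on the weighted formula.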

V1 is a weighted population variance used in calculations for samples where elements are selected with varying probabilities. If \(y\) is a population vector, \(V_1 = \sum_U p_i(y_i/p_i - t_U)^2\) where \(p_i\) is the 1-draw probability for element \(i\) and \(t_U\) is the population total of \(y\). If \(y\) is for a sample, \(\hat{V}_1 = \sum_s (y_i/p_i - n^{-1}\sum_k y_k/p_k)^2 / (n-1)\) with \(p_i\) computed as \(1/(n w_i)\).
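
The V1 formulas can be sketched the same way. Again, this is an illustration under stated assumptions rather than the package's implementation: `v1_pop`, `v1_sample`, and the toy vectors are hypothetical names introduced here.

```r
# Hypothetical helpers (not part of PracTools) that follow the V1 formulas above.

# Population version: sum over U of p_i * (y_i / p_i - t_U)^2
v1_pop <- function(y, p) {
  t_U <- sum(y)                   # population total of y
  sum(p * (y / p - t_U)^2)
}

# Sample version: p_i is recovered from the weights as 1 / (n * w_i)
v1_sample <- function(y, w) {
  n <- length(y)
  p <- 1 / (n * w)
  z <- y / p                      # each z_i is a one-unit estimate of the total
  sum((z - mean(z))^2) / (n - 1)
}

y_toy   <- c(10, 20, 30, 40)      # toy data, for illustration only
p_equal <- rep(1 / 4, 4)          # equal 1-draw probabilities
p_prop  <- y_toy / sum(y_toy)     # probabilities exactly proportional to y

v1_equal <- v1_pop(y_toy, p_equal)
v1_prop  <- v1_pop(y_toy, p_prop) # essentially zero: every y_i / p_i equals t_U
v1_hat   <- v1_sample(y_toy, w = 1 / (4 * p_equal))
```

The comparison shows why V1 matters for design: when the selection probabilities are proportional to \(y\), every ratio \(y_i/p_i\) equals the total \(t_U\) and V1 collapses to zero, whereas equal probabilities leave a positive V1.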

References

Valliant, R., Dever, J., Kreuter, F. (2018, chap. 3). Practical Tools for Designing and Weighting Survey Samples, 2nd edition. New York: Springer.

Examples

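library(PracTools)
library(sampling)   # provides UPrandomsystematic()

# Population data: keep only units with a positive number of beds
data("smho.N874")
y <- smho.N874[, "EXPTOTAL"]
x <- smho.N874[, "BEDS"]
y <- y[x > 0]
x <- x[x > 0]

# 1-draw selection probabilities proportional to BEDS
pik <- x / sum(x)

# Select a sample of n = 50 with inclusion probabilities n * pik
n <- 50
sam <- UPrandomsystematic(n * pik)
wts <- 1 / (n * pik[sam == 1])

# Full-population values, then estimates from the selected sample
unitVar(pop.sw = TRUE, w = NULL, p = pik, y = y)
unitVar(pop.sw = FALSE, w = wts, p = NULL, y = y[sam == 1])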
