weighted_mean_winsorized: Weighted Winsorized Mean and Total (bare-bone functions)

Description

Weighted winsorized mean and total. These are bare-bone functions with limited functionality; see svymean_winsorized and svytotal_winsorized for more capable methods.

Usage

weighted_mean_winsorized(x, w, LB = 0.05, UB = 1 - LB, info = FALSE,
                         na.rm = FALSE)
weighted_mean_k_winsorized(x, w, k, info = FALSE, na.rm = FALSE)
weighted_total_winsorized(x, w, LB = 0.05, UB = 1 - LB, info = FALSE,
                          na.rm = FALSE)
weighted_total_k_winsorized(x, w, k, info = FALSE, na.rm = FALSE)

Value

The return value depends on info:

info = FALSE:

estimate of mean or total [double]

info = TRUE:

a [list] with items:

characteristic [character],
estimator [character],
estimate [double],
variance (default: NA),
robust [list],
residuals [numeric vector],
model [list],
design (default: NA),
call [call]

Arguments

x: [numeric vector] data.
w: [numeric vector] weights (same length as x).
LB: [double] lower bound of winsorization such that \(0 \leq\) LB \(<\) UB \(\leq 1\).
UB: [double] upper bound of winsorization such that \(0 \leq\) LB \(<\) UB \(\leq 1\).
info: [logical] indicating whether additional information should be returned (default: FALSE).
na.rm: [logical] indicating whether NA values should be removed before the computation proceeds (default: FALSE).
k: [integer] number of observations to be winsorized at the top of the distribution.

Details

Characteristic.

Population mean or total. Let \(\mu\) denote the estimated winsorized population mean; then, the estimated population total is given by \(\hat{N} \mu\) with \(\hat{N} =\sum w_i\), where summation is over all observations in the sample.
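The relation between the mean and the total can be checked on toy data. The following is a minimal base R sketch; the values of x_w (assumed to be already winsorized) and w are hypothetical and are not taken from the package.

```r
# Hypothetical toy data: x_w is assumed to be already winsorized
x_w <- c(2, 2, 3, 4)
w   <- c(10, 20, 30, 40)

N_hat     <- sum(w)                  # estimated population size
mu_hat    <- weighted.mean(x_w, w)   # weighted mean of the winsorized data
total_hat <- N_hat * mu_hat          # estimated population total

# The total equals the weighted sum of the winsorized observations
all.equal(total_hat, sum(w * x_w))
```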

Modes of winsorization.

The amount of winsorization can be specified in relative or absolute terms:

Relative: By specifying LB and UB, the method winsorizes the smallest LB\(~\cdot 100\%\) of the observations and the largest (1 - UB)\(~\cdot 100\%\) of the observations.
Absolute: By specifying the argument k in the functions with the infix _k_ in their name, the largest \(k\) observations are winsorized, where \(0<k<n\) and \(n\) denotes the sample size. For example, k = 2 implies that the largest and the second-largest observations are winsorized.
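The two modes can be illustrated with a rough base R sketch. The helpers wquantile_sketch and winsorize_top_k are hypothetical names introduced here for illustration only. The sketch assumes that the weighted quantile is the inverse of the weighted empirical distribution function and that the k largest observations are replaced by the (n - k)-th order statistic. The package may use a different quantile definition, so its results need not coincide with this sketch.

```r
# Hypothetical helper: weighted quantile as the inverse of the weighted
# empirical distribution function (an assumption, not the package's code)
wquantile_sketch <- function(x, w, p) {
  ord <- order(x)
  cw  <- cumsum(w[ord]) / sum(w)
  idx <- which(cw >= p)[1]
  if (is.na(idx)) idx <- length(x)   # guard against rounding at p = 1
  x[ord][idx]
}

# Hypothetical helper: replace the k largest values by the (n - k)-th
# order statistic
winsorize_top_k <- function(x, k) {
  n <- length(x)
  stopifnot(k > 0, k < n)
  pmin(x, sort(x)[n - k])
}

# Toy data (hypothetical values) with one large outlier
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)
w <- c(2, 1, 1, 1, 1, 1, 1, 1, 1, 2)

# Relative mode: clamp the data to the weighted LB- and UB-quantiles
LB <- 0.2
UB <- 0.8
lo <- wquantile_sketch(x, w, LB)
hi <- wquantile_sketch(x, w, UB)
x_rel    <- pmin(pmax(x, lo), hi)
mean_rel <- weighted.mean(x_rel, w)

# Absolute mode: winsorize the k = 2 largest observations
x_abs    <- winsorize_top_k(x, k = 2)
mean_abs <- weighted.mean(x_abs, w)
```

In both modes the outlying value 100 is pulled back to a smaller order statistic before the weighted mean is computed; this is what limits its influence on the estimate.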

Variance estimation.

See survey methods:

svymean_winsorized,
svytotal_winsorized,
svymean_k_winsorized,
svytotal_k_winsorized.

Examples

# The functions and the workplace data are provided by the robsurvey package
library(robsurvey)

head(workplace)

# Estimated winsorized population mean (5% symmetric winsorization)
weighted_mean_winsorized(workplace$employment, workplace$weight, LB = 0.05)

# Estimated one-sided k winsorized population total (2 observations are
# winsorized at the top of the distribution)
weighted_total_k_winsorized(workplace$employment, workplace$weight, k = 2)
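As described under Value, setting info = TRUE returns a list instead of a single number. The following sketch shows how the estimate might be extracted from that list. It assumes that the robsurvey package is installed and that the list items are named as listed under Value; the code is skipped if the package is not available.

```r
# Runs only if the robsurvey package is installed
if (requireNamespace("robsurvey", quietly = TRUE)) {
  # Load the workplace data into the current environment
  data("workplace", package = "robsurvey", envir = environment())

  # Request the full list instead of the bare estimate
  res <- robsurvey::weighted_mean_winsorized(workplace$employment,
                                             workplace$weight,
                                             LB = 0.05, info = TRUE)

  # Item names as documented under Value
  res$estimate    # estimated winsorized mean
  res$estimator   # description of the estimator
}
```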