Computes the sum of products needed for the variance of survey sample estimators. svyCprod is used for survey design objects from before version 2.9; onestage is called by svyrecvar for post-2.9 design objects.
svyCprod(x, strata, psu, fpc, nPSU, certainty=NULL, postStrata=NULL,
         lonely.psu=getOption("survey.lonely.psu"))
onestage(x, strata, clusters, nPSU, fpc,
         lonely.psu=getOption("survey.lonely.psu"), stage=0, cal)
A covariance matrix
x: A vector or matrix
strata: A vector of stratum indicators (may be NULL for svyCprod)
psu: A vector of cluster indicators (may be NULL)
clusters: A vector of cluster indicators
fpc: A data frame (svyCprod) or vector (onestage) of population stratum sizes, or NULL
nPSU: Table (svyCprod) or vector (onestage) of original sample stratum sizes (or NULL)
certainty: Logical vector with stratum names as names. If TRUE and that stratum has a single PSU, it is a certainty PSU
postStrata: Post-stratification variables
lonely.psu: One of "remove", "adjust", "fail", "certainty", "average". See Details below
stage: Used internally to track the depth of recursion
cal: Used to pass calibration information at stages below the population
Thomas Lumley
The observations for each cluster are added, then centered within each stratum and the outer product is taken of the row vector resulting for each cluster. This is added within strata, multiplied by a degrees-of-freedom correction and by a finite population correction (if supplied) and added across strata.
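The steps in this paragraph can be sketched in base R for a single sampling stage. This is an illustration only, not the code used by svyCprod or onestage: the function name sketch_cprod and its popsize argument are hypothetical, cluster identifiers are assumed to be unique across strata, and the finite population correction is assumed to take the form 1 - n/N.

```r
# Hypothetical illustration; not the survey package's implementation.
sketch_cprod <- function(x, strata, clusters, popsize = NULL) {
  x <- as.matrix(x)
  # 1. Add the observations within each cluster (PSU), keeping first-seen order.
  totals <- rowsum(x, clusters, reorder = FALSE)
  # Stratum label of each cluster, in the same order as the rows of `totals`.
  cl_strata <- strata[!duplicated(clusters)]
  v <- matrix(0, ncol(x), ncol(x))
  for (h in unique(cl_strata)) {
    xh <- totals[cl_strata == h, , drop = FALSE]
    nh <- nrow(xh)
    # 2. Center the cluster totals within the stratum.
    xh <- sweep(xh, 2, colMeans(xh))
    # 3. Sum of outer products, with the n/(n-1) degrees-of-freedom correction.
    vh <- crossprod(xh) * nh / (nh - 1)
    # 4. Finite population correction, if population sizes are supplied
    #    (assumed form: 1 - n/N).
    if (!is.null(popsize)) vh <- vh * (1 - nh / popsize[[as.character(h)]])
    # 5. Add across strata.
    v <- v + vh
  }
  v
}

x <- c(1, 2, 3, 6, 10, 14)
strata <- c("a", "a", "a", "a", "b", "b")
clusters <- c(1, 1, 2, 2, 3, 4)
v <- sketch_cprod(x, strata, clusters)                                    # 52
v_fpc <- sketch_cprod(x, strata, clusters, popsize = c(a = 4, b = 10))   # 30.8
```

In this sketch a stratum with a single cluster yields 0 * Inf, that is NaN, which is the 0/0 case the lonely.psu options are meant to handle.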
If there are fewer clusters (PSUs) in a stratum than in the original design, extra rows of zeroes are added to x to allow the correct subpopulation variance to be computed.
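A small base R sketch may help show why the zero rows matter. The helper names pad_with_zero_rows and stratum_term are hypothetical, and the numbers are made up for illustration: a stratum had three sampled PSUs in the original design, but only two of them contain members of the subpopulation.

```r
# Hypothetical helpers for illustration only.
pad_with_zero_rows <- function(totals, n_original) {
  totals <- as.matrix(totals)
  n_missing <- n_original - nrow(totals)
  # Append one row of zeroes for every PSU absent from the subpopulation.
  rbind(totals, matrix(0, nrow = n_missing, ncol = ncol(totals)))
}

stratum_term <- function(totals) {
  totals <- as.matrix(totals)
  n <- nrow(totals)
  centered <- sweep(totals, 2, colMeans(totals))
  # Centered sum of outer products with the n/(n-1) correction.
  crossprod(centered) * n / (n - 1)
}

domain_totals <- c(4, 8)  # PSU totals for the two PSUs present in the domain
unpadded <- stratum_term(domain_totals)                                      # 16
padded <- stratum_term(pad_with_zero_rows(domain_totals, n_original = 3))    # 48
```

Without padding, the stratum is treated as if only two PSUs had been sampled; with padding, the missing PSU contributes a total of zero, and both the stratum mean and the degrees-of-freedom correction reflect the original design.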
See postStratify for information about post-stratification adjustments.
The variance formula gives 0/0 if a stratum contains only one sampling unit. If the certainty argument specifies that this is a PSU sampled with probability 1 (a "certainty" PSU) then it does not contribute to the variance (this is correct only when there is no subsampling within the PSU; otherwise it should be defined as a pseudo-stratum). If certainty is FALSE for this stratum or is not supplied, the result depends on lonely.psu.
The options are "fail" to give an error, "remove" or "certainty" to give a variance contribution of 0 for the stratum, "adjust" to center the stratum at the grand mean rather than the stratum mean, and "average" to assign strata with one PSU the average variance contribution from strata with more than one PSU. The choice is controlled by setting options(survey.lonely.psu). If this is not done the factory default is "fail". Using "adjust" is conservative, and it would often be better to combine strata in some intelligent way. The properties of "average" have not been investigated thoroughly, but it may be useful when the lonely PSUs are due to a few strata having PSUs missing completely at random.
The "remove" and "certainty" options give the same result, but "certainty" is intended for situations where there is only one PSU in the population stratum, which is sampled with certainty (also called "self-representing" PSUs or strata). With "certainty" no warning is generated for strata with only one PSU. Ordinarily, svydesign will detect certainty PSUs, making this option unnecessary.
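The lonely.psu choices can be summarised in a rough base R sketch. The function lonely_contribution and its arguments are hypothetical, it omits the warnings the package may issue, and the scaling used for "adjust" (a plain squared deviation from the grand mean) is an assumption made for illustration rather than the package's exact formula.

```r
# Hypothetical illustration of the lonely.psu choices for a single-PSU stratum.
# total:      the PSU total in the lonely stratum
# grand_mean: mean of PSU totals over the whole sample
# others:     variance contributions of strata with more than one PSU
lonely_contribution <- function(total, grand_mean, others,
                                lonely.psu = getOption("survey.lonely.psu", "fail")) {
  switch(lonely.psu,
    fail = stop("stratum has only one PSU"),
    remove = ,                         # falls through to "certainty"
    certainty = 0,                     # no contribution to the variance
    adjust = (total - grand_mean)^2,   # centered at the grand mean (assumed scaling)
    average = mean(others),            # average of the multi-PSU strata
    stop("unknown lonely.psu option: ", lonely.psu)
  )
}

# The option is set globally with options(); here it is restored afterwards.
old <- options(survey.lonely.psu = "adjust")
adj <- lonely_contribution(total = 5, grand_mean = 3, others = c(2, 4))  # 4
options(old)
```

Setting the option once with options(survey.lonely.psu = ...) affects every later variance computation in the session, which is why the sketch saves and restores the previous value.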
For strata with a single PSU in a subset (domain) the variance formula gives a value that is well-defined and positive, but not typically correct. If options("survey.adjust.domain.lonely") is TRUE and options("survey.lonely.psu") is "adjust" or "average", and no post-stratification or G-calibration has been done, strata with a single PSU in a subset will be treated like those with a single PSU in the sample. I am not aware of any theoretical study of this procedure, but it should at least be conservative.
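The conditions under which the domain adjustment applies can be written as a small check. The function domain_lonely_adjusted and its two arguments are hypothetical and only restate the rule in this paragraph; the option names are the ones given above.

```r
# Hypothetical check of the conditions described above; not package code.
domain_lonely_adjusted <- function(post_stratified = FALSE, calibrated = FALSE) {
  isTRUE(getOption("survey.adjust.domain.lonely")) &&
    isTRUE(getOption("survey.lonely.psu") %in% c("adjust", "average")) &&
    !post_stratified && !calibrated
}

# Set both options for the illustration, then restore the previous values.
old <- options(survey.adjust.domain.lonely = TRUE, survey.lonely.psu = "average")
applies <- domain_lonely_adjusted()                        # TRUE
blocked <- domain_lonely_adjusted(post_stratified = TRUE)  # FALSE
options(old)
```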
Binder, David A. (1983). On the variances of asymptotically normal estimators from complex surveys. International Statistical Review, 51, 279-292.
svydesign, svyrecvar, surveyoptions, postStratify