Compute gross flows for data from complex surveys with repeated samples.
# S3 method for survey.design2
svyflow(
x,
design,
model = c("A", "B", "C", "D"),
tol = 1e-04,
maxit = 5000,
verbose = FALSE,
as.zero.flows = FALSE,
influence = FALSE,
...
)

# S3 method for svyrep.design
svyflow(
x,
design,
model = c("A", "B", "C", "D"),
tol = 1e-04,
maxit = 5000,
verbose = FALSE,
as.zero.flows = FALSE,
influence = FALSE,
...
)
x: a one-sided formula indicating a factor variable.
design: survey design object.
model: Stasny (1987) model for the non-response process. Possibilities: "A", "B", "C", "D". Defaults to model = "A".
tol: Tolerance for iterative proportional fitting. Defaults to tol = 1e-4.
maxit: Maximum number of iterations for iterative proportional fitting. Defaults to maxit = 5000.
verbose: Print proportional fitting iterations. Defaults to verbose = FALSE.
as.zero.flows: Should zeroes in the observed gross flows be treated as zeroes in the population transition probability matrix? Defaults to as.zero.flows = FALSE.
influence: Should influence function estimates be stored? Defaults to influence = FALSE.
...: future expansion.
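As a rough illustration of how these arguments fit together, the sketch below reuses the `LFS79.0809` data from the examples on this page. The particular values of `tol` and `maxit`, and the use of a bootstrap replicate design with 50 replicates, are arbitrary choices made for this sketch rather than recommendations.

```r
library(survey)
library(surf)

data("LFS79.0809")
lfs.des <- svydesign(ids = ~0, probs = ~prob, data = LFS79.0809, nest = TRUE)

# Stasny model "B" with a tighter tolerance and a higher iteration cap;
# verbose = TRUE prints the proportional fitting iterations
flows.b <- svyflow(~ y1 + y2, design = lfs.des, model = "B",
                   tol = 1e-6, maxit = 10000, verbose = TRUE)

# The same kind of call on a replicate-weights design, which should dispatch
# to the svyrep.design method. Bootstrap with 50 replicates is an assumption
# of this sketch, not something prescribed by the package.
lfs.rep <- as.svrepdesign(lfs.des, type = "bootstrap", replicates = 50)
flows.rep <- svyflow(~ y1 + y2, design = lfs.rep, model = "A")
```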
Objects of class flowstat, a list of svystat and svymstat (a matrix version of svystat) objects.
The flowstat object contains estimates of: the initial response probabilities psi, the response/response transition probabilities rho, the non-response/non-response transition probabilities tau, the (non-response corrected) initial and final distributions across categories eta and gamma, the (non-response corrected) transition probability matrix pij, the (non-response corrected) gross flows matrix muij, and the vector of net flows delta.
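To make the relationship between these quantities concrete, here is a small base R sketch with made-up numbers. It assumes the usual definitions (gross flows as population size times initial distribution times transition probability, and net flows as final minus initial counts); the exact scaling used internally by the package is not stated here, so treat this as an illustration only.

```r
# Hypothetical values, chosen only for illustration
N   <- 1000                                  # population size
eta <- c(employed = 0.6, unemployed = 0.4)   # initial distribution
pij <- matrix(c(0.9, 0.1,
                0.3, 0.7),
              nrow = 2, byrow = TRUE,
              dimnames = list(from = names(eta), to = names(eta)))

gamma <- drop(eta %*% pij)               # final distribution
muij  <- N * eta * pij                   # gross flows: row i is scaled by N * eta[i]
delta <- colSums(muij) - rowSums(muij)   # net flows: final minus initial counts

muij
delta
```

Under these assumptions the net flows sum to zero and equal `N * (gamma - eta)`.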
These objects have methods for coef, vcov, SE, and cv.
A Rao-Scott Corrected Chi^2 test is also calculated.
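A minimal sketch of pulling individual components out of the result, using the component names listed above and the `LFS79.0809` data from the examples on this page. Which components are worth inspecting depends on the analysis; the ones below are picked arbitrarily.

```r
library(survey)
library(surf)

data("LFS79.0809")
lfs.des <- svydesign(ids = ~0, probs = ~prob, data = LFS79.0809, nest = TRUE)
estflows <- svyflow(~ y1 + y2, design = lfs.des, model = "A")

# the result is a list, so its components can be listed by name
names(estflows)

# point estimates and standard errors of the transition probability matrix
coef(estflows$pij)
SE(estflows$pij)

# variance-covariance matrix of the initial distribution,
# and coefficients of variation of the final distribution
vcov(estflows$eta)
cv(estflows$gamma)
```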
It is important to distinguish "missing" responses from "inapplicable" responses. This is feasible by subsetting the design to only applicable responses (keeping actual missing responses, if there are any). For instance, suppose that we have two variables encoded as employed/unemployed, with NAs if the response is missing or inapplicable. An NA might be a person who did not respond or a person who was under working age at the time of the survey. It is important to distinguish between those, as only the former is an actual non-response. You could do that by selecting people who were of working age in any round, for instance. This can be done using subset, as you would for any survey design object.
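The subsetting step could look something like the sketch below. The data frame, the `age1`/`age2` columns and the working-age threshold of 14 are all hypothetical and exist only to illustrate the idea; they are not part of the package or its data.

```r
library(survey)

# Hypothetical two-round panel: y1 and y2 are employment status,
# age1 and age2 are the ages at each round
toy <- data.frame(
  age1 = c(10, 12, 30, 45, 52, 61),
  age2 = c(11, 13, 31, 46, 53, 62),
  y1   = factor(c(NA, NA, "employed", NA, "unemployed", "employed"),
                levels = c("employed", "unemployed")),
  y2   = factor(c(NA, NA, "employed", "unemployed", NA, "employed"),
                levels = c("employed", "unemployed")),
  prob = rep(0.1, 6)
)

toy.des <- svydesign(ids = ~0, probs = ~prob, data = toy)

# keep people who were of working age (assumed: 14 or older) in any round;
# the NAs of the two children are inapplicable, not non-response
work.des <- subset(toy.des, age1 >= 14 | age2 >= 14)

# weighted number of genuine non-respondents in the first round
w <- weights(work.des)
nonresp.y1 <- sum(w[is.na(work.des$variables$y1)])
nonresp.y1
```

The subsetted design `work.des` is what would then be passed as the `design` argument, so that the remaining NAs are treated as actual non-response.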
STASNY, E. A. Some Markov-chain models for nonresponse in estimating gross labor force flows. Journal of Official Statistics, v. 3, n. 4, p. 359, 1987.
GUTIERREZ, H. A.; TRUJILLO, L.; SILVA, P. L. N. The estimation of gross flows in complex surveys with random nonresponse. Survey Methodology, v. 40, n. 2, p. 285–321, dec. 2014. URL https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X201400214113.
LUMLEY, T. Complex Surveys: A guide to analysis using R. Hoboken: John Wiley & Sons, 2010. (Wiley Series in Survey Methodology). ISBN 978-0-470-28430-8.
# NOT RUN {
# load library
library( survey )
library( surf )
# load data
data( "LFS79.0809" )
# create survey design object
lfs.des <- svydesign( ids = ~0 , probs = ~ prob , data = LFS79.0809 , nest = TRUE )
# flow estimates
estflows <- svyflow( ~y1+y2 , design = lfs.des )
coef( estflows$muij )
SE( estflows$muij )
# }