svyflow: Gross flow estimation between categories

Description

Compute gross flows for data from complex surveys with repeated samples.

Usage

# S3 method for survey.design2
svyflow(
  x,
  design,
  model = c("A", "B", "C", "D"),
  tol = 1e-04,
  maxit = 5000,
  verbose = FALSE,
  as.zero.flows = FALSE,
  influence = FALSE,
  ...
)
# S3 method for svyrep.design
svyflow(
  x,
  design,
  model = c("A", "B", "C", "D"),
  tol = 1e-04,
  maxit = 5000,
  verbose = FALSE,
  as.zero.flows = FALSE,
  influence = FALSE,
  ...
)

Arguments

x

a one-sided formula indicating a factor variable.

design

survey design object

model

Stasny (1987) model for the non-response process. Possibilities: "A", "B", "C", "D". Defaults to model = "A".

tol

Tolerance for iterative proportional fitting. Defaults to 1e-4.

maxit

Maximum number of iterations for iterative proportional fitting. Defaults to maxit = 5000.

verbose

Print proportional fitting iterations. Defaults to verbose = FALSE.

as.zero.flows

Should zeroes in the observed gross flows be treated as zeroes in the population transition probability matrix? Defaults to as.zero.flows = FALSE.

influence

Should influence function estimates be stored? Defaults to influence = FALSE.

...

future expansion.
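
The tol, maxit and verbose arguments control an iterative proportional fitting loop. As a rough illustration of what such a loop does, here is a generic two-way sketch in base R. The function ipf_sketch and its stopping rule (largest absolute change in any cell) are assumptions made for this example; they are not the package's implementation.

```r
# Generic two-way iterative proportional fitting (illustrative sketch only).
# ipf_sketch is a hypothetical helper, not part of surf or survey.
ipf_sketch <- function(seed, row_target, col_target,
                       tol = 1e-4, maxit = 5000, verbose = FALSE) {
  fit <- seed
  for (it in seq_len(maxit)) {
    prev <- fit
    # rescale each row so that the row totals match row_target
    fit <- sweep(fit, 1, row_target / rowSums(fit), `*`)
    # rescale each column so that the column totals match col_target
    fit <- sweep(fit, 2, col_target / colSums(fit), `*`)
    change <- max(abs(fit - prev))
    if (verbose) message(sprintf("iteration %d: max change %.3g", it, change))
    # stop once an iteration barely moves any cell
    if (change < tol) {
      return(list(fit = fit, iterations = it, converged = TRUE))
    }
  }
  list(fit = fit, iterations = maxit, converged = FALSE)
}

# toy seed table and target margins (both margins sum to 100)
seed <- matrix(c(1, 2, 3, 4), nrow = 2)
res <- ipf_sketch(seed, row_target = c(40, 60), col_target = c(30, 70))
res$converged
round(res$fit, 3)
```

In this sketch a smaller tol asks for a tighter fit at the cost of more iterations, and maxit caps the work when the fit does not settle.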

Value

Objects of class flowstat, a list of svystat and svymstat (a matrix version of svystat) objects. The flowstat object contains estimates of: the initial response probabilities psi, the response/response transition probabilities rho, the non-response/non-response transition probabilities tau, the (non-response corrected) initial and final distributions across categories eta and gamma, the (non-response corrected) transition probability matrix pij, the (non-response corrected) gross flows matrix muij, and the vector of net flows delta. These objects have methods for coef, vcov, SE, and cv.
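
To see how these quantities relate to one another, the sketch below starts from a made-up gross flows table and derives the others in base R. It assumes the usual definitions (rows are the initial category, columns the final category, and net flows are final totals minus initial totals, expressed in counts). The numbers and category labels are hypothetical, and this is not how the package computes its estimates.

```r
# Made-up gross flows between three categories (rows: round 1, columns: round 2)
muij <- matrix(
  c(80,  5, 15,
    10, 20, 10,
    10,  5, 45),
  nrow = 3, byrow = TRUE,
  dimnames = list(round1 = c("E", "U", "N"), round2 = c("E", "U", "N"))
)

N     <- sum(muij)              # population size implied by the table
eta   <- rowSums(muij) / N      # initial distribution across categories
gamma <- colSums(muij) / N      # final distribution across categories
pij   <- prop.table(muij, 1)    # transition probabilities: each row sums to 1
delta <- colSums(muij) - rowSums(muij)  # net flows, assumed here to be in counts

pij
delta
```

Net flows sum to zero by construction, since every unit that leaves one category enters another.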

A Rao-Scott Corrected Chi^2 test is also calculated.

Details

It is important to distinguish "missing" responses from "inapplicable" responses. This can be done by subsetting the design to keep only the applicable responses (including actual missing responses, if there are any). For instance, suppose there are two variables encoded as employed/unemployed, with NA when the response is either missing or inapplicable. An NA might correspond to a person who did not respond or to a person who was below working age at the time of the survey. Only the first of these cases is an actual non-response. One way to separate them is to keep the people who were of working age in any round. Use subset for this, as you should for any survey design object.
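
A minimal sketch of this subsetting step follows. The data frame, the variable names age1, age2 and w, and the working-age threshold of 16 are all hypothetical. The survey part is guarded so the example still runs where the survey package is not installed.

```r
# Toy two-round panel (hypothetical data, not shipped with any package)
lev <- c("employed", "unemployed")
panel <- data.frame(
  y1   = factor(c("employed", NA, "unemployed", NA, NA, NA), levels = lev),
  y2   = factor(c("employed", "employed", NA, NA, NA, NA), levels = lev),
  age1 = c(30, 45, 22, 12, 14, 51),
  age2 = c(31, 46, 23, 13, 15, 52),
  w    = rep(10, 6)   # assumed sampling weights
)

min_age <- 16  # assumed working-age threshold

# applicable: of working age in at least one round
panel$applicable <- panel$age1 >= min_age | panel$age2 >= min_age

# NAs that remain among applicable rows are actual non-responses
with(panel[panel$applicable, ], table(y1 = addNA(y1), y2 = addNA(y2)))

if (requireNamespace("survey", quietly = TRUE)) {
  # build the design on the full data first, then subset the design object
  des <- survey::svydesign(ids = ~1, weights = ~w, data = panel)
  des_applicable <- subset(des, applicable)
}
```

The subsetted design, rather than a pre-filtered data frame, is what would then be passed as the design argument.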

References

STASNY, E. A. Some Markov-chain models for nonresponse in estimating gross labor force flows. Journal of Official Statistics, v. 3, n. 4, p. 359, 1987.

GUTIERREZ, H. A.; TRUJILLO, L.; SILVA, P. L. N. The estimation of gross flows in complex surveys with random nonresponse. Survey Methodology, v. 40, n. 2, p. 285–321, dec. 2014. URL https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X201400214113.

LUMLEY, T. Complex Surveys: A guide to analysis using R. Hoboken: John Wiley & Sons, 2010. (Wiley Series in Survey Methodology). ISBN 978-0-470-28430-8.

Examples

# NOT RUN {
# load library
library( survey )
library( surf )

# load data
data( "LFS79.0809" )

# create surf design object
lfs.des <- svydesign( ids = ~0 , probs = ~ prob , data = LFS79.0809 , nest = TRUE )

# flow estimates
estflows <- svyflow( ~y1+y2 , design = lfs.des )
coef( estflows$muij )
SE( estflows$muij )

# }
