srvyr

srvyr brings parts of dplyr’s syntax to survey analysis, using the survey package.

srvyr focuses on calculating summary statistics from survey data, such as the mean, total or quantile. It allows for the use of many dplyr verbs, such as summarize, group_by, and mutate, the convenience of pipe-able functions, rlang’s style of non-standard evaluation and more consistent return types than the survey package.

You can try it out:

install.packages("srvyr")
# or for development version
# remotes::install_github("gergness/srvyr")

Example usage

First, describe the variables that define the survey’s structure with the function as_survey(), passing the bare names of the columns that you would otherwise supply to functions from the survey package like survey::svydesign(), survey::svrepdesign() or survey::twophase().

library(srvyr, warn.conflicts = FALSE)
data(api, package = "survey")

dstrata <- apistrat %>%
   as_survey_design(strata = stype, weights = pw)
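
If a design has already been built with the survey package, as_survey() can also wrap that existing object. The following is a minimal sketch of this, assuming the same api data as above; the object names (design_survey, design_srvyr, design_direct) are illustrative only.

```r
library(srvyr, warn.conflicts = FALSE)
data(api, package = "survey")

# Design created the usual survey-package way (formula interface)
design_survey <- survey::svydesign(
  ids = ~1, strata = ~stype, weights = ~pw, data = apistrat
)

# Wrap the existing design so that dplyr-style verbs can be used on it
design_srvyr <- as_survey(design_survey)

# The same design created directly from the data frame with bare column names
design_direct <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

# Both routes should give the same point estimate
est_converted <- design_srvyr %>% summarise(api00_mean = survey_mean(api00))
est_direct <- design_direct %>% summarise(api00_mean = survey_mean(api00))
```

Either route produces a tbl_svy object, so the choice mostly depends on whether a survey design object already exists in your workflow.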

Now many of the dplyr verbs are available.

  • mutate() adds or modifies a variable.
dstrata <- dstrata %>%
  mutate(api_diff = api00 - api99)
  • summarise() calculates summary statistics such as mean, total, quantile or ratio.
dstrata %>% 
  summarise(api_diff = survey_mean(api_diff, vartype = "ci"))
#> # A tibble: 1 × 3
#>   api_diff api_diff_low api_diff_upp
#>      <dbl>        <dbl>        <dbl>
#> 1     32.9         28.8         37.0
  • group_by() and then summarise() creates summaries by groups.
dstrata %>% 
  group_by(stype) %>%
  summarise(api_diff = survey_mean(api_diff, vartype = "ci"))
#> # A tibble: 3 × 4
#>   stype api_diff api_diff_low api_diff_upp
#>   <fct>    <dbl>        <dbl>        <dbl>
#> 1 E        38.6         33.1          44.0
#> 2 H         8.46         1.74         15.2
#> 3 M        26.4         20.4          32.4
  • Functions from the survey package are still available:
my_model <- survey::svyglm(api99 ~ stype, dstrata)
summary(my_model)
#> 
#> Call:
#> svyglm(formula = api99 ~ stype, design = dstrata)
#> 
#> Survey design:
#> Called via srvyr
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   635.87      13.34  47.669   <2e-16 ***
#> stypeH        -18.51      20.68  -0.895    0.372    
#> stypeM        -25.67      21.42  -1.198    0.232    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for gaussian family taken to be 16409.56)
#> 
#> Number of Fisher Scoring iterations: 2
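
The other summary statistics mentioned above (total, ratio, quantile) follow the same pattern as survey_mean(). This is a small sketch rather than a complete tour; it rebuilds dstrata so that it runs on its own, and the output column names (enroll_total, api_ratio, api00_median) are chosen only for illustration.

```r
library(srvyr, warn.conflicts = FALSE)
data(api, package = "survey")

dstrata <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

other_stats <- dstrata %>%
  group_by(stype) %>%
  summarise(
    # Estimated total enrollment within each school type
    enroll_total = survey_total(enroll, vartype = "se"),
    # Ratio of the 2000 API score to the 1999 API score
    api_ratio = survey_ratio(api00, api99, vartype = "se"),
    # Median 2000 API score (survey_median() is the 0.5 quantile)
    api00_median = survey_median(api00)
  )

other_stats
```

Each estimate comes back as a column of the result, with its measure of variability in a companion column (for example enroll_total_se).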

Learning more

Here are some free resources put together by the community about srvyr:

Still need help?

I think the best way to get help is to form a specific question and ask it somewhere like Posit’s community website (known for its friendly community) or stackoverflow.com (maybe not known for being quite as friendly, but probably has more people). If you think you’ve found a bug in srvyr’s code, please file an issue on GitHub, but note that I’m not a great resource for help with specific issues, both because I have limited capacity and because I do not consider myself an expert in the statistical methods behind survey analysis.

Have something to add?

These resources were mostly found via vanity searches on Twitter & GitHub. If you know of anything I missed, or have written something yourself, please let me know in this GitHub issue!

What people are saying about srvyr

minimal changes to my #r #dplyr script to incorporate survey weights, thanks to the amazing #srvyr and #survey packages. Thanks to @gregfreedman & @tslumley. Integrates soooo nicely into tidyverse

Brian Guay (@BrianMGuay on Jun 16, 2021)

Spending my afternoon using srvyr for tidy analysis of weighted survey data in #rstats and it’s so elegant. Vignette here: https://CRAN.R-project.org/package=srvyr/vignettes/srvyr-vs-survey.html

Chris Skovron (@cskovron on Nov 20, 2018)

  1. Yay!

Thomas Lumley, in the Biased and Inefficient blog

Contributing

I do appreciate bug reports, suggestions and pull requests! I started this as a way to learn about R package development, and am still learning, so you’ll have to bear with me. Please review the Contributor Code of Conduct, as all participants are required to abide by its terms.

If you’re unfamiliar with contributing to an R package, I recommend the guides provided by RStudio’s tidyverse team, such as Jim Hester’s blog post or Hadley Wickham’s R Packages book.

Version: 1.3.0

License: GPL-2 | GPL-3

Monthly Downloads: 5,194

Last Published: August 19th, 2024

Functions in srvyr (1.3.0)

  • as_survey_design: Create a tbl_svy survey object using sampling design
  • as_survey_twophase: Create a tbl_svy survey object using two phase design
  • as_tibble: Coerce survey variables to a data frame (tibble)
  • as_survey_rep: Create a tbl_svy survey object using replicate weights
  • cascade: Summarise multiple values into cascading groups
  • interact: Create interaction terms to group by when summarizing
  • cur_svy: Get the survey data for the current context
  • collect: Force computation of a database query
  • as_srvyr_result_df: Create a srvyr results data.frame which is automatically unpacked by srvyr
  • survey_corr: Calculate correlation and its variation using survey methods
  • %>%: Pipe operator
  • survey_total: Calculate the total and its variation using survey methods
  • cur_svy_wts: Get the full-sample weights for the current context
  • tbl_svy: tbl_svy object.
  • group_by: Group a (survey) dataset by one or more variables.
  • reexports: Objects exported from other packages
  • summarise: Summarise multiple values to a single value.
  • rlang-tidyeval: Tidy eval helpers from rlang
  • survey_mean: Calculate mean/proportion and its variation using survey methods
  • uninteract: Break interaction vectors back into component columns
  • summarise_all: Manipulate multiple columns.
  • group_map_dfr: Apply a function to each group
  • srvyr: srvyr: A package for 'dplyr'-Like Syntax for Summary Statistics of Survey Data.
  • svychisq: Chisquared tests of association for survey data.
  • srvyr_interaction: srvyr interaction column
  • groups: Get/set the grouping variables for tbl.
  • set_survey_vars: Set the variables for the current survey variable
  • survey_ratio: Calculate the ratio and its variation using survey methods
  • survey_tally: Count/tally survey weighted observations by group
  • get_var_est: Get the variance estimates for a survey estimate
  • tbl_vars: List variables produced by a tbl.
  • srvyr-se-deprecated: Deprecated SE versions of main srvyr verbs
  • as_survey: Create a tbl_svy from a data.frame
  • dplyr_filter_joins: Filtering joins from dplyr
  • survey_quantile: Calculate the quantile and its variation using survey methods
  • group_trim: Single table verbs from dplyr and tidyr
  • survey_old_quantile: Calculate the quantile and its variation using survey methods
  • survey_var: Calculate the population variance and its variation using survey methods
  • unweighted: Calculate an unweighted summary statistic from a survey
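
Two of the helpers listed above, cascade() and unweighted(), are easiest to understand side by side. The sketch below assumes the apistrat data used in the README example; the column names api99_mean and n_schools and the "All types" label are illustrative choices, not package defaults.

```r
library(srvyr, warn.conflicts = FALSE)
data(api, package = "survey")

dstrata <- apistrat %>%
  as_survey_design(strata = stype, weights = pw)

by_type <- dstrata %>%
  group_by(stype) %>%
  cascade(
    # Survey-weighted mean within each group, plus an overall row
    api99_mean = survey_mean(api99),
    # Plain (unweighted) count of sampled schools behind each estimate
    n_schools = unweighted(length(api99)),
    # Label to use for the overall row added by cascade()
    .fill = "All types"
  )

by_type
```

cascade() behaves like summarise() but also adds a row summarising across the grouping variable, and unweighted() is handy for reporting the raw sample size next to each weighted estimate.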