pickFields: Pick SEER fields of interest

Description

Reduces the full set of SEER data fields to a smaller set of interest.

Usage

pickFields(sas, picks=c("casenum","reg","race","sex","agedx","yrbrth","seqnum","modx","yrdx","histo3","radiatn","ICD9","COD","surv"))

Arguments

sas

A data frame created by getFields() using the SAS file found in the ‘incidence’ directory of seerHome, the root of the SEER ASCII data installation.

picks

Vector of names of the variables of interest. These must be in the same order as they appear in the input data frame sas. Downstream, mkSEER requires at least "reg","race","sex","agedx","histo3","radiatn","ICD9"; to estimate survival and second cancer risks, the default set is the minimum.
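
Because picks must follow the field order of sas, it can help to derive picks by filtering the full list of field names rather than typing them in an arbitrary order. The following is a minimal base R sketch of that idea. The vector field_names is a toy stand-in for the field names of sas (the real column holding them is not shown here), and wanted is a hypothetical user wish list.

```r
# Toy stand-in for the field names of `sas`, in file order (an assumption
# for illustration; the real list comes from getFields()).
field_names <- c("casenum", "reg", "race", "sex", "agedx", "yrbrth")

# Hypothetical wish list, deliberately out of order.
wanted <- c("sex", "casenum", "race")

# Filtering field_names (rather than wanted) preserves file order.
picks <- field_names[field_names %in% wanted]

# Report any requested names that do not exist in the layout.
unknown <- setdiff(wanted, field_names)

print(picks)
print(unknown)
```

Filtering the layout's own names guarantees the ordering requirement and flags misspelled field names before they reach pickFields.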

Value

sas, reduced to the rows named in picks and expanded with spacer rows, each of which pools the intervening fields of no interest into a single string: the width of a spacer row equals the distance in bytes between the fields of interest above and below it. This data frame is then used by laf_open_fwf() of LaF in mkSEER() to read the SEER files. Proper use of this function, and of the SEER data in general, requires an understanding of the contents of ‘seerdic.pdf’ in the ‘incidence’ directory of seerHome.
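
The spacer-row idea can be illustrated with a small base R sketch. This is not the SEERaBomb implementation: the helper toy_pick, the toy layout, and its column names (start, width, names) are all assumptions made for illustration, and the byte positions are invented.

```r
# Toy fixed-width layout; column names and positions are illustrative
# assumptions, not the actual output of getFields().
layout <- data.frame(
  start = c(1, 9, 19, 20, 24),
  width = c(8, 10, 1, 4, 3),
  names = c("casenum", "reg", "marstat", "race", "sex"),
  stringsAsFactors = FALSE
)
picks <- c("casenum", "race", "sex")

# Hypothetical helper: keep the picked rows and insert one spacer row
# wherever bytes are skipped between consecutive picked fields.
toy_pick <- function(layout, picks) {
  kept <- layout[layout$names %in% picks, ]
  out <- kept[0, ]
  pos <- 1  # next byte not yet covered by an output row
  for (i in seq_len(nrow(kept))) {
    gap <- kept$start[i] - pos
    if (gap > 0) {
      # Pool all skipped fields into a single spacer of width `gap`.
      out <- rbind(out, data.frame(start = pos, width = gap, names = " ",
                                   stringsAsFactors = FALSE))
    }
    out <- rbind(out, kept[i, ])
    pos <- kept$start[i] + kept$width[i]
  }
  rownames(out) <- NULL
  out
}

res <- toy_pick(layout, picks)
print(res)
```

In this toy case the fields reg and marstat (bytes 9 to 19) collapse into one spacer of width 11, so the widths of the result still account for every byte up to the end of the last picked field, which is what a fixed-width reader needs.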

Details

R binaries become too large if all of the fields are selected. SEERaBomb is faster than SEER*Stat because it tailors/streamlines the database to your interests. The default picks are a reasonable place to start; if you determine later that you need more fields, you can always rebuild the binaries. Grabbing all fields is discouraged, but if you want this anyway, note that you still need pickFields to create a data type column, i.e. you cannot bypass pickFields by sending the output of getFields straight to mkSEER.

Examples

## Not run: 
# library(SEERaBomb)
# (df=getFields())
# (df=pickFields(df))
# 
# ## End(Not run)