micromapST (version 3.1.1)

USSeerBG: border group datasets for the 20 U.S. Seer areas/registries

Description

The micromapST function can generate linked micromaps for any geographical area. The bordGrp call argument specifies the border group dataset for that area. The USSeerBG border group dataset supports creating linked micromaps for the 20 Seer registries in the U.S. When the bordGrp call argument is set to USSeerBG, the appropriate name table (registry names and abbreviations) and the boundary data for the 20 sub-areas (Seer registries) are loaded in micromapST. The user's data is then linked to the boundary data via the Seer registry's name, abbreviation, alias match or ID, based on the table below.

The 20 U.S. Seer registries are the registries funded by NCI as of January 2010.
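
A minimal sketch of a call that selects this border group might look like the following. The seerStats data.frame, its Rate column and the temporary output file are placeholders invented for illustration, the Rate values are synthetic, and the map/id/dot panel layout is only one possible choice. The call is skipped when micromapST is not installed.

```r
# Registry abbreviations as listed in the name table of this help page.
seerAbbr <- c("AK-NAT", "AZ-NAT", "CA-LA", "CA-SF", "CA-SJ", "CA-OTH", "CT",
              "GA-ATL", "GA-RUR", "GA-OTH", "HI", "IA", "KY", "MI-DET",
              "NJ", "NM", "OK-CHE", "UT", "WA-SEA")

# Synthetic placeholder statistics (not real rates), keyed by row.names.
seerStats <- data.frame(Rate = seq(50, by = 2.5, length.out = length(seerAbbr)),
                        row.names = seerAbbr)

# Panel layout: a map column, a registry-name column and a dot plot of Rate.
panelDesc <- data.frame(
  type = c("map", "id", "dot"),
  lab1 = c("", "", "Rate"),
  col1 = c(NA, NA, "Rate"),
  stringsAsFactors = FALSE
)

# Draw only if the package is available; output goes to a temporary PDF.
if (requireNamespace("micromapST", quietly = TRUE)) {
  library(micromapST)
  outFile <- tempfile(fileext = ".pdf")
  pdf(file = outFile, width = 7.5, height = 10)
  micromapST(seerStats, panelDesc,
             rowNames = "ab",         # row.names hold registry abbreviations
             sortVar  = "Rate",
             bordGrp  = "USSeerBG",   # select the Seer registry border group
             title    = "Synthetic rates by Seer registry")
  dev.off()
}
```

Registries without a row in the data are still drawn on the map, but are not colored.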

Usage

data(USSeerBG)

Arguments

Details

The USSeerBG border group dataset contains the following data.frames:

areaParms

- contains specific parameters for the border group

areaNamesAbbrsIDs

- contains the names, abbreviations, numerical identifier and alias matching string for each of the 20 Seer registries.

areaVisBorders

- the boundary point lists for each area.

L2VisBorders

- the boundaries for an intermediate level. For the Seer registry border group, L2VisBorders contains the boundaries of the 50 states and DC in the U.S., which provide a geographical reference for locating the registries within the states.

RegVisBorders

- the boundaries for the 4 U.S. Census regions, in support of the regions feature.

L3VisBorders

- the boundary of the U.S.

The Seer Registries border group contains 20 Seer Registry sub-areas. Each registry has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets.
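
To see which objects the dataset actually provides, it can be loaded into a private environment and listed. This is a small sketch: list_border_group is a hypothetical helper name, not part of micromapST, and it returns an empty vector when the package is not installed.

```r
# Return the names of the objects stored in the USSeerBG dataset,
# or character(0) when micromapST is not installed.
list_border_group <- function() {
  if (!requireNamespace("micromapST", quietly = TRUE)) {
    return(character(0))
  }
  bg <- new.env()
  # Load into a private environment so the workspace is not cluttered.
  data("USSeerBG", package = "micromapST", envir = bg)
  sort(ls(bg))
}

objs <- list_border_group()
print(objs)
```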

Regions are defined in this border group as the 4 census regions in the U.S. The regions feature is enabled. The four census regions are: NorthEast, South, MidWest, and West. The states and Seer registries in each region are:

state                 Seer Registries     region
Alabama               <none>              South
Alaska                Alaska Natives      West
Arizona               Arizona Natives     West
Arkansas              <none>              South
California            California-LA,      West
                      California-Other,
                      California-SF,
                      California-SJ
Colorado              <none>              West
Connecticut           Connecticut         NorthEast
Delaware              <none>              South
District of Columbia  <none>              South
Florida               <none>              South
Georgia               Georgia-Atlanta,    South
                      Georgia-Other,
                      Georgia-Rural
Hawaii                Hawaii              West
Idaho                 <none>              West
Illinois              <none>              MidWest
Indiana               <none>              MidWest
Iowa                  Iowa                MidWest
Kansas                <none>              MidWest
Kentucky              Kentucky            South
Louisiana             Louisiana           South
Maine                 <none>              NorthEast
Maryland              <none>              South
Massachusetts         <none>              NorthEast
Michigan              Michigan-Detroit    MidWest
Minnesota             <none>              MidWest
Mississippi           <none>              South
Missouri              <none>              MidWest
Montana               <none>              West
Nebraska              <none>              MidWest
Nevada                <none>              West
New Hampshire         <none>              NorthEast
New Jersey            New Jersey          NorthEast
New Mexico            New Mexico          West
New York              <none>              NorthEast
North Carolina        <none>              South
North Dakota          <none>              MidWest
Ohio                  <none>              MidWest
Oklahoma              Oklahoma-Cherokee   South
Oregon                <none>              West
Pennsylvania          <none>              NorthEast
Rhode Island          <none>              NorthEast
South Carolina        <none>              South
South Dakota          <none>              MidWest
Tennessee             <none>              South
Texas                 <none>              South
Utah                  Utah                West
Vermont               <none>              NorthEast
Virginia              <none>              South
Washington            Washington-Seattle  West
West Virginia         <none>              South
Wisconsin             <none>              MidWest
Wyoming               <none>              West

The L3VisBorders dataset contains the outline of the United States.

The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame links the <statsDFrame> data to the boundary data for each sub-area (registry) using the registry's full name, abbreviation, or numerical identifier, depending on the setting of the rowNames call argument.

A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If a data row does not match a value in the name table, a warning is issued and the data is ignored. If no data is present for a sub-area (registry) in the name table, the sub-area (registry) is mapped but not colored.

The following is a list of the names, abbreviations, aliases and IDs for each registry in the USSeerBG border group.

Name                ab      alias string          id  counties                      region
Alaska Natives      AK-NAT  ALASKA NATIVES        18  all                           West
Arizona Natives     AZ-NAT  ARIZONA NATIVES       20  all                           West
California-LA       CA-LA   LOS ANGELES           4   Los Angeles                   West
California-SF       CA-SF   SAN FRANCISCO         2   Alameda, Contra Costa,        West
                                                      Marin, San Francisco,
                                                      San Mateo
California-SJ       CA-SJ   SAN JOSE              3   Monterey, San Benito,         West
                                                      Santa Clara, Santa Cruz
California-Other    CA-OTH  CALIFORNIA EXCLUDING  5   all other counties            West
Connecticut         CT      CONNECTICUT           1   all                           NorthEast
Georgia-Atlanta     GA-ATL  ATLANTA               6   Clayton, Cobb, DeKalb,        South
                                                      Fulton, Gwinnett
Georgia-Rural       GA-RUR  RURAL GEORGIA         8   Glascock, Greene, Hancock,    South
                                                      Jasper, Jefferson, Morgan,
                                                      Putnam, Taliaferro, Warren,
                                                      Washington
Georgia-Other       GA-OTH  GREATER GEORGIA       7   all other counties            South
Hawaii              HI      HAWAII                9   all                           West
Iowa                IA      IOWA                  10  all                           MidWest
Kentucky            KY      KENTUCKY              14  all                           South
Michigan-Detroit    MI-DET  DETROIT               15  Macomb, Oakland, Wayne        MidWest
New Jersey          NJ      NEW JERSEY            11  all                           NorthEast
New Mexico          NM      NEW MEXICO            12  all                           West
Oklahoma-Cherokee   OK-CHE  OKLAHOMA              19  Adair, Cherokee, Craig,       South
                                                      Delaware, Mayes, McIntosh,
                                                      Muskogee, Nowata, Ottawa,
                                                      Rogers, Sequoyah, Tulsa,
                                                      Wagoner, Washington
Utah                UT      UTAH                  16  all                           West
Washington-Seattle  WA-SEA  SEATTLE               17  Clallam, Grays Harbor,        West
                                                      Island, Jefferson, King,
                                                      Kitsap, Mason, Pierce,
                                                      San Juan, Skagit, Snohomish,
                                                      Thurston, Whatcom
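
Before calling micromapST, the keys in the user's data can be compared with the name table to anticipate which rows would be ignored and which registries would be left uncolored. The following is a rough sketch in base R only: the abbreviations are copied from the table above, and userData with its values is a made-up example.

```r
# Registry abbreviations copied from the name table above.
knownAbbr <- c("AK-NAT", "AZ-NAT", "CA-LA", "CA-SF", "CA-SJ", "CA-OTH", "CT",
               "GA-ATL", "GA-RUR", "GA-OTH", "HI", "IA", "KY", "MI-DET",
               "NJ", "NM", "OK-CHE", "UT", "WA-SEA")

# Hypothetical user data keyed by abbreviation; "TX" is not a Seer registry.
userData <- data.frame(ab   = c("CT", "UT", "TX"),
                       Rate = c(1.2, 3.4, 5.6),
                       stringsAsFactors = FALSE)

# Rows whose key is not in the name table (these would be ignored).
unmatched <- userData$ab[!userData$ab %in% knownAbbr]

# Registries with no data row (these would be mapped but not colored).
noData <- setdiff(knownAbbr, userData$ab)

print(unmatched)
print(noData)
```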

The rowNames = alias and the regions = TRUE features are enabled in the USSeerBG border group.

The alias option is designed to allow the package to match the registry labels created by the SEER*Stat website when exporting Seer data for analysis. The alias match is a "contains" match, so the registry field in the user data must contain one of the alias values listed in the above table. To help generalize the match, the user's registry value is first cleaned: punctuation and control characters are stripped, runs of white space (blanks, tabs, cr, lf) are reduced to a single blank, and the string is converted to upper case. The "contains" match is then performed on the cleaned string.
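
The clean-up and "contains" match described above can be approximated in base R as follows. This is only a sketch of the idea, not the package's internal code: normalize_registry and alias_match are hypothetical helper names, the registry labels are illustrative, and returning NA for zero or multiple hits is a choice made here.

```r
# Clean a registry label: white space runs become one blank, punctuation and
# control characters are removed, and the result is upper-cased.
normalize_registry <- function(x) {
  x <- gsub("[[:space:]]+", " ", x)          # tabs, cr, lf, blanks -> one blank
  x <- gsub("[[:punct:][:cntrl:]]", "", x)   # drop punctuation/control chars
  x <- gsub(" +", " ", x)                    # collapse blanks left behind
  toupper(trimws(x))
}

# Return the single alias contained in the cleaned label, or NA when there is
# no hit or more than one hit. 'registry' is a single string.
alias_match <- function(registry, aliases) {
  cleaned <- normalize_registry(registry)
  hit <- vapply(aliases, function(a) grepl(a, cleaned, fixed = TRUE), logical(1))
  if (sum(hit) == 1) aliases[hit][[1]] else NA_character_
}

# A few alias strings taken from the table above.
aliases <- c("ATLANTA", "RURAL GEORGIA", "GREATER GEORGIA")

print(normalize_registry("  Atlanta (Metropolitan) - 1975+ "))
print(alias_match("Rural Georgia - 1992+", aliases))
print(alias_match("Texas", aliases))
```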

The dataRegionOnly call parameter (when set to TRUE) instructs the package to map only the regions that contain Seer registries with data. The regions used are the four census regions: NorthEast, South, MidWest and West. The RegVisBorders data.frame contains the outline of each of these regions. For example, if Seer registry data is provided for only the New Mexico, Utah and California registries in the West region, then only the states and regional boundary for the West region are drawn.

The USSeerBG border group does not contain or support an alternate set of abbreviations. If rowNames is set to alt_ab, a warning is generated and the standard Seer registry abbreviations are used.

The following steps should be used to export data for micromapST's use from the SEER*Stat Website:

  1. Log on to the SEER*Stat website.

  2. Create the matrix of results you want in SEER*Stat.

  3. Click on Matrix, Export, Results as Text File (if you created multiple matrices of results, make sure that the one you want to export is highlighted)

  4. In the Matrix Export Options window, click on:

    1. Output variables as Labels without quotes

    2. Remove all thousands separators

    3. Output variable names before data

    4. Preserve matrix columns & rename fields

    5. Leave defaults clicked for Line delimiter, Missing Character, and Field delimiter

  5. Change names and locations of text and dictionary files from defaults to the appropriate name and directory location.

To read the resulting text file into R use the read.delim function with header = TRUE. Follow the read.delim call with a str function to verify the data was read correctly.


     dataT <- read.delim("c:/datadir/seerstat.txt", header=TRUE)
     str(dataT)
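
The same read-and-verify step can be tried without a real export by writing a small tab-delimited file first. In this sketch the file contents, column names and values are invented stand-ins; an actual SEER*Stat export will have whatever columns the matrix defines.

```r
# Write a tiny tab-delimited file that stands in for a SEER*Stat export.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Registry\tRate\tCount",
             "Connecticut\t1.5\t10",
             "Utah\t2.5\t20"), tmp)

# header = TRUE takes the column names from the first line of the file.
dataT <- read.delim(tmp, header = TRUE, stringsAsFactors = FALSE)
str(dataT)

# Basic sanity checks before handing the data to micromapST.
stopifnot(is.data.frame(dataT), nrow(dataT) == 2)
print(names(dataT))
```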
  

References

United States National Cancer Institute, Seer website at www.seer.cancer.gov; Seer software at seer.cancer.gov/seerstat. United States Census Bureau, Geography Division, "Census Regions and Divisions of the United States" (PDF). Retrieved 2013-01-10.