icd (version 3.3)

categorize_simple: Categorize codes according to a mapping

Description

This function prepares the input data for categorization and, together with the C++ matrix code, forms the core of the package. It is pure data manipulation and generalizes beyond medical data.
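To make the idea of categorization concrete, here is a minimal base R sketch of mapping codes to categories per ID. It is not the package's implementation (which relies on C++ matrix code); the names `visits`, `toy_map`, `ids`, and `categorized` are hypothetical and exist only for this illustration.

```r
# Hypothetical toy data: one row per (visit, code) pair.
visits <- data.frame(
  visit_id = c("v1", "v1", "v2", "v3"),
  code = c("2780", "311", "3004", "4019"),
  stringsAsFactors = FALSE
)

# Hypothetical map: category name -> codes belonging to that category.
toy_map <- list(
  OBESE = c("2780", "27800"),
  DEPRESS = c("3004", "311")
)

# IDs in the order in which they first appear in the input.
ids <- unique(visits$visit_id)

# For each category, flag the IDs that have at least one matching code.
# The result is a logical matrix: one row per ID, one column per category.
categorized <- vapply(
  toy_map,
  function(codes) ids %in% visits$visit_id[visits$code %in% codes],
  logical(length(ids))
)
rownames(categorized) <- ids
print(categorized)
```

Nothing in the sketch is specific to medical codes: any data frame of IDs and codes, together with any named list of code vectors, can be categorized the same way.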

Usage

categorize_simple(x, map, id_name, code_name, return_df = FALSE,
  return_binary = FALSE, restore_id_order = TRUE, unique_ids = FALSE,
  preserve_id_type = FALSE, comorbid_fun = comorbidMatMulSimple)

Arguments

x

Data frame containing a column for an 'id' and a column for a code, e.g., an ICD-10 code.

map

Named list of character vectors of codes, one vector per category. For example, the AHRQ ICD-9 comorbidity map contains list(OBESE = c("2780", "27800", "27801", "27803", "V8554", "79391", "64910", "64911", "64912", "64913", "64914", "V8530", "V8531", "V8532", "V8533", "V8534", "V8535", "V8536", "V8537", "V8538", "V8539", "V8541", "V8542", "V8543", "V8544", "V8545"), DEPRESS = c("3004", "30112", "3090", "3091", "311")) among other, longer groups.
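As a rough illustration of the expected shape of `map`, the following base R sketch builds a small named list and inspects it. The object `map_excerpt` is hypothetical: its DEPRESS entry repeats the codes quoted above, while its OBESE entry is deliberately shortened to the first four codes and is not the full AHRQ group.

```r
# Hypothetical excerpt of a map: a named list of character vectors.
map_excerpt <- list(
  OBESE = c("2780", "27800", "27801", "27803"),  # shortened for illustration
  DEPRESS = c("3004", "30112", "3090", "3091", "311")
)

# Structural checks: a list, fully named, with character vectors as elements.
is_valid_map <- is.list(map_excerpt) &&
  !is.null(names(map_excerpt)) &&
  all(nzchar(names(map_excerpt))) &&
  all(vapply(map_excerpt, is.character, logical(1)))

# Find which categories contain a given code.
categories_for_311 <- names(
  Filter(function(codes) "311" %in% codes, map_excerpt)
)
print(is_valid_map)
print(categories_for_311)
```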

id_name

String with the name of the column in x that contains the unique identifier.

code_name

String with name of column containing the codes.

return_df

Single logical value. If TRUE, return the result as a data frame with the first column being the visit_id, and the second being the count. If visit_id was a factor or named differently in the input, this is preserved.

return_binary

Logical value, if TRUE, the output will be in 0s and 1s instead of TRUE and FALSE.
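The difference between logical and binary output can be sketched in base R as below. This only illustrates the two representations; how the package performs the conversion internally is not shown here, and the matrix `flags` is a hypothetical stand-in for a categorization result.

```r
# Hypothetical logical result: rows are IDs, columns are categories.
flags <- matrix(
  c(TRUE, FALSE, FALSE, TRUE),
  nrow = 2,
  dimnames = list(c("v1", "v2"), c("OBESE", "DEPRESS"))
)

# Convert TRUE/FALSE to 1/0 while keeping dimensions and dimnames.
binary <- flags
storage.mode(binary) <- "integer"
print(binary)

# Integer output can be summed directly, e.g. IDs per category.
ids_per_category <- colSums(binary)
print(ids_per_category)
```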

restore_id_order

Logical value. If TRUE, the default, the visit IDs in the output appear in the order in which they were first encountered in the input data. Restoring this order takes about a third of the calculation time on data with tens of millions of rows, so if the visit IDs will be discarded when summarizing the data, set this to FALSE for a substantial speed-up.
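The notion of "order first encountered" can be illustrated with a short base R sketch. It assumes, purely for illustration, that some computation returned its rows sorted by ID; `input_ids`, `result`, and `restored` are hypothetical names, and the package may restore the order differently.

```r
# Hypothetical input IDs, with repeats, in their original order.
input_ids <- c("v3", "v1", "v3", "v2", "v1")

# unique() keeps the first occurrence of each ID, in input order.
first_seen <- unique(input_ids)

# Assume a computation returned one row per ID, sorted alphabetically.
result <- matrix(
  c(TRUE, FALSE, TRUE),
  ncol = 1,
  dimnames = list(c("v1", "v2", "v3"), "DEPRESS")
)

# Reorder the rows to match the order of first appearance in the input.
restored <- result[first_seen, , drop = FALSE]
print(rownames(restored))
```

If the result is only going to be aggregated, for example with `colSums()`, the row order has no effect on the outcome, which is why skipping this step is safe in that case.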

unique_ids

Single logical value, if TRUE then the visit IDs in column given by id_name are assumed to be unique. Otherwise, the default action is to ensure they are unique.

preserve_id_type

Single logical value, if TRUE, the visit ID column will be converted back to its original type. The default of FALSE means only factors and character types are restored in the returned data frame. For matrices, the row names are necessarily stored as character vectors.

comorbid_fun

Function to be called to do the comorbidity calculation, passed as the function symbol itself, not as a character string.
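The distinction between a function symbol and a character string can be shown with a generic base R sketch. The helper `apply_calc` is hypothetical and unrelated to the package's internals; it only demonstrates a function-valued argument that rejects a string.

```r
# Hypothetical helper that takes a function as an argument.
apply_calc <- function(values, calc_fun = sum) {
  # Insist on an actual function object, not its name as a string.
  stopifnot(is.function(calc_fun))
  calc_fun(values)
}

# Passing the symbol works.
by_symbol <- apply_calc(1:4, calc_fun = max)

# Passing a string fails the is.function() check.
by_string <- try(apply_calc(1:4, calc_fun = "max"), silent = TRUE)

print(by_symbol)
print(inherits(by_string, "try-error"))
```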

Examples

# NOT RUN {
u <- uranium_pathology
m <- icd10_map_ahrq
u$icd10 <- decimal_to_short(u$icd10)
j <- categorize_simple(u, m, id_name = "case", code_name = "icd10")
# }
