This is modelled entirely on a single chunk of SAS code, but can hopefully be generalised. It relies heavily on lists and regular expressions but, as the code shows, R is not a great language in which to write a SAS parser.
sas_format_extract(sas_lines)
sas_format_extract_rcomfmt(sas_lines)
sas_lines is a character vector, with one item per line, e.g. from readLines
Returns a list (of lists).
sas_format_extract_rcomfmt: Get just the $RCOMFMT assignment, which contains all the ICD (not DRG) data. The problem is RENLFAIL appears twice:
"N183", "N184", "N185", "N186", "N189", "N19", "Z4901", "Z4902",
"Z9115", "Z940", "Z992"="RENLFAIL" /*Dependence on renal dialysis*/
"Z4931", "Z4932"="RENLFAIL" /*Encounter for adequacy testing for peritoneal dialysis*/
so RENLFAIL needs special treatment.
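
One way to handle a name that is assigned more than once is to append its codes to any existing entry of the same name instead of overwriting it. The following is only a minimal sketch in base R, under the assumption that each assignment has the form of one or more quoted codes followed by ="NAME", as in the snippet above. `parse_sas_assignments` is a hypothetical helper written for illustration; it is not the implementation of sas_format_extract.

```r
# Hypothetical helper for illustration only, not the package's own code.
# Assumes assignments look like: "code", "code"="NAME" /* optional comment */
parse_sas_assignments <- function(sas_lines) {
  # Strip /* ... */ comments, then join the lines so that an assignment
  # split across several lines becomes a single string.
  no_comments <- gsub("/\\*.*?\\*/", "", sas_lines, perl = TRUE)
  txt <- paste(no_comments, collapse = " ")

  # One assignment: a run of quoted codes, then ="NAME".
  pattern <- "((?:\"[^\"]*\"\\s*,?\\s*)+)=\\s*\"([^\"]*)\""
  assignments <- regmatches(txt, gregexpr(pattern, txt, perl = TRUE))[[1]]

  out <- list()
  for (a in assignments) {
    # Pull out every quoted string; the last one is the assigned name.
    quoted <- regmatches(a, gregexpr("\"[^\"]*\"", a))[[1]]
    quoted <- gsub("\"", "", quoted)
    name <- quoted[length(quoted)]
    codes <- quoted[-length(quoted)]
    # Append rather than overwrite, so repeated names are merged.
    out[[name]] <- c(out[[name]], codes)
  }
  out
}

sas_lines <- c(
  '"N183", "N184", "N185", "N186", "N189", "N19", "Z4901", "Z4902",',
  '"Z9115", "Z940", "Z992"="RENLFAIL" /*Dependence on renal dialysis*/',
  '"Z4931", "Z4932"="RENLFAIL" /*Encounter for adequacy testing for peritoneal dialysis*/'
)
merged <- parse_sas_assignments(sas_lines)
```

With this input, `merged` holds a single RENLFAIL entry containing the codes from both assignments. Appending keeps the first assignment's codes ahead of the second's, which makes the result easy to check against the SAS source.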
http://support.sas.com/documentation/cdl/en/proc/61895/HTML/default/viewer.htm#a002473474.htm