admiral (version 1.2.0)

derive_vars_cat: Derive Categorization Variables Like AVALCATy and AVALCAyN



Usage

derive_vars_cat(dataset, definition, by_vars = NULL)


Value

The input dataset with the new variables defined in definition added.



Arguments

dataset

Input dataset

The variables specified by the by_vars and definition arguments are expected to be in the dataset.


definition

List of expressions created by exprs(). Must be in rectangular format and use the same syntax as the tribble() function for creating a tibble. Inside this function, the definition object is converted to a tibble using tribble().

Must contain:

  • the column condition, which is converted to a logical expression and evaluated on the input dataset.

  • at least one additional column whose name is the new variable name and whose values are the categories assigned when the condition holds.

  • the column specified in by_vars (if by_vars is specified).

e.g. if by_vars is not specified:

exprs(~condition,   ~AVALCAT1, ~AVALCA1N,
      AVAL >= 140, ">=140 cm",         1,
      AVAL < 140,   "<140 cm",         2)

e.g. if by_vars is specified as exprs(VSTEST):

exprs(~VSTEST,   ~condition,  ~AVALCAT1, ~AVALCA1N,
      "Height", AVAL >= 140, ">=140 cm",         1,
      "Height",  AVAL < 140,  "<140 cm",         2)


by_vars

List of expressions with one element. NULL by default. Allows specifying by groups, e.g. exprs(PARAMCD). The variable must be present in both dataset and definition. The conditions in definition are applied only to those records that match by_vars. The categorization variables are set to NA for records not matching any of the by groups in definition.
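The by-group behaviour can be illustrated with a small sketch. The dataset vs and the definition height_only are made up for this illustration; the definition covers only the "Height" by group, so the "Weight" record is expected to receive NA in the new variables, as described above.

```r
library(admiral)
library(tibble)

# Hypothetical two-record dataset, for illustration only
vs <- tribble(
  ~USUBJID, ~VSTEST,  ~AVAL,
  "01",     "Height",   150,
  "01",     "Weight",    60
)

# The definition only covers the "Height" by group
height_only <- exprs(
  ~VSTEST,   ~condition,  ~AVALCAT1, ~AVALCA1N,
  "Height", AVAL >= 140, ">=140 cm",         1,
  "Height",  AVAL < 140,  "<140 cm",         2
)

result <- derive_vars_cat(
  dataset = vs,
  definition = height_only,
  by_vars = exprs(VSTEST)
)

# Inspect the categories per by group
result[, c("VSTEST", "AVAL", "AVALCAT1", "AVALCA1N")]
```

If "Weight" records should be categorized as well, rows for that by group have to be added to the definition.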


Details

If conditions overlap, the row order of definition must be considered carefully: the first match determines the category. For example, if

AVAL = 155

and the definition is:

definition <- exprs(
  ~VSTEST,   ~condition,  ~AVALCAT1, ~AVALCA1N,
  "Height",  AVAL > 170,  ">170 cm",         1,
  "Height", AVAL <= 170, "<=170 cm",         2,
  "Height", AVAL <= 160, "<=160 cm",         3
)

then AVALCAT1 will be "<=170 cm", as this is the first match for AVAL. If you specify:

definition <- exprs(
  ~VSTEST,   ~condition,  ~AVALCAT1, ~AVALCA1N,
  "Height", AVAL <= 160, "<=160 cm",         3,
  "Height", AVAL <= 170, "<=170 cm",         2,
  "Height",  AVAL > 170,  ">170 cm",         1
)

Then AVAL <= 160 will lead to AVALCAT1 == "<=160 cm", AVAL greater than 160 and at most 170 will lead to AVALCAT1 == "<=170 cm", and AVAL > 170 will lead to AVALCAT1 == ">170 cm".

However, we suggest being more explicit when defining the conditions, so that they do not overlap. In this case, the middle condition should be: AVAL <= 170 & AVAL > 160
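The first-match rule can be illustrated with dplyr::case_when(), which also returns the value of the first condition that holds. This is only an analogy for the behaviour described above, not a statement about how derive_vars_cat() is implemented; the heights data and the object names are made up for the sketch.

```r
library(dplyr)

# Hypothetical values, one per height band
heights <- tibble::tibble(AVAL = c(155, 165, 175))

# Overlapping conditions: the row order decides the result
overlapping <- heights %>%
  mutate(
    AVALCAT1 = case_when(
      AVAL > 170 ~ ">170 cm",
      AVAL <= 170 ~ "<=170 cm",
      AVAL <= 160 ~ "<=160 cm" # never reached, AVAL <= 170 matches first
    )
  )

# Mutually exclusive conditions: the row order no longer matters
explicit <- heights %>%
  mutate(
    AVALCAT1 = case_when(
      AVAL > 170 ~ ">170 cm",
      AVAL <= 170 & AVAL > 160 ~ "<=170 cm",
      AVAL <= 160 ~ "<=160 cm"
    )
  )

overlapping$AVALCAT1 # "<=170 cm" "<=170 cm" ">170 cm"
explicit$AVALCAT1    # "<=160 cm" "<=170 cm" ">170 cm"
```

With the overlapping conditions, AVAL = 155 ends up in "<=170 cm". With the explicit conditions it ends up in "<=160 cm", whatever the order of the rows.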

See Also

General derivation functions for all ADaMs that return the dataset with the new variable(s) appended: derive_var_extreme_flag(), derive_var_joined_exist_flag(), derive_var_merged_ef_msrc(), derive_var_merged_exist_flag(), derive_var_merged_summary(), derive_var_obs_number(), derive_var_relative_flag(), derive_vars_computed(), derive_vars_joined(), derive_vars_merged(), derive_vars_merged_lookup(), derive_vars_transposed()


Examples

library(admiral)
library(dplyr)

advs <- tibble::tribble(
  ~USUBJID,       ~VSTEST,  ~AVAL,
  "01-701-1015", "Height", 147.32,
  "01-701-1015", "Weight",  53.98,
  "01-701-1023", "Height", 162.56,
  "01-701-1023", "Weight",     NA,
  "01-701-1028", "Height",     NA,
  "01-701-1028", "Weight",     NA,
  "01-701-1033", "Height", 175.26,
  "01-701-1033", "Weight",  88.45
)

definition <- exprs(
  ~condition,                        ~AVALCAT1, ~AVALCA1N,  ~NEWCOL,
  VSTEST == "Height" & AVAL > 160,   ">160 cm",         1, "extra1",
  VSTEST == "Height" & AVAL <= 160, "<=160 cm",         2, "extra2"
)

derive_vars_cat(
  dataset = advs,
  definition = definition
)

# Using by_vars:
definition2 <- exprs(
  ~VSTEST,   ~condition,  ~AVALCAT1, ~AVALCA1N,
  "Height",  AVAL > 160,  ">160 cm",         1,
  "Height", AVAL <= 160, "<=160 cm",         2,
  "Weight",   AVAL > 70,   ">70 kg",         1,
  "Weight",  AVAL <= 70,  "<=70 kg",         2
)

derive_vars_cat(
  dataset = advs,
  definition = definition2,
  by_vars = exprs(VSTEST)
)

# With three conditions:
definition3 <- exprs(
  ~VSTEST,                ~condition,  ~AVALCAT1, ~AVALCA1N,
  "Height",               AVAL > 170,  ">170 cm",         1,
  "Height", AVAL <= 170 & AVAL > 160, "<=170 cm",         2,
  "Height",              AVAL <= 160, "<=160 cm",         3
)

derive_vars_cat(
  dataset = advs,
  definition = definition3,
  by_vars = exprs(VSTEST)
)

# Let's derive both the MCRITyML and the MCRITyMN variables
adlb <- tibble::tribble(
  ~USUBJID,     ~PARAM, ~AVAL, ~AVALU,  ~ANRHI,
  "01-701-1015", "ALT",   150,  "U/L",      40,
  "01-701-1023", "ALT",    70,  "U/L",      40,
  "01-701-1036", "ALT",   130,  "U/L",      40,
  "01-701-1048", "ALT",    30,  "U/L",      40,
  "01-701-1015", "AST",    50,  "U/L",      35
)

definition_mcrit <- exprs(
  ~PARAM,                      ~condition,    ~MCRIT1ML, ~MCRIT1MN,
  "ALT",                    AVAL <= ANRHI,    "<=ANRHI",         1,
  "ALT", ANRHI < AVAL & AVAL <= 3 * ANRHI, ">1-3*ANRHI",         2,
  "ALT",                 3 * ANRHI < AVAL,   ">3*ANRHI",         3
)

adlb %>%
  derive_vars_cat(
    definition = definition_mcrit,
    by_vars = exprs(PARAM)
  )
