labelmachine

labelmachine is an R package that helps you assign meaningful labels to data sets. You can also manage your labels in so-called lama-dictionary files, which are yaml files. This makes it easy to reuse the same label translations across multiple projects that share a similar data structure.

Labeling your data can be easy!

Installation

# Install release version from CRAN
install.packages("labelmachine")

# Install development version from GitHub
devtools::install_github('a-maldet/labelmachine', build_vignettes = TRUE)

Concept

The label assignments are given in so-called translations (named character vectors), which work like recipes: they specify which original value is mapped onto which new label. The translations are collected in so-called lama_dictionary objects. These lama_dictionary objects are then used to translate your data frame variables.
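To make the idea of a translation concrete, the following is a minimal base R sketch of a named character vector used as a lookup table. It only illustrates the concept and is not how labelmachine works internally; the variable names and the manual handling of the "NA_" key are assumptions made for this example.

```r
# A translation: names are the original values, elements are the new labels.
# The special name "NA_" stands for missing values (NA).
translation <- c(en = "English", ma = "Mathematics", NA_ = "other subjects")

# Illustrative input vector containing a missing value
subject <- c("en", "ma", NA, "en")

# Replace NA by the escape key "NA_" so it can be used for the lookup
keys <- ifelse(is.na(subject), "NA_", subject)

# Look up the new label for each original value
labels <- unname(translation[keys])

# Turn the labels into a factor whose levels follow the translation order
subject_new <- factor(labels, levels = unique(translation))
```

With labelmachine, the lookup, the NA handling and the factor conversion are done by the package once the translation is stored in a lama_dictionary.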

Usage

Let df be a data frame with marks and subjects that should be translated:

df <- data.frame(
  pupil_id = c(1, 1, 2, 2, 3),
  subject = c("en", "ma", "ma", "en", "en"),
  result = c(2, 1, 3, 2, NA),
  stringsAsFactors = FALSE
)
df
##   pupil_id subject result
## 1        1      en      2
## 2        1      ma      1
## 3        2      ma      3
## 4        2      en      2
## 5        3      en     NA

Create a lama_dictionary object holding the translations:

library(labelmachine)
dict <- new_lama_dictionary(
  subjects = c(en = "English", ma = "Mathematics", NA_ = "other subjects"),
  results = c("1" = "Excellent", "2" = "Satisfying", "3" = "Failed", NA_ = "Missed")
)
dict
## 
## --- lama_dictionary ---
## Variable 'subjects':
##               en               ma              NA_ 
##        "English"    "Mathematics" "other subjects" 
## 
## Variable 'results':
##            1            2            3          NA_ 
##  "Excellent" "Satisfying"     "Failed"     "Missed"

Translate the data frame variables:

df_new <- lama_translate(
  df,
  dict,
  subject_new = subjects(subject),
  result_new = results(result)
)
str(df_new)
## 'data.frame':    5 obs. of  5 variables:
##  $ pupil_id   : num  1 1 2 2 3
##  $ subject    : chr  "en" "ma" "ma" "en" ...
##  $ result     : num  2 1 3 2 NA
##  $ subject_new: Factor w/ 3 levels "English","Mathematics",..: 1 2 2 1 1
##  $ result_new : Factor w/ 4 levels "Excellent","Satisfying",..: 2 1 3 2 4

Highlights

labelmachine offers the following features:

  • All types of variables can be translated: Logical, Numeric, Character, Factor
  • When translating your variables, you may choose between keeping the current ordering or applying a new factor ordering to your variable.
  • Assigning meaningful labels to missing values (NA) is no problem.
  • Assigning NA to existing values is no problem.
  • Merging two values into a single label is no problem.
  • Transforming a data frame holding label assignment lists into a lama_dictionary is no problem.
  • Manage your translations in yaml files in order to use the same translations in different projects sharing similar data.
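Some of the features listed above can be sketched as follows. This example is an illustration under stated assumptions: the data frame, the extra result values "4" and "0", and the temporary file path are made up for this sketch, and the calls assume the function signatures of labelmachine 1.0.0 (lama_write() taking the dictionary and a file path, lama_read() taking a file path).

```r
library(labelmachine)

# Illustrative data (hypothetical values, not taken from the package docs)
df <- data.frame(
  pupil_id = c(1, 2, 3, 4, 5, 6),
  result = c(2, 1, 3, 4, NA, 0),
  stringsAsFactors = FALSE
)

dict <- new_lama_dictionary(
  results = c(
    "1" = "Excellent",
    "2" = "Satisfying",
    "3" = "Failed",  # "3" and "4" are merged into one label
    "4" = "Failed",
    NA_ = "Missed",  # missing values get a meaningful label
    "0" = NA         # the existing value 0 is turned into NA
  )
)

# Translate the variable 'result' into a new variable 'result_new'
df_new <- lama_translate(df, dict, result_new = results(result))

# Store the dictionary in a yaml file and read it back in,
# e.g. in order to reuse it in another project
yaml_path <- tempfile(fileext = ".yaml")
lama_write(dict, yaml_path)
dict_from_file <- lama_read(yaml_path)
```

In a real project the yaml file would live at a shared, stable location rather than in a temporary file.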

Further reading

A short introduction can be found here: Get started

Version

1.0.0

License

GPL-3

Maintainer

Adrian Maldet

Last Published

October 11th, 2019

Functions in labelmachine (1.0.0)

  • is.syntactic: Check if a variable name is syntactically valid
  • lama_merge: Merge multiple lama-dictionaries into one
  • lama_translate_all: Assign new labels to all variables of a data.frame
  • lama_get
  • lama_translate: Assign new labels to a variable of a data.frame
  • escape_to_na: Replace "NA_" by NA
  • check_and_translate_df: Checks arguments and translate a data.frame
  • composerr_: Compose error handlers (concatenate error messages)
  • new_lama_dictionary: Create a new lama_dictionary class object
  • dictionary_to_yaml: Transform data structure from lama_dictionary class input format to the yaml format
  • print.lama_dictionary
  • is.lama_dictionary
  • contains_na_escape: Check if a character vector contains NA replacement strings
  • lama_rename
  • na_to_escape: Replace NA by "NA_"
  • rename_translation: Function that actually performs the renaming of the translations
  • yaml_to_dictionary
  • named_lapply: Create a named list with lapply from a character vector
  • lama_mutate
  • stringify: Coerce a vector into a character string ('x1', 'x2', ...)
  • lama_read: Read in a yaml file holding translations for one or multiple variables
  • lama_write: Write a yaml file holding translations for one or multiple variables
  • lama_select
  • translate_df: This function relabels several variables in a data.frame
  • validate_lama_dictionary
  • lapplI: Improve lapply and sapply with index
  • translate_vector: This function relabels a vector
  • validate_translation: Check if an object has a valid translation structure
  • check_and_translate_all: Check and translate function used by lama_translate_all() and lama_to_factor_all()
  • check_and_translate_vector: Checks arguments and translate a vector
  • check_arguments
  • check_select
  • NA_lama_: NA replace string
  • check_and_translate_vector_: Checks arguments and translate a character vector (standard eval)
  • check_and_translate_df_: Checks arguments and translate a data.frame (standard eval)
  • check_rename
  • as.lama_dictionary