editrules (version 2.9.5)

editrules_package: An overview of the function of package editrules

Description

Please note: active development has moved to packages 'validate' and 'errorlocate'. Facilitates reading and manipulating (multivariate) data restrictions (edit rules) on numerical and categorical data. Rules can be defined with common R syntax and parsed to an internal (matrix-like) format. Rules can be manipulated with variable elimination and value substitution methods, allowing for feasibility checks and more. Data can be tested against the rules and erroneous fields can be found based on Fellegi and Holt's generalized principle. Rule dependencies can be visualized using the 'igraph' package.

NOTE

This package is no longer under active development. The package is superseded by R packages validate for data validation and errorlocate for error localization. We urge new users to use those packages instead.

The editrules package aims to provide an environment to conveniently define, read and check recordwise data constraints, including:

  • Linear (in)equality constraints for numerical data

  • Constraints on value combinations of categorical data

  • Conditional constraints on numerical and/or mixed data

In the literature these constraints, or restrictions, are referred to as "edits". editrules can perform common rule set manipulations like variable elimination and value substitution, and offers error localization functionality based on the (generalized) paradigm of Fellegi and Holt. Under this paradigm, one determines the smallest (weighted) number of variables to adapt such that no (additional or derived) rules are violated. The paradigm is based on the assumption that errors are distributed randomly over the variables and there is no detectable cause of error. It also decouples the detection of corrupt variables from their correction. For some types of error, such as sign flips, typing errors or rounding errors, this assumption does not hold. Such errors have a detectable cause, and detecting them is closely tied to resolving them. The reader is referred to the deducorrect package for treating such errors.
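As a rough illustration of the paradigm described above, the sketch below localizes errors in a single record. The rule set, variable names and data are made up for this example; it assumes editrules is installed and relies only on editmatrix and localizeErrors, both listed further down this page.

```r
library(editrules)

# Hypothetical rule set: one balance edit and two non-negativity edits
E <- editmatrix(c("x + y == z", "x >= 0", "y >= 0"))

# A single made-up record; only the edit x >= 0 is violated
dat <- data.frame(x = -1, y = 2, z = 1)

# Unweighted localization. Adapting x alone cannot work, because the
# balance edit would force x = z - y = -1 again, so at least two
# fields have to be adapted.
loc <- localizeErrors(E, dat)
loc$adapt   # logical matrix: records in rows, variables in columns

# Weighted localization: a higher weight marks z as more reliable,
# so the cheapest solution adapts x and y and leaves z alone
loc_w <- localizeErrors(E, dat, weight = c(1, 1, 3))
loc_w$adapt
```

Without weights, adapting x together with y and adapting x together with z are equally cheap, so either pair may be reported. The weights in the second call are one way to break that tie.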

I. Define edits

editrules provides several methods for creating edits from a character vector, an expression, a data.frame or a text file.

editfile      Read conditional numerical, numerical and categorical constraints from a text file
editset       Create conditional numerical, numerical and categorical constraints
editmatrix    Create a linear constraint matrix for numerical data
editarray     Create value combination constraints for categorical data
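A minimal sketch of the three in-memory constructors follows. All rules and variable names are invented for illustration, and the example assumes the editrules package is installed.

```r
library(editrules)

# Linear (in)equality edits on numerical data, given as character strings
E_num <- editmatrix(c(
  "turnover == profit + cost",
  "cost >= 0",
  "turnover >= 0"
))

# Value-combination edits on categorical data, given as expressions
E_cat <- editarray(expression(
  gender %in% c("male", "female"),
  pregnant %in% c("yes", "no"),
  if (gender == "male") pregnant == "no"
))

# A mixed rule set: numerical, categorical and conditional edits together
E_mix <- editset(expression(
  x + y == z,
  A %in% c("a", "b"),
  B %in% c("b", "c"),
  if (A == "a") B == "b",
  if (A == "b") x >= 0
))

print(E_num)
print(E_cat)
print(E_mix)
```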

II. Check and find errors in data

editrules provides several methods for checking data.frames against edits.

violatedEdits     Find out which record violates which edit.
localizeErrors    Localize erroneous fields using Fellegi and Holt's principle.
errorLocalizer    Low-level error localization function using B&B algorithm

Note that you can call plot, summary and print on results of these functions.
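The following sketch shows one way to check a small data.frame against a rule set with violatedEdits. The rules and records are hypothetical, and editrules is assumed to be installed.

```r
library(editrules)

# Hypothetical rule set and three made-up records
E <- editmatrix(c("x + y == z", "x >= 0", "y >= 0"))
dat <- data.frame(
  x = c(1, -1, 2),
  y = c(2,  2, 2),
  z = c(3,  1, 5)
)

# Logical matrix with one row per record and one column per edit;
# TRUE means the record violates that edit
v <- violatedEdits(E, dat)
print(v)

# Number of violated edits per record; unclass() drops the extra class
# attribute so the result is treated as a plain logical matrix
n_violated <- rowSums(unclass(v))
n_violated
```

In this data the first record satisfies every rule, the second breaks the non-negativity edit on x, and the third breaks the balance edit.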

IV. Manipulate and check edits

editrules provides several methods for manipulating edits.

substValue       Substitute a value in a set of rules
eliminate        Derive implied rules by variable elimination
reduce           Remove unconstrained variables
isFeasible       Check for contradictions
duplicated       Find duplicated rules
blocks           Decompose rules into independent blocks
disjunct         Decouple conditional edits into disjunct edit sets
separate         Decompose rules into blocks and decouple conditional edits
generateEdits    Generate all nonredundant implicit edits (editarray only)
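As a small sketch of how these manipulations combine, the example below substitutes a value, checks feasibility and eliminates a variable. The rule set is invented for illustration and editrules is assumed to be installed.

```r
library(editrules)

# Hypothetical rule set
E <- editmatrix(c("x + y == z", "x >= 0", "y >= 0"))

# The rules as given do not contradict each other
feasible_before <- isFeasible(E)

# Fix z at -1: now x + y == -1 must hold with x >= 0 and y >= 0,
# which no record can satisfy
E_sub <- substValue(E, "z", -1)
feasible_after <- isFeasible(E_sub)

# Eliminate x to derive the rules implied for the remaining variables
E_elim <- eliminate(E, "x")
print(E_elim)
```

Substituting a value and then calling isFeasible is one way to test whether a particular observed value can still be completed to a valid record.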

V. Plot and coerce edits

editrules provides several methods for plotting and coercion.

editrules.plotting    Plot edit-variable connectivity graph
as.igraph             Coerce to edit-variable connectivity igraph object
as.character          Coerce edits to character representation
as.data.frame         Store character representation in data.frame
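The coercion functions can be sketched as a simple round trip. The rules are hypothetical and editrules is assumed to be installed. Plotting and as.igraph are left out here because they additionally require the igraph package.

```r
library(editrules)

# Hypothetical rule set
E <- editmatrix(c("x + y == z", "x >= 0"))

# Character representation: one string per edit
rules_chr <- as.character(E)
rules_chr

# data.frame representation, convenient for storing rules in a table
rules_df <- as.data.frame(E)
rules_df

# The stored rules can be read back into an editmatrix
E_again <- editmatrix(rules_df)
print(E_again)
```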

See Also