qdapRegex (version 0.7.8)

rm_number: Remove/Replace/Extract Numbers

Description

rm_number - Remove/replace/extract numbers from a string (works on numbers with commas, decimals, and negatives).

as_numeric - A wrapper for as.numeric(gsub(",", "", x)), which removes commas and converts a list of vectors of strings to numeric. If a string cannot be converted to numeric, NA is returned.

as_numeric2 - A convenience function for as_numeric that unlists and returns a vector rather than a list.
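
As a rough illustration of the behavior described above, the following base R sketch mimics the two helpers. The names as_numeric_sketch and as_numeric2_sketch are hypothetical stand-ins, not the package's source, and the warning suppression is an assumption made here to keep the output quiet.

```r
# Hypothetical stand-ins for as_numeric() and as_numeric2(); base R only.
as_numeric_sketch <- function(x) {
  # Strip commas from each element, then coerce; unparseable strings become NA.
  lapply(x, function(v) suppressWarnings(as.numeric(gsub(",", "", v))))
}

as_numeric2_sketch <- function(x) {
  # Same conversion, flattened into a single numeric vector.
  unlist(as_numeric_sketch(x))
}

# A list of character vectors, shaped like the result of an extraction.
extracted <- list(c("-2", "-4.3"), c("123,456", "-.2"), NA_character_)

as_list <- as_numeric_sketch(extracted)   # list of numeric vectors
as_vec  <- as_numeric2_sketch(extracted)  # single numeric vector
```

The list form keeps one element per input string, which is convenient when the numbers must stay aligned with the text they came from; the unlisted form suits quick summaries.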

Usage

rm_number(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_number",
  replacement = "",
  extract = FALSE,
  dictionary = getOption("regex.library"),
  ...
)

as_numeric(x)

as_numeric2(x)

ex_number(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_number",
  replacement = "",
  extract = TRUE,
  dictionary = getOption("regex.library"),
  ...
)

Value

rm_number - Returns a character string with numbers removed.

as_numeric - Returns a list of vectors of numbers.

as_numeric2 - Returns an unlisted vector of numbers.

Arguments

text.var

The text variable.

trim

logical. If TRUE, removes leading and trailing white space.

clean

logical. If TRUE, extra white spaces and escaped characters will be removed.

pattern

A character string containing a regular expression (or character string for fixed = TRUE) to be matched in the given character vector. The default, "@rm_number", looks up the rm_number regex in the regular expression dictionary supplied via the dictionary argument.

replacement

Replacement for matched pattern.

extract

logical. If TRUE, the numbers are extracted into a list of vectors.

dictionary

A dictionary of canned regular expressions to search within if pattern begins with "@rm_".

...

Other arguments passed to gsub.

x

A character vector to convert to a numeric vector.

References

The number regular expression was created by Jason Gray.

See Also

gsub, stri_extract_all_regex

Other rm_ functions: rm_abbreviation(), rm_between(), rm_bracket(), rm_caps_phrase(), rm_caps(), rm_citation_tex(), rm_citation(), rm_city_state_zip(), rm_city_state(), rm_date(), rm_default(), rm_dollar(), rm_email(), rm_emoticon(), rm_endmark(), rm_hash(), rm_nchar_words(), rm_non_ascii(), rm_non_words(), rm_percent(), rm_phone(), rm_postal_code(), rm_repeated_characters(), rm_repeated_phrases(), rm_repeated_words(), rm_tag(), rm_time(), rm_title_name(), rm_url(), rm_white(), rm_zip()

Examples

x <- c("-2 is an integer.  -4.3 and 3.33 are not.",
    "123,456 is 0 alot -123456 more than -.2", "and 3456789123 fg for 345.",
    "fg 12,345 23 .44 or 18.", "don't remove this 444,44", "hello world -.q")

rm_number(x)
ex_number(x)

## Convert to numeric
as_numeric(ex_number(x))   # retain list
as_numeric2(ex_number(x))  # unlist
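
The replacement and extract arguments from the Usage section can be exercised in the same way. The snippet below is a small sketch that assumes qdapRegex is installed; the "<NUM>" placeholder and the variable names are illustrative choices, and no output is shown because the exact spacing depends on the trim and clean settings.

```r
library(qdapRegex)

y <- c("fg 12,345 23 .44 or 18.", "hello world")

## Replace each matched number with a placeholder instead of deleting it
replaced <- rm_number(y, replacement = "<NUM>")

## Per the Usage section, ex_number() differs from rm_number() only in
## its extract default, so these two calls should agree
via_rm <- rm_number(y, extract = TRUE)
via_ex <- ex_number(y)
identical(via_rm, via_ex)
```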