RVerbalExpressions

The goal of RVerbalExpressions is to make regular expressions easier to construct, using grammar and functionality inspired by VerbalExpressions. Using the pipe operator %>% is encouraged, so that an expression is built up in a chain-like fashion, one step at a time.
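To make the chaining idea concrete, here is a minimal sketch in base R. The helpers add_start, add_find and add_maybe are hypothetical names invented for this illustration and are not the package's implementation. Each takes the pattern built so far as its first argument and returns it with one more piece appended, which is what lets a pipe thread the pattern through the steps. The sketch uses the native pipe |>, which assumes R 4.1 or later, in place of %>%.

```r
# Hypothetical helpers for illustration only (not RVerbalExpressions code).
# Each one receives the pattern so far and appends one more piece.
add_start <- function(pattern = "") paste0(pattern, "^")
add_find  <- function(pattern, value) paste0(pattern, "(", value, ")")
add_maybe <- function(pattern, value) paste0(pattern, "(", value, ")?")

# Chained form: the left-hand result becomes the first argument of the next call.
piped <- add_start() |>
  add_find("http") |>
  add_maybe("s")

# Equivalent nested form, read inside-out.
nested <- add_maybe(add_find(add_start(), "http"), "s")

identical(piped, nested)
#> [1] TRUE
piped
#> [1] "^(http)(s)?"
grepl(piped, c("https://x", "http://x", "ftp://x"))
#> [1]  TRUE  TRUE FALSE
```

These helpers insert value verbatim, so they are only safe for text without regex metacharacters. The package lists a sanitize function for escaping such characters.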

Installation

Install the released version of RVerbalExpressions from CRAN:

install.packages("RVerbalExpressions")

Or install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("VerbalExpressions/RVerbalExpressions")

Example

This basic example shows how to build a regular expression:

library(RVerbalExpressions)

# construct an expression
x <- rx_start_of_line() %>% 
  rx_find('http') %>% 
  rx_maybe('s') %>% 
  rx_find('://') %>% 
  rx_maybe('www.') %>% 
  rx_anything_but(' ') %>% 
  rx_end_of_line()

# print the expression
x
#> [1] "^(http)(s)?(\\://)(www\\.)?([^ ]*)$"

# test for a match
grepl(x, "https://www.google.com")
#> [1] TRUE
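
Because the result is an ordinary character string, it can be passed to any base R function that accepts a pattern. The sketch below is one possible follow-up, not part of the original example. It assumes the pattern is exactly the string printed above (copied in literally so the code runs without the package), and the input vector urls is made up for illustration.

```r
# Pattern copied from the printed output above (assumed unchanged).
pattern <- "^(http)(s)?(\\://)(www\\.)?([^ ]*)$"

# Hypothetical inputs for illustration.
urls <- c(
  "https://www.google.com",
  "http://example.org",
  "ftp://example.org",
  "https://has space.com"
)

# Test every element at once; grepl() is vectorised over its input.
is_url <- grepl(pattern, urls)
is_url
#> [1]  TRUE  TRUE FALSE FALSE

# Keep only the matching elements.
urls[is_url]
#> [1] "https://www.google.com" "http://example.org"

# Pull out the full match followed by each parenthesised group.
parts <- regmatches(urls[1], regexec(pattern, urls[1]))[[1]]
parts
#> [1] "https://www.google.com" "http" "s" "://" "www." "google.com"
```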

Other Implementations

You can see an up-to-date list of all ports on VerbalExpressions.github.io.

Additionally, there are two R packages that try to solve the same problem. I encourage you to check these out:

  1. rex by @kevinushey
  2. rebus by @richierocks

Contributing

If you find any issues, typos, etc., please file an issue or submit a PR. All contributions are welcome!


Monthly Downloads: 163

Version: 0.1.1

License: MIT + file LICENSE


Last Published: March 20th, 2024

Functions in RVerbalExpressions (0.1.1)

rx_word
Match a word.

rx_either_of
Alternatively, match either expression.

rx_line_break
Match a line break.

rx_something_but
Match any character(s) except these at least once.

rx_tab
Match a tab character.

rx_uppercase
Match upper case letters.

rx_something
Match any character(s) at least once.

rx_word_edge
Find beginning or end of a word.

rx_lowercase
Match lower case letters.

sanitize
Escape characters that regex engines treat as special.

rx_end_capture
End a capture group.

rx_punctuation
Match punctuation characters.

rx_one_or_more
Match the previous stuff one or more times.

rx_word_char
Match a word character.

rx_range
Match any character within the range defined by the parameters.

rx_with_any_case
Control case-insensitive matching.

rx_whitespace
Match a whitespace character.

rx_seek_prefix
Positive lookaround functions.

rx_alnum
Match alphanumeric characters.

rx_anything_but
Match any character(s) except these any (including zero) number of times.

%>%
Pipe operator.

rx_begin_capture
Begin a capture group.

rx_alpha
Match alphabetic characters.

rx_digit
Match a digit (0–9).

rx_avoid_prefix
Negative lookaround functions.

rx_anything
Match any character(s) any (including zero) number of times.

rx_any_of
Match any of these characters exactly once.

rx
Constructs a Verbal Expression.

rx_end_of_line
Match the expression only if it appears till the end of the line.

rx_maybe
Optionally match an expression.

rx_none_or_more
Match the previous stuff zero or many times.

rx_not
Ensure that the parameter does not follow.

rx_find
Match an expression.

rx_multiple
Match the previous group any number of times.

rx_space
Match a space character.

rx_start_of_line
Match the expression only if it appears from beginning of line.
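
Several of the descriptions above differ only in quantifier ("any number of times, including zero" versus "at least once") or in the kind of lookaround. The sketch below illustrates those distinctions with hand-written regular expressions in base R. It does not call the package, and the exact strings the rx_ functions emit are not shown here. The only confirmed mapping is from the example earlier, where rx_anything_but(' ') produced ([^ ]*). Reading "prefix" lookaround as a lookbehind is an assumption made for this illustration.

```r
# "Any number of times (including zero)" versus "at least once".
zero_or_more <- "^[^ ]*$"  # same character class as the ([^ ]*) seen earlier
one_or_more  <- "^[^ ]+$"  # assumed analogue for the "at least once" variants

inputs <- c("", "abc", "a b")
grepl(zero_or_more, inputs)
#> [1]  TRUE  TRUE FALSE
grepl(one_or_more, inputs)
#> [1] FALSE  TRUE FALSE

# Positive lookbehind: match digits only when "USD" comes right before them.
# Lookaround syntax needs perl = TRUE in base R.
prices <- "USD100 EUR200"
after_usd <- regmatches(prices, regexpr("(?<=USD)[0-9]+", prices, perl = TRUE))
after_usd
#> [1] "100"

# Negative lookbehind: match "100" only when "USD" does NOT come before it.
not_after_usd <- grepl("(?<!USD)100", c("USD100", "EUR100"), perl = TRUE)
not_after_usd
#> [1] FALSE  TRUE
```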