rm_time: Remove/Replace/Extract Time

Description

rm_time - Remove/replace/extract time from a string.

rm_transcript_time - Remove/replace/extract transcript specific time stamps from a string.

as_time - Convert a time stamp removed by rm_time or rm_transcript_time to a standard time format (HH:MM:SS.OS) and optionally convert to as.POSIXlt.

as_time2 - A convenience function for as_time that unlists and returns a vector rather than a list.

Usage

rm_time(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_time",
  replacement = "",
  extract = FALSE,
  dictionary = getOption("regex.library"),
  ...
)
rm_transcript_time(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_transcript_time",
  replacement = "",
  extract = FALSE,
  dictionary = getOption("regex.library"),
  ...
)
as_time(x, as.POSIXlt = FALSE, millisecond = TRUE)
as_time2(x, ...)
ex_time(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_time",
  replacement = "",
  extract = TRUE,
  dictionary = getOption("regex.library"),
  ...
)
ex_transcript_time(
  text.var,
  trim = !extract,
  clean = TRUE,
  pattern = "@rm_transcript_time",
  replacement = "",
  extract = TRUE,
  dictionary = getOption("regex.library"),
  ...
)

Value

Returns a character string with time removed. If extract = TRUE, the times are instead returned as a list of vectors.

Arguments

text.var: The text variable.
trim: logical. If TRUE removes leading and trailing white spaces.
clean: logical. If TRUE extra white spaces and escaped characters will be removed.
pattern: A character string containing a regular expression (or character string for fixed = TRUE) to be matched in the given character vector (see Details for additional information). The default, "@rm_time", uses the rm_time regex from the regular expression dictionary supplied via the dictionary argument.
replacement: Replacement for matched pattern.
extract: logical. If TRUE the times are extracted into a list of vectors.
dictionary: A dictionary of canned regular expressions to search within if pattern begins with "@rm_".
...: Other arguments passed to gsub.
x: A list with extracted time stamps.
as.POSIXlt: logical. If TRUE the output will be converted to as.POSIXlt.
millisecond: logical. If TRUE milliseconds are retained. If FALSE they are rounded and added to seconds.
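
The millisecond argument's rounding can be pictured with a small base R sketch. This is an assumption about the behavior described above, not the package's actual implementation; round_stamp is a hypothetical helper that expects an "HH:MM:SS.OS" string, rounds the fractional seconds, and carries any overflow into minutes and hours.

```r
# Hypothetical helper for illustration only; not part of qdapRegex.
# Assumes the input is a single "HH:MM:SS.OS" string.
round_stamp <- function(stamp) {
  # Split into hours, minutes, and (fractional) seconds
  parts <- as.numeric(strsplit(stamp, ":", fixed = TRUE)[[1]])
  # Work in total seconds so that rounding up to 60 carries correctly
  total <- parts[1] * 3600 + parts[2] * 60 + round(parts[3])
  sprintf("%02d:%02d:%02d",
          as.integer(total %/% 3600),
          as.integer((total %% 3600) %/% 60),
          as.integer(total %% 60))
}

round_stamp("00:09:33.75")
```

Working in total seconds avoids producing an invalid stamp such as "00:59:60" when the seconds round up to a full minute.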

Author

stackoverflow's hwnd and Tyler Rinker <tyler.rinker@gmail.com>.

Details

The default regular expression used by rm_time finds times with no AM/PM. This behavior can be altered by using a secondary regular expression from the regex_usa data (or other dictionary) via pattern = "@rm_time2". See Examples for example usage.
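
The difference between a plain time pattern and an AM/PM-aware one can be sketched in base R. The two patterns below are simplified stand-ins chosen for illustration; they are not the expressions stored under "@rm_time" or "@rm_time2" in the dictionary, and the base R calls only approximate what the package functions do (for instance, regmatches returns character(0) for elements with no match).

```r
# Simplified, hypothetical patterns for illustration only; these are NOT
# the expressions stored in the qdapRegex dictionary.
time_plain <- "\\d{1,2}:\\d{2}(:\\d{2})?"
time_ampm  <- paste0(time_plain, "\\s?([AaPp]\\.?[Mm]\\.?)")

x <- c("Meet at 3:04 AM", "Leave by 12:04", "no time here")

# Extract: one character vector of matches per input element
extract_plain <- regmatches(x, gregexpr(time_plain, x, perl = TRUE))
extract_ampm  <- regmatches(x, gregexpr(time_ampm, x, perl = TRUE))

# Remove: substitute the replacement, then trim leading/trailing white space
removed <- trimws(gsub(time_plain, "", x, perl = TRUE))
```

With these stand-in patterns, the plain version picks up the digits of "3:04 AM" but leaves the "AM" behind, while the AM/PM version only matches times that carry a meridiem marker.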

References

The time regular expression was taken from: https://stackoverflow.com/a/25111133/1000343

Examples

x <-  c("R uses 1:5 for 1, 2, 3, 4, 5.", 
    "At 3:00 we'll meet up and leave by 4:30:20",
    "We'll meet at 6:33.", "He ran it in :22.34")

rm_time(x)
ex_time(x)

## With AM/PM
x <- c(
    "I'm getting 3:04 AM just fine, but...",
    "for 10:47 AM I'm getting 0:47 AM instead.",
    "no time here",
    "Some time has 12:04 with no AM/PM after it",
    "Some time has 12:04 a.m. or the form 1:22 pm"
)

ex_time(x)
ex_time(x, pat="@rm_time2")
rm_time(x, pat="@rm_time2")
ex_time(x, pat=pastex("@rm_time2", "@rm_time"))

# Convert to standard format
as_time(ex_time(x))
as_time(ex_time(x), as.POSIXlt = TRUE)
as_time(ex_time(x), as.POSIXlt = FALSE, millisecond = FALSE) 

# Transcript specific time stamps
x2 <-c(
    '08:15 8 minutes and 15 seconds	00:08:15.0',
    '3:15 3 minutes and 15 seconds	not 1:03:15.0',
    '01:22:30 1 hour 22 minutes and 30 seconds	01:22:30.0',
    '#00:09:33-5# 9 minutes and 33.5 seconds	00:09:33.5',
    '00:09.33,75 9 minutes and 33.5 seconds	00:09:33.75'
)

rm_transcript_time(x2)
(out <- ex_transcript_time(x2))

as_time(out)
as_time(out, TRUE)
as_time(out, millisecond = FALSE)

if (FALSE) {
if (!require("pacman")) install.packages("pacman")
pacman::p_load(chron)
lapply(as_time(out), chron::times)
lapply(as_time(out, , FALSE), chron::times)
}