adagio (version 0.9.2)

occurs: Finding Subsequences

Description

Counts items, or finds subsequences of (integer) sequences.

Usage

count(x, sorted = TRUE)

occurs(subseq, series)

Value

count returns a list with components v, the items, and e, the number of times each item appears in the array.

occurs returns a vector of indices, the positions where the subsequence appears in the series.

Arguments

x

array of items, i.e. numbers or characters.

sorted

logical; default is to sort items beforehand.

subseq

vector of integers.

series

vector of integers.

Details

count counts the items, similar to table, but is just as fast and returns a more tractable output. If sorted is TRUE, the total number per item is counted; otherwise the count is per repetition.
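The two counting modes can be illustrated with base R alone. This is only a sketch and not the adagio implementation: it assumes that "per repetition" means "per run of consecutive equal items", which this page does not spell out, and it imitates the v/e list layout by hand.

```r
# Base-R sketch only; this is not the adagio implementation.
x <- c(2, 2, 1, 1, 1, 2, 3, 3)

# Total count per distinct item (the behaviour described for sorted = TRUE).
tab <- table(x)
totals <- list(v = as.numeric(names(tab)), e = as.vector(tab))

# Count per run of equal neighbours (ASSUMED meaning of "per repetition").
r <- rle(x)
runs <- list(v = r$values, e = r$lengths)

str(totals)  # v: 1 2 3    e: 3 3 2
str(runs)    # v: 2 1 2 3  e: 2 3 1 2
```

If that reading is right, count(x, sorted = FALSE) would be closer to rle(x) than to table(x); checking against the installed package is advisable.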

If m and n are the lengths of s and S respectively, then occurs(s, S) determines all positions i such that s equals S[i], ..., S[i+m-1], i.e. all(s == S[i:(i+m-1)]) in R notation.
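The definition above can be written down directly as a naive reference version in base R. The helper name occurs_naive is hypothetical and not part of adagio; it is meant for checking results on small inputs, not for speed.

```r
# Hypothetical helper: naive reference version of the definition, base R only.
occurs_naive <- function(subseq, series) {
  m <- length(subseq)
  n <- length(series)
  if (m == 0 || m > n) return(integer(0))
  starts <- seq_len(n - m + 1)
  # Keep every start i where the window series[i:(i+m-1)] equals subseq.
  hits <- vapply(starts,
                 function(i) all(series[i:(i + m - 1)] == subseq),
                 logical(1))
  starts[hits]
}

patrn <- c(1, 2, 3, 4)
exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
occurs_naive(patrn, exmpl)
## [1]  6 13 23
```

This checks every window one by one, so its cost grows with (n - m + 1) * m comparisons.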

The code is vectorized and relatively fast. The author intends to complement it with an implementation of the Rabin-Karp algorithm, and possibly of the Knuth-Morris-Pratt and Boyer-Moore algorithms.
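One possible way to vectorize the search is to start with all candidate positions and filter them once per pattern element. The function occurs_vec below is a hypothetical sketch of that idea; it is not claimed to be how adagio implements occurs.

```r
# Hypothetical sketch of a vectorized search; not taken from adagio.
occurs_vec <- function(subseq, series) {
  m <- length(subseq)
  n <- length(series)
  if (m == 0 || m > n) return(integer(0))
  # Every position where a full-length window still fits.
  cand <- seq_len(n - m + 1)
  for (k in seq_len(m)) {
    # Keep only candidates whose k-th element matches the pattern.
    cand <- cand[series[cand + k - 1] == subseq[k]]
    if (length(cand) == 0) break
  }
  cand
}

occurs_vec(c(1, 1), c(1, 1, 1))   # overlapping matches are reported
## [1] 1 2
```

The loop runs over the pattern (length m) rather than over the series, so each step is a single vectorized comparison and the candidate set usually shrinks quickly.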

Examples

##  Examples
library(adagio)
patrn <- c(1,2,3,4)
exmpl <- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
occurs(patrn, exmpl)
## [1]  6 13 23

if (FALSE) {
set.seed(2437)
p <- sample(1:20, 1000000, replace=TRUE)
system.time(i <- occurs(c(1,2,3,4,5), p))  #=>  [1] 799536
##  user  system elapsed 
## 0.017   0.000   0.017 [sec]

system.time(cnt <- count(p))   # 'cnt' avoids masking base::c
##  user  system elapsed 
## 0.075   0.000   0.076 
print(cnt)
## $v
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
## $e
##  [1] 49904 50216 49913 50154 49967 50045 49747 49883 49851 49893
## [11] 50193 50024 49946 49828 50319 50279 50019 49990 49839 49990
}