Biostrings (version 2.40.2)

matchprobes: A function to match a query sequence to the sequences of a set of probes.

Description

The query sequence, a character string (probably representing a transcript of interest), is scanned for the presence of exact matches to the sequences in the character vector records. The indices of the set of matches are returned.

The function is inefficient: it works on R's character vectors, and the matching algorithm compares every query against every record, so its running time grows with length(query) times length(records).

See matchPattern, vmatchPattern and matchPDict for more efficient sequence matching functions.
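As a rough illustration of the alternatives named above, the following sketch searches one target sequence for a small set of probes with matchPattern and matchPDict. The probe and target sequences are made up for this example, and it assumes the Biostrings package is installed; it is not taken from the package's own examples.

```r
library(Biostrings)

# Made-up probes (all the same width) and a made-up target sequence.
probes <- DNAStringSet(c("ACGTAC", "TTGACC", "GGGGGG"))
target <- DNAString("ACGTACTTGACC")

# One pattern against one subject.
single <- matchPattern(probes[[1]], target)
start(single)

# Many patterns against one subject, via a preprocessed dictionary.
pdict <- PDict(probes)
hit_counts <- countPDict(pdict, target)
matched <- which(hit_counts > 0)       # indices of probes found in the target
starts <- startIndex(matchPDict(pdict, target))
matched
starts[matched]
```

Here `matched` plays the role of the index vector that matchprobes returns, and `starts` holds every start position per probe.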

Usage

matchprobes(query, records, probepos=FALSE)

Arguments

query
A character vector. For example, each element may represent a gene (transcript) of interest. See Details.
records
A character vector. For example, each element may represent the probes on a DNA array.
probepos
A logical value. If TRUE, also return the start positions of the matches in the query sequence.

Value

A list. Its first element is a list of the same length as query. Each element of that list is a numeric vector containing the indices of the probes (elements of records) that have a perfect match in the corresponding query sequence. If probepos is TRUE, the returned list has a second element: it has the same shape as the first and gives the respective start positions of the matches.

Details

toupper is applied to the arguments query and records before matching. The intention of this is to make the matching case-insensitive. The function is embarrassingly naive. The matching is done using the C library function strstr.
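The behaviour described above can be approximated in base R as a minimal sketch. The function name naive_match_probes is hypothetical and is not part of Biostrings. It uses regexpr with fixed=TRUE as a stand-in for strstr and, like strstr, reports only the first occurrence of each record in a query; whether matchprobes reports further occurrences is not stated here.

```r
# Hypothetical helper, not part of Biostrings: a base-R emulation of the
# described behaviour (upper-case both inputs, then plain substring search).
naive_match_probes <- function(query, records, probepos = FALSE) {
  query <- toupper(query)
  records <- toupper(records)

  # For one query, the first start position of every record (-1 if absent).
  first_hit <- function(q) {
    vapply(records,
           function(r) as.integer(regexpr(r, q, fixed = TRUE)),
           integer(1), USE.NAMES = FALSE)
  }
  hits <- lapply(query, first_hit)

  # First element: per query, the indices of the records that matched.
  result <- list(lapply(hits, function(h) which(h > 0)))
  # Optional second element: per query, the matching start positions.
  if (probepos) {
    result[[2]] <- lapply(hits, function(h) h[h > 0])
  }
  result
}

res <- naive_match_probes("acgtTTGA", c("CGT", "ttg", "GGG"), probepos = TRUE)
res[[1]][[1]]  # indices of the matching records
res[[2]][[1]]  # start positions within the query
```

The nested loop over queries and records makes the cost noted in the Description visible: every record is searched for in every query.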

See Also

matchPattern, vmatchPattern, matchPDict

Examples

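  # Requires the hgu95av2probe annotation package; skipped if it is unavailable.
  if(require("hgu95av2probe")){
    data("hgu95av2probe")
    # Take the first 20 probe sequences.
    seq <- hgu95av2probe$sequence[1:20]
    # Build an artificial target by concatenating them, so every probe matches.
    target <- paste(seq, collapse="")
    # Return the matching probe indices and their start positions in the target.
    matchprobes(target, seq, probepos=TRUE)
  }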