protr (version 1.7-4)

removeGaps: Remove or replace gaps from protein sequences.

Description

Remove or replace gaps or any other irregular characters in protein sequences, to make them suitable for feature extraction or for sequence-alignment-based similarity computation.

Usage

removeGaps(x, pattern = "-", replacement = "", ...)

Value

A vector of protein sequence(s) with gaps or irregular characters removed or replaced.

Arguments

x

character vector, containing the input protein sequence(s).

pattern

character string containing the gap (or other irregular) character to be removed or replaced. Default is "-". For advanced usage, see gsub.

replacement

a replacement for matched characters. Default is "" (remove the matched character).

...

additional parameters for gsub.
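
The argument descriptions above point to gsub for advanced usage. The following is a minimal sketch of what that could look like, assuming removeGaps passes pattern, replacement, and ... through to gsub unchanged (this page implies it but does not state it). The sketch calls base R gsub directly so it runs without protr installed; the toy sequences and variable names are made up for illustration.

```r
# Toy sequences (hypothetical) containing gaps and other irregular characters
seqs <- c("MHG-DTP.TL", "AC--DE*FG")

# Default-style behaviour: delete every "-"
no_gaps <- gsub("-", "", seqs)

# pattern is a regular expression, so a character class can strip
# several irregular characters ("-", "." and "*") in one pass
no_irregular <- gsub("[-.*]", "", seqs)

# A non-empty replacement substitutes instead of deleting
gaps_as_x <- gsub("-", "X", seqs)

# Extra gsub arguments such as fixed = TRUE make the pattern literal,
# so "." matches only a dot rather than any character
no_dots <- gsub(".", "", seqs, fixed = TRUE)

print(no_gaps)
print(no_irregular)
print(gaps_as_x)
print(no_dots)
```

If the pass-through assumption holds, the equivalent removeGaps calls would be removeGaps(seqs), removeGaps(seqs, pattern = "[-.*]"), removeGaps(seqs, replacement = "X"), and removeGaps(seqs, pattern = ".", fixed = TRUE).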

Author

Nan Xiao <https://nanx.me>

Examples

# amino acid sequences that contain gaps ("-")
aaseq <- list(
  "MHGDTPTLHEYMLDLQPETTDLYCYEQLSDSSE-EEDEIDGPAGQAEPDRAHYNIVTFCCKCDSTLRLCVQS",
  "MHGDTPTLHEYMLDLQPETTDLYCYEQLNDSSE-EEDEIDGPAGQAEPDRAHYNIVTFCCKCDSTLRLCVQS"
)
if (FALSE) {
# gaps create issues for alignment
parSeqSim(aaseq)

# remove the gaps
nogapseq <- removeGaps(aaseq)
parSeqSim(nogapseq)
}
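
As a quick sanity check before feature extraction, one might confirm that the cleaned sequences no longer contain gaps and consist only of the 20 standard amino acid letters. The sketch below uses base R only, with gsub standing in for removeGaps under the assumption that the two behave the same for the default arguments. The variable names are illustrative.

```r
# The two example sequences from above, as a character vector
aaseq <- c(
  "MHGDTPTLHEYMLDLQPETTDLYCYEQLSDSSE-EEDEIDGPAGQAEPDRAHYNIVTFCCKCDSTLRLCVQS",
  "MHGDTPTLHEYMLDLQPETTDLYCYEQLNDSSE-EEDEIDGPAGQAEPDRAHYNIVTFCCKCDSTLRLCVQS"
)

# Stand-in for removeGaps(aaseq): delete every "-"
cleaned <- gsub("-", "", aaseq)

# No literal "-" should remain in any sequence
any_gaps_left <- any(grepl("-", cleaned, fixed = TRUE))

# Number of characters removed from each sequence
removed_per_seq <- nchar(aaseq) - nchar(cleaned)

# TRUE where a sequence uses only the 20 standard amino acid letters
standard_only <- grepl("^[ACDEFGHIKLMNPQRSTVWY]+$", cleaned)

print(any_gaps_left)
print(removed_per_seq)
print(standard_only)
```

A sequence for which standard_only is FALSE still contains some other irregular character, which could then be handled with a broader pattern.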