cleaver (version 1.10.2)

cleave: Cleavage of polypeptide sequences

Description

These functions cleave polypeptide sequences. Use cleavageSites to find the cleavage sites, cleavageRanges to find the cleavage ranges, and cleave to get the cleavage products.

Usage

## methods for x of class character, AAString, or AAStringSet
cleave(x, enzym = "trypsin", missedCleavages = 0, custom = NULL, unique = TRUE)
cleavageRanges(x, enzym = "trypsin", missedCleavages = 0, custom = NULL)
cleavageSites(x, enzym = "trypsin", custom = NULL)

Arguments

x
polypeptide sequences (a character vector, an AAString, or an AAStringSet).
enzym
character, cleavage rule.
missedCleavages
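numeric, number of missed cleavages. A vector such as 0:1 requests the products for each listed number of missed cleavages (see Examples).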
custom
character, of length 1 or 2. Can be used to define custom cleavage rules. The first element is the cleavage pattern and the optional second element is an exception (non-cleavage) pattern. Perl-like regular expressions are supported; see gregexpr for details. If custom is set, enzym is ignored.
unique
logical, if TRUE all duplicated cleavage products per peptide are removed.
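
How the two elements of custom interact can be illustrated in base R. The following is a minimal sketch, not cleaver's actual implementation: find_sites is a hypothetical helper that locates matches of the pattern with gregexpr(perl = TRUE) and then drops every position that also matches the exception pattern.

```r
## Sketch only (base R); find_sites() is a hypothetical helper,
## not a function exported by cleaver.
find_sites <- function(x, pattern, exception = NULL) {
  pos <- function(p) {
    m <- gregexpr(p, x, perl = TRUE)[[1]]
    m[m > 0]  # gregexpr() returns -1 when nothing matches
  }
  sites <- pos(pattern)
  if (!is.null(exception)) {
    # positions matched by the exception pattern are not cleaved
    sites <- setdiff(sites, pos(exception))
  }
  # a "cut" after the last residue would not split anything
  sites[sites < nchar(x)]
}

enob <- "SITKIKAREILD"
sites_k <- find_sites(enob, "K")                  # K at positions 4 and 6
sites_k_not_a <- find_sites(enob, "K", "K(?=A)")  # only the K at position 4 remains
```

The second call mirrors custom = c("K", "K(?=A)") from the Examples section, where the K followed by A is excluded.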

Value

cleave
If x is a character vector, a list of the same length as x is returned. Each element contains a character vector with the cleavage products of the corresponding polypeptide. If x is an AAString, an AAStringSet is returned; if x is an AAStringSet, an AAStringSetList of the same length as x is returned. Each element contains an AAString or an AAStringSet instance with the corresponding cleavage products.
cleavageRanges
If x is a character vector, a list of the same length as x is returned. Each element contains a two-column matrix with the start and end positions of the peptides. If x is an AAString, an IRanges is returned; if x is an AAStringSet, an IRangesList of the same length as x is returned.
cleavageSites
Returns a list of the same length as x. Each element contains an integer vector with the cleavage positions.

Overview:

Input        cleave             cleavageRanges  cleavageSites
character    list of character  list of matrix  list of integer
AAString     AAStringSet        IRanges         list of integer
AAStringSet  AAStringSetList    IRangesList     list of integer
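
The relationship between cleavage sites, ranges and missed cleavages can be sketched in base R. The helper sites_to_ranges below is hypothetical and is not part of cleaver; it only assumes the behaviour visible in the Examples section, namely that m missed cleavages join m + 1 neighbouring fragments.

```r
## Sketch only (base R); sites_to_ranges() is a hypothetical helper.
sites_to_ranges <- function(sites, len, missed = 0) {
  start <- c(1, sites + 1)  # every fragment starts after a cleavage site
  end <- c(sites, len)      # and ends at the next site or the sequence end
  n <- length(start)
  out <- lapply(missed, function(m) {
    if (m >= n) return(NULL)  # not enough fragments to join
    i <- seq_len(n - m)
    cbind(start = start[i], end = end[i + m])
  })
  do.call(rbind, out)
}

enob <- "SITKIKAREILD"
## trypsin sites of enob (after K4, K6 and R8), as implied by the Examples
r <- sites_to_ranges(c(4, 6, 8), nchar(enob), missed = 0:2)
peptides <- substring(enob, r[, "start"], r[, "end"])
```

With these inputs, peptides holds the same nine products, in the same order, as the cleave(enob, "trypsin", missedCleavages=0:2) call in the Examples section.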

Details

The cleavage rules are taken from: http://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html

Cleavage rules (cleavage between P1 and P1'):

Rule name                   P4               P3               P2         P1                 P1'              P2'
arg-c proteinase            -                -                -          R                  -                -
asp-n endopeptidase         -                -                -          -                  D                -
bnps-skatole-c              -                -                -          W                  -                -
caspase1                    F,W,Y,L          -                H,A,T      D                  not P,E,D,Q,K,R  -
caspase2                    D                V                A          D                  not P,E,D,Q,K,R  -
caspase3                    D                M                Q          D                  not P,E,D,Q,K,R  -
caspase4                    L                E                V          D                  not P,E,D,Q,K,R  -
caspase5                    L,W              E                H          D                  -                -
caspase6                    V                E                H,I        D                  not P,E,D,Q,K,R  -
caspase7                    D                E                V          D                  not P,E,D,Q,K,R  -
caspase8                    I,L              E                T          D                  not P,E,D,Q,K,R  -
caspase9                    L                E                H          D                  -                -
caspase10                   I                E                A          D                  -                -
chymotrypsin-high           -                -                -          F,Y                not P            -
chymotrypsin-high           -                -                -          W                  not M,P          -
chymotrypsin-low            -                -                -          F,L,Y              not P            -
chymotrypsin-low            -                -                -          W                  not M,P          -
chymotrypsin-low            -                -                -          M                  not P,Y          -
chymotrypsin-low            -                -                -          H                  not D,M,P,W      -
clostripain                 -                -                -          R                  -                -
cnbr                        -                -                -          M                  -                -
enterokinase                D,E              D,E              D,E        K                  -                -
factor xa                   A,F,G,I,L,T,V,M  D,E              G          R                  -                -
formic acid                 -                -                -          D                  -                -
glutamyl endopeptidase      -                -                -          D                  -                -
granzyme-b                  I                E                P          D                  -                -
hydroxylamine               -                -                -          N                  G                -
iodosobenzoic acid          -                -                -          W                  -                -
lysc                        -                -                -          K                  -                -
lysn                        -                -                -          -                  K                -
neutrophil elastase         -                -                -          A,V                -                -
ntcb                        -                -                -          -                  C                -
pepsin1.3                   -                not H,K,R        not P      not R              F,L,W,Y          not P
pepsin1.3                   -                not H,K,R        not P      F,L,W,Y            -                not P
pepsin                      -                not H,K,R        not P      not R              F,L              not P
pepsin                      -                not H,K,R        not P      F,L                -                not P
proline endopeptidase       -                -                not H,K,R  P                  not P            -
proteinase k                -                -                -          A,E,F,I,L,T,V,W,Y  -                -
staphylococcal peptidase i  -                -                not E      E                  -                -
thermolysin                 -                -                -          not D,E            A,F,I,L,M,V      -
thrombin                    -                -                G          R                  G                -
thrombin                    A,F,G,I,L,T,V,M  A,F,G,I,L,T,V,W  P          R                  not D,E          not D,E
trypsin                     -                -                -          K,R                not P            -
trypsin                     -                -                W          K                  P                -
trypsin                     -                -                M          R                  P                -

Exceptions:

Rule name                   P4               P3               P2         P1                 P1'              P2'
trypsin                     -                -                C,D        K                  D                -
trypsin                     -                -                C          K                  H,Y              -
trypsin                     -                -                C          R                  K                -
trypsin                     -                -                R          R                  H,R              -
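
A rule written in this position notation can be read as a Perl-like regular expression: the P1 entry becomes the matched residue and a "not" entry at P1' becomes a negative lookahead. The sketch below translates the two chymotrypsin-high rows this way. Both the pattern and the test sequence are illustrative assumptions, not the rule as stored inside cleaver.

```r
## Sketch only (base R): hand-written translation of the two
## chymotrypsin-high rows (F,Y not before P; W not before M,P).
rule <- "[FY](?!P)|W(?![MP])"

x <- "AFPGYKWMSWT"  # made-up sequence, not a real protein
m <- gregexpr(rule, x, perl = TRUE)[[1]]
## keep real matches and ignore a match on the last residue,
## because cleavage needs a P1' residue to cut in front of
sites <- m[m > 0 & m < nchar(x)]

## F2 is followed by P and W7 by M, so only Y5 and W10 are cleaved
fragments <- substring(x, c(1, sites + 1), c(sites, nchar(x)))
```

Since custom accepts Perl-like regular expressions, a pattern of this kind could presumably also be passed as cleave(x, custom = rule).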

Rule name                   Enzyme name
arg-c proteinase            Arg-C proteinase
asp-n endopeptidase         Asp-N endopeptidase
bnps-skatole-c              BNPS-Skatole
caspase1                    Caspase 1
caspase2                    Caspase 2
caspase3                    Caspase 3
caspase4                    Caspase 4
caspase5                    Caspase 5
caspase6                    Caspase 6
caspase7                    Caspase 7
caspase8                    Caspase 8
caspase9                    Caspase 9
caspase10                   Caspase 10
chymotrypsin-high           Chymotrypsin-high specificity (C-term to [FYW], not before P)
chymotrypsin-low            Chymotrypsin-low specificity (C-term to [FYWML], not before P)
clostripain                 Clostripain (Clostridiopeptidase B)
cnbr                        CNBr
enterokinase                Enterokinase
factor xa                   Factor Xa
formic acid                 Formic acid
glutamyl endopeptidase      Glutamyl endopeptidase
granzyme-b                  Granzyme B
hydroxylamine               Hydroxylamine
iodosobenzoic acid          Iodosobenzoic acid
lysc                        LysC
lysn                        LysN
neutrophil elastase         Neutrophil elastase
ntcb                        NTCB (2-nitro-5-thiocyanobenzoic acid)
pepsin1.3                   Pepsin (pH == 1.3)
pepsin                      Pepsin (pH > 2)
proline endopeptidase       Proline-endopeptidase
proteinase k                Proteinase K
staphylococcal peptidase i  Staphylococcal Peptidase I
thermolysin                 Thermolysin
thrombin                    Thrombin
trypsin                     Trypsin

References

Gasteiger E., Hoogland C., Gattiker A., Duvaud S., Wilkins M.R., Appel R.D., Bairoch A.; "Protein Identification and Analysis Tools on the ExPASy Server". (In) John M. Walker (ed): The Proteomics Protocols Handbook, Humana Press (2005). http://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html

See Also

AAString, AAStringSet, AAStringSetList, IRanges, IRangesList

Examples

library("cleaver")

## Gastric juice peptide 1 (UniProtKB/Swiss-Prot: GAJU_HUMAN/P01358)
gaju <- "LAAGKVEDSD"

cleave(gaju, "trypsin")
# $LAAGKVEDSD
# [1] "LAAGK" "VEDSD"

cleavageRanges(gaju, "trypsin")
# $LAAGKVEDSD
#      start end
# [1,]     1   5
# [2,]     6  10

cleavageSites(gaju, "trypsin")
# $LAAGKVEDSD
# [1] 5

cleave(gaju, "trypsin", missedCleavages=1)
# $LAAGKVEDSD
# [1] "LAAGKVEDSD"

cleavageRanges(gaju, "trypsin", missedCleavages=1)
# $LAAGKVEDSD
#      start end
# [1,]     1  10

cleave(gaju, "trypsin", missedCleavages=0:1)
# $LAAGKVEDSD
# [1] "LAAGK" "VEDSD" "LAAGKVEDSD"

cleavageRanges(gaju, "trypsin", missedCleavages=0:1)
# $LAAGKVEDSD
#      start end
# [1,]     1   5
# [2,]     6  10
# [3,]     1  10


cleave(gaju, "pepsin")
# $LAAGKVEDSD
# [1] "LAAGKVEDSD"
# (no cleavage)


## use AAStringSet
gaju <- AAStringSet("LAAGKVEDSD")

cleave(gaju)
# AAStringSetList of length 1
# [["LAAGKVEDSD"]] LAAGK VEDSD


## Beta-enolase (UniProtKB/Swiss-Prot: ENOB_THUAL/P86978)
enob <- "SITKIKAREILD"

cleave(enob, "trypsin")
# $SITKIKAREILD
# [1] "SITK" "IK"   "AR"   "EILD"

cleave(enob, "trypsin", missedCleavages=2)
# $SITKIKAREILD
# [1] "SITKIKAR" "IKAREILD"

cleave(enob, "trypsin", missedCleavages=0:2)
# $SITKIKAREILD
# [1] "SITK"     "IK"       "AR"       "EILD"     "SITKIK"   "IKAR"
# [7] "AREILD"   "SITKIKAR" "IKAREILD"

## define own cleavage rule: cleave at K
cleave(enob, custom="K")
# $SITKIKAREILD
# [1] "SITK"   "IK"     "AREILD"

cleavageRanges(enob, custom="K")
# $SITKIKAREILD
#      start end
# [1,]     1   4
# [2,]     5   6
# [3,]     7  12

## define own cleavage rule: cleave at K but not if followed by A
cleave(enob, custom=c("K", "K(?=A)"))
# $SITKIKAREILD
# [1] "SITK"     "IKAREILD"

cleavageRanges(enob, custom=c("K", "K(?=A)"))
# $SITKIKAREILD
#      start end
# [1,]     1   4
# [2,]     5  12

cleavageSites(enob, custom=c("K", "K(?=A)"))
# $SITKIKAREILD
# [1] 4