PredCRG_Enc: Encoding of protein sequence data into numeric feature vectors based on PredCRG features.
Description
Before protein sequences can be used for prediction with the proposed model, they must be transformed into numeric feature vectors. The function PredCRG_Enc transforms each protein sequence into a numeric vector of 62 values, based on the compositional, physico-chemical and transitional features used in the PredCRG model.
Usage
PredCRG_Enc(prot_seq)
Arguments
prot_seq
Sequence dataset to be supplied as input; must be an object of class AAStringSet.
Value
A matrix of dimension n x 62, where n is the number of input sequences (one row per sequence).
Details
The dataset must contain protein sequences composed of standard amino acid residues only. An object of class AAStringSet can be obtained by reading a FASTA file with readAAStringSet, available in the Bioconductor package Biostrings.
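The requirements above can be illustrated with a minimal sketch. It is not taken from the package itself. The helper standard_only, the example sequences and their names are hypothetical, and "standard residues" is assumed here to mean the 20 one-letter codes ACDEFGHIKLMNPQRSTVWY. The sketch drops sequences containing any other symbol, builds an AAStringSet in memory (instead of reading a FASTA file with readAAStringSet), and passes it to PredCRG_Enc. The encoding step runs only if both Biostrings and PredCRG are installed.

```r
# Hypothetical helper: TRUE for sequences made up only of the 20 standard
# amino acid one-letter codes (assumed meaning of "standard residues").
standard_only <- function(seqs) {
  grepl("^[ACDEFGHIKLMNPQRSTVWY]+$", toupper(seqs))
}

# Illustrative sequences; seq2 contains the non-standard symbols X and B.
seqs <- c(
  seq1 = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVV",
  seq2 = "MKXLLBVAAGTQDNLSGAEKAVQVKVKALPDAQFEVV",
  seq3 = "MSDNGPQNQRNAPRITFGGPSDSTGSNQNGERSGARSKQRRPQGLPNNTASWFTALTQHGK"
)

# Logical vector marking the sequences that are safe to encode.
keep <- standard_only(seqs)

# Encode only when both packages are available.
if (requireNamespace("Biostrings", quietly = TRUE) &&
    requireNamespace("PredCRG", quietly = TRUE)) {
  library(Biostrings)
  library(PredCRG)

  # Build the AAStringSet input from the retained sequences.
  prot_seq <- AAStringSet(seqs[keep])

  # Encode the sequences and inspect the dimensions of the result.
  enc <- PredCRG_Enc(prot_seq)
  print(dim(enc))
}
```

According to the Value section, the result should have one row per retained sequence and 62 columns. When the sequences are stored in a FASTA file, prot_seq can instead be created with readAAStringSet, as described above.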