hmmer: HMMER Sequence Search

Description

Perform a HMMER search against the PDB, NR, swissprot or other sequence and structure databases.

Usage

hmmer(seq, type="phmmer", db = NULL, verbose = TRUE, timeout = 90)

Arguments

seq

a multi-element character vector containing the query sequence. Alternatively a fasta object as obtained from functions get.seq or read.fasta can be provided.

type

character string specifying the HMMER job type. Current options are phmmer, hmmscan, hmmsearch, and jackhmmer.

db

character string specifying the database to search. Current options are pdb, nr, swissprot, pfam, etc. See details for a complete list.

verbose

logical, if TRUE details of the download process are printed.

timeout

integer specifying the number of seconds to wait for the HMMER server reply before a time out occurs.
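
As a minimal sketch of how these arguments fit together, the snippet below builds a multi-element character vector from a single string and passes it to hmmer with a longer timeout. The query string is a placeholder rather than a real protein, and the tryCatch guard is an assumption about how one might handle a failed request, not documented behaviour of the function.

```r
# Hypothetical query string; the residues are placeholders, not a real protein.
query <- "MKTAYIAKQR"

# Split into one residue per element: the "multi-element character vector" form.
seq_vec <- strsplit(query, "")[[1]]

# Only attempt the search in an interactive session with bio3d installed,
# because the call needs a live connection to the HMMER web server.
if (interactive() && requireNamespace("bio3d", quietly = TRUE)) {
  res <- tryCatch(
    bio3d::hmmer(seq_vec, type = "phmmer", db = "pdb", timeout = 120),
    error = function(e) {
      # Assumed failure handling: report the problem and return NULL.
      message("HMMER search failed: ", conditionMessage(e))
      NULL
    }
  )
}
```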

Value

A data frame with multiple components depending on the selected job type. Frequently reported fields include:
name: a character vector containing the name of the target.
acc: a character vector containing the accession identifier of the target.
acc2: a character vector containing the secondary accession of the target.
id: a character vector containing the identifier of the target.
desc: a character vector containing the entry description.
score: a numeric vector containing the bit score of the sequence (all domains, without correction).
pvalue: a numeric vector containing the P-value of the score.
evalue: a numeric vector containing the E-value of the score.
nregions: a numeric vector containing the number of regions evaluated.
nenvelopes: a numeric vector containing the number of envelopes handed over for domain definition, null2, alignment, and scoring.
ndom: a numeric vector containing the total number of domains identified in this sequence.
nreported: a numeric vector containing the number of domains satisfying reporting thresholding.
nincluded: a numeric vector containing the number of domains satisfying inclusion thresholding.
taxid: a character vector containing the NCBI taxonomy identifier of the target (if applicable).
species: a character vector containing the species name.
kg: a character vector containing the kingdom of life that the target belongs to, based on placing in the NCBI taxonomy tree.
More details can be found at the HMMER website: http://www.ebi.ac.uk/Tools/hmmer/help/api
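
Because the result is a data frame, ordinary subsetting can be used to pick out significant hits. The sketch below uses a toy data frame in place of a real hmmer result; its values and the E-value cutoff are invented for illustration, and only the column names acc, score and evalue come from the field list above.

```r
# Toy stand-in for a hmmer() result; every value below is invented.
res <- data.frame(
  acc    = c("hitA", "hitB", "hitC"),
  score  = c(250.1, 35.7, 120.4),
  evalue = c(1e-70, 0.5, 1e-30),
  stringsAsFactors = FALSE
)

# Keep hits below an assumed E-value cutoff, then sort by bit score, best first.
cutoff <- 1e-5
hits <- res[res$evalue < cutoff, ]
hits <- hits[order(hits$score, decreasing = TRUE), ]

# Accessions of the retained hits, highest score first.
hits$acc
```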

Details

This function employs direct HTTP-encoded requests to the HMMER web server. HMMER can be used to search sequence databases for homologous protein sequences. The HMMER server implements methods using probabilistic models called profile hidden Markov models (profile HMMs).

There are currently four types of HMMER search:

- phmmer: protein sequence vs protein sequence database (input argument seq must be a sequence). Allowed options for db include: env_nr, nr, refseq, pdb, rp15, rp35, rp55, rp75, swissprot, unimes, uniprotkb, uniprotrefprot, pfamseq.

- hmmscan: protein sequence vs profile-HMM database (input argument seq must be a sequence). Allowed options for db include: pfam, gene3d, superfamily, tigrfam.

- hmmsearch: protein alignment/profile-HMM vs protein sequence database (input argument seq must be an alignment). Allowed options for db include: pdb, swissprot.

- jackhmmer: iterative search vs protein sequence database (input argument seq must be an alignment). Note that jackhmmer functionality is incomplete. Allowed options for db include: env_nr, nr, refseq, pdb, rp15, rp35, rp55, rp75, swissprot, unimes, uniprotkb, uniprotrefprot, pfamseq.

More information can be found at the HMMER website: http://hmmer.janelia.org
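
Since each job type accepts only certain databases, it can help to check a combination locally before contacting the server. The following is a sketch only: the lookup table is transcribed from the lists above, and is_valid_db is a hypothetical helper that is not part of bio3d.

```r
# Databases listed above for the sequence-database searches.
seq_dbs <- c("env_nr", "nr", "refseq", "pdb", "rp15", "rp35", "rp55", "rp75",
             "swissprot", "unimes", "uniprotkb", "uniprotrefprot", "pfamseq")

# Allowed databases per job type, as listed in this section.
allowed_db <- list(
  phmmer    = seq_dbs,
  hmmscan   = c("pfam", "gene3d", "superfamily", "tigrfam"),
  hmmsearch = c("pdb", "swissprot"),
  jackhmmer = seq_dbs
)

# Hypothetical helper (not part of bio3d): TRUE when db is listed for type.
is_valid_db <- function(type, db) {
  type %in% names(allowed_db) && db %in% allowed_db[[type]]
}

is_valid_db("hmmscan", "pfam")
is_valid_db("hmmsearch", "nr")
```

A check like this would catch a mismatched pair, such as hmmsearch with nr, before any request is sent.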

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695-2696.

Finn, R.D. et al. (2011) Nucl. Acids Res. 39, 29-37.

Eddy, S.R. (2011) PLoS Comput Biol 7(10): e1002195.

See also the HMMER website: http://hmmer.janelia.org

Examples

# HMMER server connection required - testing excluded
library(bio3d)

##- PHMMER: fetch the sequence for "2abl_A" and search it against the PDB
seq <- get.seq("2abl_A", outfile=tempfile())
res <- hmmer(seq, db="pdb")

##- HMMSCAN
fam <- hmmer(seq, type="hmmscan", db="pfam")
pfam.aln <- pfam(fam$acc[1])

##- HMMSEARCH
hmm <- hmmer(pfam.aln, type="hmmsearch", db="pdb")
unique(hmm$species)
hmm$acc