rcdk (version 3.4.7.1)

get.fingerprint: Evaluate Fingerprints

Description

This function evaluates fingerprints of a specified type for a set of molecules or a single molecule. Depending on the nature of the fingerprint, parameters can be specified. Currently the following fingerprint types can be specified:

  • standard - Considers paths of a given length. The default is 6 (the depth argument) but can be changed. These are hashed fingerprints, with a default length of 1024

  • extended - Similar to the standard type, but takes rings and atomic properties into account

  • graph - Similar to the standard type but simply considers connectivity

  • hybridization - Similar to the standard type, but only considers hybridization state

  • maccs - The popular 166 bit MACCS keys described by MDL

  • estate - 79 bit fingerprints corresponding to the E-State atom types described by Hall and Kier

  • pubchem - 881 bit fingerprints defined by PubChem

  • kr - 4860 bit fingerprint defined by Klekota and Roth

  • shortestpath - A fingerprint based on the shortest paths between pairs of atoms that also takes into account ring systems, charges, etc.

  • signature - A feature,count type of fingerprint, similar in nature to circular fingerprints, but based on the signature descriptor

  • circular - An implementation of the ECFP6 fingerprint

Depending on whether the input is a single IAtomContainer object, a list or single vector is returned. Each element of the list is an S4 object of class fingerprint-class or featvec-class, which can be manipulated with the fingerprint package.
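
As a rough illustration of the fixed-length types listed above, the sketch below computes several of them for one molecule and reports the bit length of each. It assumes the rcdk package and a working Java setup are installed; the test molecule and the variable names are arbitrary choices, and the nbit slot is the one defined by fingerprint-class in the fingerprint package.

```r
library(rcdk)

# Arbitrary test molecule (diphenylmethane), chosen only for illustration
mol <- parse.smiles('c1ccccc1Cc1ccccc1')[[1]]

# Fingerprint types whose length is fixed by their definition
types <- c('maccs', 'estate', 'pubchem', 'kr')

# Bit length of each fingerprint, or NA if the calculation returned NULL
nbits <- vapply(types, function(tp) {
  fp <- get.fingerprint(mol, type = tp)
  if (is.null(fp)) NA_real_ else as.numeric(fp@nbit)
}, numeric(1))

print(nbits)
```

The reported lengths can be compared with the bit counts given in the list above.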

Usage

get.fingerprint(molecule, type = 'standard', 
                    fp.mode = 'bit', depth=6, size=1024, verbose=FALSE)

Arguments

molecule

An IAtomContainer object that can be obtained by loading it from disk or drawing it in the editor.

type

The type of fingerprint. See description for possible values. The default is the standard binary fingerprint.

fp.mode

The type of fingerprint to return. Possible values are 'bit', 'raw', and 'count'. The 'raw' mode will return a featvec-class type of fingerprint, representing fragments and their count of occurrence in the molecule. The 'count' mode is similar, except that it returns hash values of fragments and their count of occurrence. While any of these values can be specified, a given fingerprint implementation may not implement all of them, and in those cases the return value is NULL.
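
Since not every fingerprint type implements every mode, it can help to check which modes return a usable object. The following is a minimal sketch, assuming rcdk is installed. The molecule and variable names are illustrative, and the tryCatch wrapper is a defensive assumption rather than something this page requires.

```r
library(rcdk)

# Illustrative molecule; any valid SMILES would do
mol <- parse.smiles('CCN(C)C')[[1]]

modes <- c('bit', 'raw', 'count')
results <- lapply(modes, function(m) {
  # Defensive assumption: treat an R error like the documented NULL return
  tryCatch(get.fingerprint(mol, type = 'signature', fp.mode = m),
           error = function(e) NULL)
})
names(results) <- modes

# Class of each result, or 'NULL' where the mode is not implemented
summary_by_mode <- vapply(results, function(fp) {
  if (is.null(fp)) 'NULL' else class(fp)[1]
}, character(1))

print(summary_by_mode)
```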

depth

The search depth. This argument is ignored for the 'pubchem', 'maccs', 'kr' and 'estate' fingerprints.

size

The length of the fingerprint bit string. This argument is ignored for the 'pubchem', 'maccs', 'kr', 'signature', 'circular' and 'estate' fingerprints.
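
To see how depth and size affect a hashed fingerprint, the sketch below computes the standard fingerprint twice for one molecule. It assumes rcdk is installed; the molecule, the variable names and the non-default values (depth = 4, size = 2048) are arbitrary choices made for illustration.

```r
library(rcdk)

# Illustrative molecule (diphenylmethane)
mol <- parse.smiles('c1ccccc1Cc1ccccc1')[[1]]

# Defaults from the usage line: depth = 6, size = 1024
fp_default <- get.fingerprint(mol, type = 'standard')

# Shorter paths hashed into a longer bit string
fp_custom <- get.fingerprint(mol, type = 'standard', depth = 4, size = 2048)

# Number of bits set and total length of each fingerprint
print(c(on = length(fp_default@bits), nbit = fp_default@nbit))
print(c(on = length(fp_custom@bits), nbit = fp_custom@nbit))
```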

verbose

If TRUE, any exceptions that occur will be printed.

Value

Objects of class fingerprint-class or featvec-class, from the fingerprint package. If there is a problem during fingerprint calculation, NULL is returned.
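
Because NULL is returned when a calculation fails, it is usually worth filtering the results before passing them on to the fingerprint package. A minimal sketch, assuming both rcdk and fingerprint are installed; the molecules and variable names are illustrative only.

```r
library(rcdk)
library(fingerprint)

# A few illustrative molecules
mols <- parse.smiles(c('CCC', 'CCN', 'c1ccccc1Cc1ccccc1'))

# MACCS keys for each molecule
fps <- lapply(mols, get.fingerprint, type = 'maccs')

# Drop any molecule whose fingerprint calculation failed (NULL)
ok <- !vapply(fps, is.null, logical(1))
fps <- fps[ok]

# Pairwise Tanimoto similarity, computed by the fingerprint package
sim <- fp.sim.matrix(fps, method = 'tanimoto')
print(sim)
```

Filtering first keeps one failed molecule from breaking the similarity calculation.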

References

Faulon et al, The Signature Molecular Descriptor. 1. Using Extended Valence Sequences in QSAR and QSPR studies, J. Chem. Inf. Comput. Sci., 2003, 43, 707-720.

See Also

load.molecules

Examples

library(rcdk)

## get some molecules
smiles <- c('CCC', 'CCN', 'CCN(C)(C)', 'c1ccccc1Cc1ccccc1', 'C1CCC1CC(CN(C)(C))CC(=O)CC')
mols <- parse.smiles(smiles)

## get a single fingerprint using the standard
## (hashed, path based) fingerprinter
fp <- get.fingerprint(mols[[1]])

## get MACCS keys for all the molecules
fps <- lapply(mols, get.fingerprint, type='maccs')

## get Signature fingerprints in 'raw' mode, which returns
## feature, count (featvec-class) objects
fps <- lapply(mols, get.fingerprint, type='signature', fp.mode='raw')
