A number of distance metrics can be calculated for binary fingerprints. Some of these are actually similarity metrics and thus represent the complement of a distance metric.
The following are distance (dissimilarity) metrics:
Hamming
Mean Hamming
Soergel
Pattern Difference
Variance
Size
Shape
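As a rough illustration of what the first few dissimilarity metrics measure, the sketch below computes Hamming, mean Hamming and Soergel distances in base R from vectors of on-bit positions. The helper names (hamming_dist, mean_hamming_dist, soergel_dist) are hypothetical and not part of the package. The code follows the textbook definitions and assumes each vector lists every on bit once; the package's own implementation may differ in detail.

```r
# Hypothetical helpers, not part of the fingerprint package.
# Each fingerprint is represented by the positions of its on bits.
hamming_dist <- function(bits1, bits2) {
  # number of bits set in exactly one of the two fingerprints
  length(setdiff(bits1, bits2)) + length(setdiff(bits2, bits1))
}

mean_hamming_dist <- function(bits1, bits2, nbit) {
  # Hamming distance scaled by the fingerprint length
  hamming_dist(bits1, bits2) / nbit
}

soergel_dist <- function(bits1, bits2) {
  total <- length(union(bits1, bits2))  # bits set in either fingerprint
  if (total == 0) return(0)             # assumption: two empty fingerprints are identical
  hamming_dist(bits1, bits2) / total
}

bits_a <- c(1, 2, 5, 6)
bits_b <- c(1, 2, 3)

h  <- hamming_dist(bits_a, bits_b)                 # bits 5, 6 and 3 differ: 3
mh <- mean_hamming_dist(bits_a, bits_b, nbit = 6)  # 3 / 6 = 0.5
s  <- soergel_dist(bits_a, bits_b)                 # 3 / 5 = 0.6
```

With these definitions the Soergel distance equals 1 minus the Tanimoto similarity of the same pair.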
The following metrics are similarity metrics, so the distance can be obtained by subtracting the value from 1.0:
Tanimoto
Dice
Modified Tanimoto
Simple
Jaccard
Russel-Rao
Rodgers Tanimoto
Cosine
Achiai
Carbo
Baroniurbanibuser
Kulczynski2
Robust
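The conversion from similarity to distance can be sketched in base R for two of the metrics listed above. The helpers tanimoto_sim and dice_sim are hypothetical names, not package functions. They use the standard definitions on vectors of on-bit positions (each position listed once) and are not taken from the package source.

```r
# Hypothetical helpers, not part of the fingerprint package.
tanimoto_sim <- function(bits1, bits2) {
  common <- length(intersect(bits1, bits2))  # bits set in both
  total  <- length(union(bits1, bits2))      # bits set in either
  if (total == 0) return(1)                  # assumption: two empty fingerprints are identical
  common / total
}

dice_sim <- function(bits1, bits2) {
  common <- length(intersect(bits1, bits2))
  denom  <- length(bits1) + length(bits2)
  if (denom == 0) return(1)                  # same assumption as above
  2 * common / denom
}

bits_a <- c(1, 2, 5, 6)
bits_b <- c(1, 2, 3)

sim_t  <- tanimoto_sim(bits_a, bits_b)  # 2 common / 5 in union = 0.4
dist_t <- 1 - sim_t                     # 0.6
sim_d  <- dice_sim(bits_a, bits_b)      # 2 * 2 / (4 + 3) = 4/7
dist_d <- 1 - sim_d                     # 3/7
```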
Finally, the method also provides a set of composite and asymmetric distance metrics:
Hamann
Yule
Pearson
Dispersion
McConnaughey
Stiles
Simpson
Petke
Tversky
The default metric is the Tanimoto coefficient.
distance(fp1, fp2, method, a, b)
fp1: An object of class fingerprint or featvec
fp2: An object of class fingerprint or featvec
a: Parameter for the Tversky index
b: Parameter for the Tversky index
method: The type of distance metric desired. Partial matching is supported and the default is tanimoto. Alternative values are:
euclidean
hamming
meanHamming
soergel
patternDifference
variance
size
shape
jaccard
dice
mt
simple
russelrao
rodgerstanimoto
cosine
achiai
carbo
baroniurbanibuser
kulczynski2
robust
hamann
yule
pearson
mcconnaughey
stiles
simpson
petke
tversky
If the two fingerprints are of class featvec then the following methods may be specified: tanimoto, robust and dice.
Numeric value representing the distance in the specified metric between the supplied fingerprint objects
signature(fp1 = "featvec", fp2 = "featvec", method = "character", a = "missing", b = "missing")
Similarity method for feature vector type fingerprints, supporting the tanimoto, robust and dice metrics.
signature(fp1 = "featvec", fp2 = "featvec", method = "missing", a = "missing", b = "missing")
Evaluate Tanimoto similarity between two feature vector fingerprints
signature(fp1 = "fingerprint", fp2 = "fingerprint", method = "character", a = "missing", b = "missing")
Evaluate similarity (or dissimilarity) between two binary fingerprints. See the method argument for the list of possible similarity (or dissimilarity) metrics.
signature(fp1 = "fingerprint", fp2 = "fingerprint", method = "character", a = "numeric", b = "numeric")
Evaluate Tversky similarity between two binary fingerprints.
signature(fp1 = "fingerprint", fp2 = "fingerprint", method = "missing", a = "missing", b = "missing")
Evaluate Tanimoto similarity between two binary fingerprints
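To make the role of the a and b parameters more concrete, here is a minimal base R sketch of the Tversky index in its standard form: common bits divided by the common bits plus a weighted count of the bits unique to each fingerprint. The helper tversky_sim is a hypothetical name and this is not the package's code. It assumes each fingerprint is given as a vector of distinct on-bit positions.

```r
# Hypothetical helper, not part of the fingerprint package.
tversky_sim <- function(bits1, bits2, a, b) {
  common <- length(intersect(bits1, bits2))  # bits set in both
  only1  <- length(setdiff(bits1, bits2))    # bits set only in the first
  only2  <- length(setdiff(bits2, bits1))    # bits set only in the second
  denom  <- a * only1 + b * only2 + common
  if (denom == 0) return(1)                  # assumption: nothing to compare counts as identical
  common / denom
}

bits_a <- c(1, 2, 5, 6)
bits_b <- c(1, 2, 3)

tv_tanimoto <- tversky_sim(bits_a, bits_b, a = 1,   b = 1)    # 2 / (2 + 1 + 2) = 0.4
tv_dice     <- tversky_sim(bits_a, bits_b, a = 0.5, b = 0.5)  # 2 / (1 + 0.5 + 2) = 4/7
tv_first    <- tversky_sim(bits_a, bits_b, a = 1,   b = 0)    # 2 / (2 + 0 + 2) = 0.5
tv_second   <- tversky_sim(bits_a, bits_b, a = 0,   b = 1)    # 2 / (0 + 1 + 2) = 2/3
```

Setting a = b = 1 reproduces the Tanimoto coefficient and a = b = 0.5 reproduces Dice. Unequal weights make the measure asymmetric, which is why swapping the weights in the last two calls changes the result.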
Fligner, M.A.; Verducci, J.S.; Blower, P.E.; A Modification of the Jaccard-Tanimoto Similarity Index for Diverse Selection of Chemical Compounds Using Binary Strings, Technometrics, 2002, 44(2), 110-119
Monev, V.; Introduction to Similarity Searching in Chemistry, MATCH - Comm. Math. Comp. Chem., 2004, 51, 7-38
library(fingerprint)
# make two identical 6-bit fingerprints
fp1 <- new("fingerprint", nbit=6, bits=c(1,2,5,6))
fp2 <- new("fingerprint", nbit=6, bits=c(1,2,5,6))
# calculate the Tanimoto coefficient (the default method)
distance(fp1,fp2) # should be 1
# invert the second fingerprint so that it shares no bits with fp1
fp3 <- !fp2
distance(fp1,fp3) # should be 0