cmp.parse1: Parse an SDF file and calculate the descriptor for one compound
Description
Read SDF information from an SDF file or connection, parse the first
compound, and calculate the descriptor for that compound. The returned
descriptor can be added to the database returned by 'cmp.parse' or used
as the query structure when calling 'search'. This function parses only
one compound and returns only its descriptor. To parse all
compounds in an SDF file, use 'cmp.parse'.
Usage
cmp.parse1(filename)
Arguments
filename
The name of the SDF file, a URL, or a connection.
Value
The descriptor of the parsed compound, encoded as a vector.
Details
'cmp.parse1' accepts a file name, a URL, or a connection. When a
connection is used, its current line must be the first line of the SDF
record of the compound to be parsed. 'cmp.parse1' skips the three-line
header and starts parsing at the 4th line, so the compound ID
information is not read. After parsing, if 'filename' is a connection,
it points to the line after the connection table of the SDF record.
You can then use another procedure to parse the annotation block.
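The SDF layout that the connection behaviour above relies on can be illustrated with a small base R sketch. It does not call 'cmp.parse1' and is not its implementation. It only walks a hand-written record the same way: three header lines, the counts line on the 4th line, the atom and bond lines, and then the annotation block up to the '$$$$' delimiter. The record contents and all variable names are made up for illustration. Whether the 'M  END' line has already been consumed when 'cmp.parse1' returns is not stated here, so the sketch simply filters it out.

```r
# A tiny hand-written SDF record; all values are illustrative only
sdf <- c(
  "example_id",
  "  hypothetical program line",
  "",
  "  2  1  0  0  0  0  0  0  0  0999 V2000",
  "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
  "    1.5000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
  "  1  2  1  0  0  0  0",
  "M  END",
  "> <NAME>",
  "methanol",
  "",
  "$$$$"
)
con <- textConnection(sdf)

# Lines 1-3: header block, including the compound ID on line 1
header <- readLines(con, n = 1 + 2)

# Line 4: counts line; atom and bond counts sit in fixed 3-character fields
counts  <- readLines(con, n = 1)
n_atoms <- as.integer(substr(counts, 1, 3))
n_bonds <- as.integer(substr(counts, 4, 6))

# Atom block and bond block
atoms <- readLines(con, n = n_atoms)
bonds <- readLines(con, n = n_bonds)

# Read the rest of the record up to the "$$$$" delimiter
rest <- character(0)
repeat {
  line <- readLines(con, n = 1)
  if (length(line) == 0 || line == "$$$$") break
  rest <- c(rest, line)
}
close(con)

# Drop "M  END" in case it is still pending (an assumption, see above)
annotation <- rest[rest != "M  END"]
print(annotation)
```

With a real file, the same loop could be run on a connection right after passing it to 'cmp.parse1', since the compound ID on the first header line is otherwise lost.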
References
Chen X and Reynolds CH (2002). "Performance of similarity measures
in 2D fragment-based similarity searching: comparison of structural descriptors
and similarity coefficients". J Chem Inf Comput Sci.
Examples
# load an SDF file from the web and parse it
## Not run:
structure <- cmp.parse1("http://bioweb.ucr.edu/ChemMineV2/compound/Aurora/b32:NNQS2MBRHAZTI===/sdf")
## End(Not run)