QSARdata (version 1.3)

caco: Caco-2 Permeability Data

Description

These data were compiled and described by Pham-The et al. (2013). The data set consists of compounds that were designated as having high, medium or low permeability. The structures and outcomes were obtained from the supporting information at http://doi.wiley.com/10.1002/minf.201200166, specifically Table SI1 and Table SI4. Some compounds failed in the descriptor calculations, so the total sample size here is 3796 compounds.

The package contains four sets of molecular descriptors: atom pair distances, Dragon descriptors (http://www.talete.mi.it/products/dragon_plus.htm), PipelinePilot fingerprints (http://accelrys.com/products/pipeline-pilot/) and QuickProp descriptors.

For fingerprints, the 1000 most variable bits were selected whenever possible.
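
The page does not say how "most variable" was measured. As a rough sketch only, assuming per-bit sample variance as the criterion, the selection could look like the following. The toy matrix `fp` and the names `n_keep` and `fp_selected` are hypothetical and are not part of the package.

```r
# Toy binary fingerprint matrix: rows are compounds, columns are bits.
fp <- cbind(
  bit1 = c(0, 1, 0, 1, 0, 1),  # balanced: high variance
  bit2 = c(0, 0, 0, 0, 0, 0),  # constant: zero variance
  bit3 = c(1, 1, 1, 1, 1, 0),  # mostly on: low variance
  bit4 = c(1, 1, 0, 0, 1, 0)   # balanced: high variance
)

# Number of bits to keep (1000 in the description above; 2 for this toy case).
n_keep <- 2

# Assumption: variability is measured as the sample variance of each bit.
bit_var <- apply(fp, 2, var)

# Keep the n_keep most variable bits, or all bits if fewer are available
# (one reading of "whenever possible").
keep <- names(sort(bit_var, decreasing = TRUE))[seq_len(min(n_keep, ncol(fp)))]
fp_selected <- fp[, keep, drop = FALSE]
```

With this toy input, the constant and near-constant bits are dropped and the two balanced bits remain.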

Usage

data(caco)

Format

The data consist of several data frames. The first column of each descriptor data frame is called "Molecule" and identifies the compound. The original identifiers were mangled during the descriptor calculations and have been given unique but arbitrary values so that rows can be merged across descriptor sets.

caco_AtomPair: Atom pair descriptors
caco_Dragon: Dragon descriptors (http://www.talete.mi.it/products/dragon_plus.htm)
caco_PipelinePilot_FP: PipelinePilot fingerprints (http://accelrys.com/products/pipeline-pilot/)
caco_QuickProp: QuickProp descriptors
caco_Outcome: a data frame with columns for the molecule name and the outcome (for merging)
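
Because every data frame shares the "Molecule" column, outcomes and descriptors can be joined on it. The sketch below uses small toy data frames in place of caco_Outcome and a descriptor set so that it runs without the package. The outcome column name `Class` and the descriptor columns `desc_a` and `desc_b` are assumptions made for illustration.

```r
# Toy stand-in for caco_Outcome (the column name "Class" is an assumption).
outcome <- data.frame(
  Molecule = c("M1", "M2", "M3"),
  Class    = factor(c("L", "M", "H"), levels = c("L", "M", "H")),
  stringsAsFactors = FALSE
)

# Toy stand-in for a descriptor data frame such as caco_QuickProp,
# deliberately in a different row order.
descriptors <- data.frame(
  Molecule = c("M3", "M1", "M2"),
  desc_a   = c(0.2, 1.5, 3.1),
  desc_b   = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Join on the shared identifier; rows are matched by value, not by position.
merged <- merge(outcome, descriptors, by = "Molecule")
```

With the package loaded, the same call would presumably take the form merge(caco_Outcome, caco_QuickProp, by = "Molecule").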

References

Pham-The, H., Gonzalez-Alvarez, I., Bermejo, M., Garrigues, T., Le-Thi-Thu, H., & Cabrera-Perez, M. A. (2013). The Use of Rule-Based and QSPR Approaches in ADME Profiling: A Case Study on Caco-2 Permeability. Molecular Informatics.

Examples

data(caco)
head(caco_Outcome)
