httk (version 2.5.0)

get_cheminfo: Retrieve chemical information available from HTTK package

Description

This function lists information on all the chemicals within HTTK for which there are sufficient data for the specified model and species. By default the function returns only CAS numbers (that is, info = "CAS"). The types of information available include chemical identifiers ("Compound", "CAS", "DTXSID"), in vitro measurements ("Clint", "Clint.pValue", "Funbound.plasma", "Rblood2plasma"), and physico-chemical information ("Formula", "logMA", "logP", "MW", "pKa_Accept", "pKa_Donor"). The argument "info" can be a single type of information, "all" information, or a vector of specific types of information. The argument "model" defaults to "3compartmentss" and the argument "species" defaults to "Human".

Since different models have different requirements and not all chemicals have complete data, this function will return different numbers of chemicals depending on the model specified. If a chemical is not listed by get_cheminfo, then either the in vitro or physico-chemical data needed are currently missing (but could potentially be added using add_chemtable).

Usage

get_cheminfo(
  info = "CAS",
  species = "Human",
  fup.lod.default = 0.005,
  model = "3compartmentss",
  default.to.human = FALSE,
  median.only = FALSE,
  fup.ci.cutoff = TRUE,
  clint.pvalue.threshold = 0.05,
  physchem.exclude = TRUE,
  class.exclude = TRUE,
  suppress.messages = FALSE
)

Value

vector/data.table

Table (if info has multiple entries) or vector containing a column for each valid entry specified in the argument "info" and a row for each chemical with sufficient data for the model specified by argument "model":

| Column | Description | Units |
|---|---|---|
| Compound | The preferred name of the chemical compound | none |
| CAS | The preferred Chemical Abstracts Service Registry Number | none |
| DTXSID | DSSTox Structure ID (https://comptox.epa.gov/dashboard) | none |
| logP | The log10 octanol:water partition coefficient | log10 unitless ratio |
| MW | The chemical compound molecular weight | g/mol |
| pKa_Accept | The hydrogen acceptor equilibria concentrations | logarithm |
| pKa_Donor | The hydrogen donor equilibria concentrations | logarithm |
| [SPECIES].Clint | (Primary hepatocyte suspension) intrinsic hepatic clearance. Entries with comma-separated values are Bayesian estimates of the Clint distribution, displayed as the median, the 95% credible interval (that is, the 2.5% and 97.5% quantiles, respectively), and the p-value. | uL/min/10^6 hepatocytes |
| [SPECIES].Clint.pValue | Probability that there is no clearance observed. Values close to 1 indicate clearance is not statistically significant. | none |
| [SPECIES].Funbound.plasma | Chemical fraction unbound in presence of plasma proteins (fup). Entries with comma-separated values are Bayesian estimates of the fup distribution, displayed as the median and the 95% credible interval (that is, the 2.5% and 97.5% quantiles, respectively). | unitless fraction |
| [SPECIES].Rblood2plasma | Chemical concentration blood to plasma ratio | unitless ratio |

Arguments

info

A single character string (or a vector of character strings) from "Compound", "CAS", "DTXSID", "logP", "pKa_Donor", "pKa_Accept", "MW", "Clint", "Clint.pValue", "Funbound.plasma", "Structure_Formula", or "Substance_Type". info="all" gives all information for the model and species.

species

Species desired (either "Rat", "Rabbit", "Dog", "Mouse", or default "Human").

fup.lod.default

Default value used for the fraction unbound in plasma (fup) for chemicals where the measured value was below the limit of detection. Default value is 0.005.

model

Model used in calculation, 'pbtk' for the multiple compartment model, '1compartment' for the one compartment model, '3compartment' for three compartment model, '3compartmentss' for the three compartment model without partition coefficients, or 'schmitt' for chemicals with logP and fraction unbound (used in predict_partitioning_schmitt).

default.to.human

Substitutes missing values with human values if true.

median.only

Use median values only for fup and clint. Default is FALSE.

fup.ci.cutoff

Boolean eliminating uncertain fup estimates. If TRUE, fup values whose 95% credible interval spans 0.1 to 0.9 (or more) are eliminated. (Default value is TRUE.)

clint.pvalue.threshold

Hepatic clearance is set to zero for chemicals where the in vitro clearance assay result has a p-value greater than the threshold. Default value is 0.05.

physchem.exclude

Exclude chemicals on the basis of physico-chemical properties (currently only Henry's law constant) as specified by the relevant modelinfo_[MODEL] file (default TRUE).

class.exclude

Exclude chemical classes identified as outside of domain of applicability by the relevant modelinfo_[MODEL] file (default TRUE).

suppress.messages

Whether or not the output messages are suppressed (default FALSE).
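The effect of the clint.pvalue.threshold argument described above can be illustrated with a minimal base R sketch. The helper apply_clint_threshold and the toy numbers are hypothetical and are not part of httk; the sketch only assumes that clearance values and their p-values are available as numeric vectors of equal length.

```r
# Hypothetical helper (not an httk function): set clearance to zero when the
# assay p-value exceeds the threshold, leaving entries without a p-value as-is.
apply_clint_threshold <- function(clint, pvalue, clint.pvalue.threshold = 0.05) {
  nonsignificant <- !is.na(pvalue) & pvalue > clint.pvalue.threshold
  clint[nonsignificant] <- 0
  clint
}

# Toy values, made up for illustration only
clint  <- c(15.2, 3.1, 40.0)
pvalue <- c(0.001, 0.30, NA)

clint_used <- apply_clint_threshold(clint, pvalue)
print(clint_used)
```

Only the second entry has a p-value above 0.05, so only that clearance is zeroed in this sketch.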

Author

John Wambaugh, Robert Pearce, and Sarah E. Davidson

Details

When default.to.human is set to TRUE, and the species-specific data, Funbound.plasma and Clint, are missing from chem.physical_and_invitro.data, human values are given instead.
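As a rough illustration of the substitution described above, the following base R sketch fills missing species-specific values with human values. The helper fill_from_human and the toy vectors are hypothetical; httk performs this step internally against chem.physical_and_invitro.data, and its actual implementation may differ.

```r
# Hypothetical helper (not an httk function): use the human value wherever the
# species-specific value is missing, but only if default.to.human is TRUE.
fill_from_human <- function(species_value, human_value, default.to.human = FALSE) {
  if (!default.to.human) {
    return(species_value)
  }
  ifelse(is.na(species_value), human_value, species_value)
}

# Toy fup values, made up for illustration only
rat_fup   <- c(0.12, NA, 0.40)
human_fup <- c(0.10, 0.25, 0.35)

kept_missing <- fill_from_human(rat_fup, human_fup)
filled       <- fill_from_human(rat_fup, human_fup, default.to.human = TRUE)
print(kept_missing)
print(filled)
```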

In some cases the rapid equilibrium dialysis method (Waters et al., 2008) fails to yield detectable concentrations for the free fraction of chemical. In those cases we assume the compound is highly bound (that is, Fup approaches zero). For some calculations (for example, steady-state plasma concentration) there is precedent (Rotroff et al., 2010) for using half the average limit of detection, that is, 0.005 (this value is configurable via the argument fup.lod.default). We do not recommend using other models where quantities like partition coefficients must be predicted using Fup. We also do not recommend including the value 0.005 in training sets for Fup predictive models.
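The limit-of-detection substitution can be sketched in base R as follows. This is an illustration under a stated assumption, namely that non-detects are coded as a fup of exactly 0; the helper apply_fup_lod and the toy values are hypothetical and are not taken from httk.

```r
# Hypothetical helper (not an httk function). Assumption: a measured fup below
# the limit of detection is stored as 0 and is replaced by fup.lod.default.
apply_fup_lod <- function(fup, fup.lod.default = 0.005) {
  below_lod <- !is.na(fup) & fup == 0
  fup[below_lod] <- fup.lod.default
  fup
}

# Toy values, made up for illustration only
fup <- c(0.30, 0, 0.02, NA)

fup_used <- apply_fup_lod(fup)
print(fup_used)
```

Missing values are left untouched in this sketch, so a chemical with no measurement is not confused with a chemical measured below the limit of detection.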

Note that in some cases the Funbound.plasma (fup) and the intrinsic clearance (clint) are provided as a series of numbers separated by commas. These values are the result of Bayesian analysis and characterize a distribution: the first value is the median of the distribution, while the second and third values are the lower and upper bounds of the 95% credible interval (that is, the 2.5% and 97.5% quantiles), respectively. For intrinsic clearance a fourth value indicating a p-value for a decrease is provided. Typically 4000 samples were used for the Bayesian analysis, such that a p-value of "0" is equivalent to "<0.00025". See Wambaugh et al. (2019) for more details. If the argument median.only is TRUE, then only the median is reported for parameters with Bayesian analysis distributions. If the 95% credible interval spans the range of 0.1 to 0.9 and fup.ci.cutoff is set to TRUE (the default setting), then the Fup is treated as too uncertain and the value NA is given.
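A comma-separated entry of this kind can be unpacked with base R. The sketch below is illustrative only: parse_bayes_entry and fup_too_uncertain are hypothetical helpers rather than httk functions, the example strings are made up, and the inclusive comparison used for the 0.1 to 0.9 rule is an assumption about how "spans" is interpreted.

```r
# Hypothetical helper (not an httk function): split "median,lower,upper[,pvalue]"
# into a named numeric vector. Missing trailing fields become NA.
parse_bayes_entry <- function(x) {
  parts <- as.numeric(strsplit(x, ",", fixed = TRUE)[[1]])
  c(median = parts[1], lower = parts[2], upper = parts[3], pvalue = parts[4])
}

# Hypothetical helper: TRUE when the credible interval covers 0.1 to 0.9.
# Assumption: the interval bounds are compared inclusively.
fup_too_uncertain <- function(est) {
  !is.na(est[["lower"]]) && !is.na(est[["upper"]]) &&
    est[["lower"]] <= 0.1 && est[["upper"]] >= 0.9
}

# Made-up entries for illustration only
clint_est <- parse_bayes_entry("12.4,8.1,17.9,0.001")
fup_ok    <- parse_bayes_entry("0.25,0.18,0.33")
fup_wide  <- parse_bayes_entry("0.5,0.05,0.95")

print(clint_est)
print(fup_too_uncertain(fup_ok))
print(fup_too_uncertain(fup_wide))
```

Taking only the median element of the parsed vector corresponds to what median.only = TRUE reports, and an entry flagged by fup_too_uncertain corresponds to a value that would be given as NA when fup.ci.cutoff is TRUE.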

References

Rotroff, Daniel M., et al. "Incorporating human dosimetry and exposure into high-throughput in vitro toxicity screening." Toxicological Sciences 117.2 (2010): 348-358.

Waters, Nigel J., et al. "Validation of a rapid equilibrium dialysis approach for the measurement of plasma protein binding." Journal of pharmaceutical sciences 97.10 (2008): 4586-4595.

Wambaugh, John F., et al. "Assessing toxicokinetic uncertainty and variability in risk prioritization." Toxicological Sciences 172.2 (2019): 235-251.

Examples

# \donttest{
library(httk)

# List all CAS numbers for which the 3compartmentss model can be run in humans:
get_cheminfo()

# List the name, fup, and logP of the chemicals for which the pbtk model can be run:
get_cheminfo(info=c('compound','funbound.plasma','logP'),model='pbtk')
# See all the data for humans:
get_cheminfo(info="all")

TPO.cas <- c("741-58-2", "333-41-5", "51707-55-2", "30560-19-1", "5598-13-0", 
"35575-96-3", "142459-58-3", "1634-78-2", "161326-34-7", "133-07-3", "533-74-4", 
"101-05-3", "330-54-1", "6153-64-6", "15299-99-7", "87-90-1", "42509-80-8", 
"10265-92-6", "122-14-5", "12427-38-2", "83-79-4", "55-38-9", "2310-17-0", 
"5234-68-4", "330-55-2", "3337-71-1", "6923-22-4", "23564-05-8", "101-02-0", 
"140-56-7", "120-71-8", "120-12-7", "123-31-9", "91-53-2", "131807-57-3", 
"68157-60-8", "5598-15-2", "115-32-2", "298-00-0", "60-51-5", "23031-36-9", 
"137-26-8", "96-45-7", "16672-87-0", "709-98-8", "149877-41-8", "145701-21-9", 
"7786-34-7", "54593-83-8", "23422-53-9", "56-38-2", "41198-08-7", "50-65-7", 
"28434-00-6", "56-72-4", "62-73-7", "6317-18-6", "96182-53-5", "87-86-5", 
"101-54-2", "121-69-7", "532-27-4", "91-59-8", "105-67-9", "90-04-0", 
"134-20-3", "599-64-4", "148-24-3", "2416-94-6", "121-79-9", "527-60-6", 
"99-97-8", "131-55-5", "105-87-3", "136-77-6", "1401-55-4", "1948-33-0", 
"121-00-6", "92-84-2", "140-66-9", "99-71-8", "150-13-0", "80-46-6", "120-95-6",
"128-39-2", "2687-25-4", "732-11-6", "5392-40-5", "80-05-7", "135158-54-2", 
"29232-93-7", "6734-80-1", "98-54-4", "97-53-0", "96-76-4", "118-71-8", 
"2451-62-9", "150-68-5", "732-26-3", "99-59-2", "59-30-3", "3811-73-2", 
"101-61-1", "4180-23-8", "101-80-4", "86-50-0", "2687-96-9", "108-46-3", 
"95-54-5", "101-77-9", "95-80-7", "420-04-2", "60-54-8", "375-95-1", "120-80-9",
"149-30-4", "135-19-3", "88-58-4", "84-16-2", "6381-77-7", "1478-61-1", 
"96-70-8", "128-04-1", "25956-17-6", "92-52-4", "1987-50-4", "563-12-2", 
"298-02-2", "79902-63-9", "27955-94-8")
httk.TPO.rat.table <- subset(get_cheminfo(info="all",species="rat"),
 CAS %in% TPO.cas)
 
httk.TPO.human.table <- subset(get_cheminfo(info="all",species="human"),
 CAS %in% TPO.cas)
 
# Create a table with all the fup values. We ask for model="schmitt" since
# that model only needs fup, and for median.only=TRUE because we don't care
# about uncertainty intervals here:
fup.tab <- get_cheminfo(info="all",median.only=TRUE,model="schmitt")
# calculate the median, making sure to convert to numeric values:
median(as.numeric(fup.tab$Human.Funbound.plasma),na.rm=TRUE)
# calculate the mean:
mean(as.numeric(fup.tab$Human.Funbound.plasma),na.rm=TRUE)
# count how many non-NA values we have (this should be the same as the number
# of rows in the table, but just in case we count only the non-NA values):
sum(!is.na(fup.tab$Human.Funbound.plasma))
# }