Compute differences of carbon oxidation state, stoichiometric hydration state and other compositional metrics between groups of up- and down-regulated proteins.
get_comptab(pdat, var1 = "ZC", var2 = "nH2O", plot.it = FALSE,
mfun = "median", oldstyle = FALSE, basis = getOption("basis"))
pdat | list, data object generated by a pdat_ function |
var1 | character, the first variable |
var2 | character, the second variable |
plot.it | logical, make a scatterplot? |
mfun | character, either median or mean |
oldstyle | logical, also calculate CLES and p-values? |
basis | character, keyword for basis species to use |
A data frame is returned invisibly containing the columns dataset, description, n1 (number of down-regulated proteins), n2 (number of up-regulated proteins), followed by two sets of columns for the variables.
These are denoted generically as (var.mfun1, var.mfun2, var.diff, var.CLES, var.p.value), where var is replaced by the name of var1 or var2, and mfun is replaced by the value of mfun.
For example, ZC.median1 and ZC.median2 are the medians of ZC for the down- and up-regulated proteins, respectively.
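The naming pattern can be sketched in base R. The helper comptab_colnames below is hypothetical and not part of the package; it only shows how the five column names for one variable are assembled from var and mfun.

```r
# Hypothetical helper (not part of the package): build the five column
# names for one variable from the variable name and the value of mfun
comptab_colnames <- function(var, mfun = "median") {
  c(
    paste0(var, ".", mfun, 1:2),                    # group values: 1 = down, 2 = up
    paste0(var, ".", c("diff", "CLES", "p.value"))  # difference, effect size, p-value
  )
}

comptab_colnames("ZC")
comptab_colnames("nH2O", mfun = "mean")
```

With the default mfun, the first call gives ZC.median1, ZC.median2, ZC.diff, ZC.CLES and ZC.p.value.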
The available variables are:
ZC | average oxidation state of carbon (see ZCAA) |
nH2O | stoichiometric hydration state per residue (see H2OAA) |
nO2 | stoichiometric oxidation state per residue (see O2AA) |
V0 | standard molal volume per residue |
nAA | protein length (number of amino acids) |
GRAVY | grand average of hydropathicity (see GRAVY) |
pI | isoelectric point (see pI) |
PS_TPPG17 | phylostratum (see PS) |
PS_LMM16 | phylostratum (see PS) |
MW | molecular weight per residue |
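As an illustration of the first variable, the average oxidation state of carbon can be computed from the elemental composition of a molecule with formula CcHhNnOoSs and charge Z as ZC = (-h + 3n + 2o + 2s + Z) / c. The function ZC_from_counts below is a hypothetical base-R sketch of that formula; it is not the package's ZCAA function, which works from amino acid compositions.

```r
# Hypothetical helper (not part of the package): average oxidation state
# of carbon from element counts and net charge
# ZC = (-H + 3*N + 2*O + 2*S + Z) / C
ZC_from_counts <- function(C, H, N = 0, O = 0, S = 0, Z = 0) {
  (-H + 3 * N + 2 * O + 2 * S + Z) / C
}

ZC_from_counts(C = 2, H = 5, N = 1, O = 2)  # glycine, C2H5NO2
ZC_from_counts(C = 1, H = 4)                # methane, CH4
ZC_from_counts(C = 1, H = 0, O = 2)         # carbon dioxide, CO2
```

The three calls return 1, -4 and 4, spanning the range from fully reduced to fully oxidized carbon.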
Differentially expressed proteins are identified by the value of pdat$up2 (TRUE for up-regulated proteins and FALSE for down-regulated proteins).
The differences are calculated as (median for up-regulated proteins) - (median for down-regulated proteins); if mfun is mean, means of the groups are used instead.
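The grouping and subtraction can be sketched in base R. The values below are made up for illustration, and group_diff is a hypothetical helper rather than the package's own code; only the use of a logical up2 vector and the up-minus-down convention follow the description above.

```r
# Toy data (made up): one ZC value per protein and its regulation flag
ZC  <- c(-0.10, -0.15, -0.20, -0.05, -0.12, -0.30)
up2 <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)

# Hypothetical helper: summarize each group with mfun, then take up - down
group_diff <- function(x, up2, mfun = "median") {
  f <- match.fun(mfun)   # look up median() or mean() by name
  down <- f(x[!up2])     # down-regulated proteins (up2 is FALSE)
  up <- f(x[up2])        # up-regulated proteins (up2 is TRUE)
  c(down = down, up = up, diff = up - down)
}

group_diff(ZC, up2)                  # medians
group_diff(ZC, up2, mfun = "mean")   # means
```

A positive diff means that the up-regulated group has the higher median (or mean) value of the variable.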
If oldstyle is TRUE, the function also calculates the common language effect size (CLES, in percent) and p-value for each variable.
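For orientation, an empirical common language effect size can be sketched in base R as the percentage of (up, down) pairs in which the up-regulated value is the larger one. Both the pairwise definition with ties counted as one half and the use of a Wilcoxon rank-sum test for the p-value are assumptions made here for illustration; this page does not state which estimator or test the function uses. The helper cles_percent is hypothetical.

```r
# Toy data (made up): values of one variable in each group
down <- c(-0.15, -0.05, -0.30)
up   <- c(-0.10, -0.20, -0.12)

# Hypothetical helper: percentage of (up, down) pairs with up > down,
# counting ties as one half (an assumed definition of CLES)
cles_percent <- function(down, up) {
  d <- outer(up, down, "-")   # all pairwise differences up[i] - down[j]
  100 * (sum(d > 0) + 0.5 * sum(d == 0)) / length(d)
}

cles_percent(down, up)

# Assumed test: two-sided Wilcoxon rank-sum test from the stats package
p <- wilcox.test(up, down)$p.value
p
```

A CLES of 50 percent indicates no tendency for either group to have the larger values.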
The basis argument is used to select the basis species, which are used for the calculation of nH2O and nO2.
The default for getOption("basis") is to use the QEC basis species (see metrics).
Volume is calculated using amino acid group additivity as described by Dick et al. (2006).
Phylostrata are not compositional metrics, but are retrieved by matching UniProt accession numbers in a data file (see PS).
Because phylostratum numbers are discrete values, mean values are calculated regardless of the value of mfun.
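A small made-up example suggests why a mean can be more informative than a median for discrete values such as phylostratum numbers: two groups can share the same median while their means differ. The numbers are invented for illustration and do not come from any dataset in the package.

```r
# Made-up phylostratum numbers for two groups of proteins
ps_down <- c(1, 1, 2, 2, 3)
ps_up   <- c(1, 2, 2, 2, 8)

# Both medians are 2, so the median difference hides the shift
median(ps_up) - median(ps_down)   # 0

# The means are 3 and 1.8, so the mean difference picks it up
mean(ps_up) - mean(ps_down)       # 1.2
```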
Set plot.it to TRUE to make a scatterplot.
Open red squares and filled blue circles stand for up-regulated and down-regulated proteins, respectively.
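The symbol convention can be reproduced with base graphics. This is a minimal sketch on made-up values, not the function's own plotting code; the axis variables, the legend and its position are assumptions chosen for illustration.

```r
# Toy data (made up): two variables per protein and the regulation flag
ZC   <- c(-0.10, -0.15, -0.20, -0.05, -0.12, -0.30)
nH2O <- c(-0.70, -0.75, -0.80, -0.72, -0.78, -0.68)
up2  <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)

# pch 0 is an open square, pch 19 is a filled circle
pch <- ifelse(up2, 0, 19)
col <- ifelse(up2, "red", "blue")

plot(ZC, nH2O, pch = pch, col = col)
legend("topright", legend = c("up-regulated", "down-regulated"),
       pch = c(0, 19), col = c("red", "blue"))
```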
Dick, J. M., LaRowe, D. E. and Helgeson, H. C. (2006) Temperature, pressure, and electrochemical constraints on protein speciation: Group additivity calculation of the standard molal thermodynamic properties of ionized unfolded proteins. Biogeosciences 3, 311-336. https://doi.org/10.5194/bg-3-311-2006
# load one of the differential expression datasets
pd <- pdat_colorectal("JKMF10")
# default variables: ZC and nH2O
get_comptab(pd, plot.it = TRUE)
# protein length and per-residue volume
get_comptab(pd, "nAA", "V0", plot.it = TRUE)