canprot (version 1.0.0)

get_comptab: Calculate Compositional Differences

Description

Compute differences of carbon oxidation state, stoichiometric hydration state and other compositional metrics between groups of up- and down-regulated proteins.

Usage

get_comptab(pdat, var1 = "ZC", var2 = "nH2O", plot.it = FALSE,
    mfun = "median", oldstyle = FALSE)

Arguments

pdat

list, data object generated by a pdat_ function

var1

character, the first variable

var2

character, the second variable

plot.it

logical, make a scatterplot?

mfun

character, either "median" or "mean"

oldstyle

logical, also calculate CLES and p-values?

Value

A data frame is returned invisibly, containing the columns dataset, description, n1 (number of down-regulated proteins), and n2 (number of up-regulated proteins), followed by two sets of columns, one for each variable. These are denoted generically as (var.mfun1, var.mfun2, var.diff, var.CLES, var.p.value), where var is replaced by the name of var1 or var2, and mfun is replaced by the value of mfun. For example, ZC.median1 and ZC.median2 are the median ZC of the down- and up-regulated proteins, respectively.
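
As a rough illustration of this naming scheme, the base R sketch below builds the column names that would be expected for a given pair of variables. The helper comptab_colnames is hypothetical and is not part of canprot; it only mirrors the pattern described above and does not check which columns are actually present for a given setting of oldstyle.

```r
# Hypothetical helper (not part of canprot): build the column names
# described above for a pair of variables and a summary function
comptab_colnames <- function(var1 = "ZC", var2 = "nH2O", mfun = "median") {
  # Generic suffixes: mfun1, mfun2, diff, CLES, p.value
  suffixes <- c(paste0(mfun, 1:2), "diff", "CLES", "p.value")
  c("dataset", "description", "n1", "n2",
    paste(var1, suffixes, sep = "."),
    paste(var2, suffixes, sep = "."))
}

# Column names for the default arguments
cols <- comptab_colnames()
print(cols)
```

With mfun = "mean", the same pattern gives names such as ZC.mean1 and ZC.mean2.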

Details

The available variables are:

ZC         average oxidation state of carbon (see ZCAA)
nH2O       stoichiometric hydration state per residue (see H2OAA)
nC         number of carbon atoms per residue
nN         number of nitrogen atoms per residue
nS         number of sulfur atoms per residue
V0         standard molal volume per residue
nAA        protein length (number of amino acids)
GRAVY      grand average of hydropathicity (see GRAVY)
pI         isoelectric point (see pI)
PS_TPPG17  phylostratum (see PS)
PS_LMM16   phylostratum (see PS)
MW         molecular weight per residue

Volume is calculated using amino acid group additivity as described by Dick et al. (2006).

Differentially expressed proteins are identified by the value of pdat$up2 (TRUE for up-regulated proteins and FALSE for down-regulated proteins). The differences are calculated as (median for up-regulated proteins) - (median for down-regulated proteins); if mfun is mean, means of the groups are used instead. If oldstyle is TRUE, the function also calculates the common language effect size (CLES, in percent) and p-value for each variable.
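
The calculation described above can be sketched in base R on made-up data. The values below are hypothetical, and the definitions used for CLES (percentage of up/down pairs in which the up-regulated protein has the larger value) and for the p-value (Wilcoxon rank-sum test) are assumptions made for illustration; they are not confirmed here as the method used by get_comptab.

```r
# Hypothetical per-protein values of one metric (e.g. ZC); not real data
values <- c(-0.20, -0.15, -0.10, -0.12, -0.05, 0.00)
# TRUE = up-regulated, FALSE = down-regulated (same meaning as pdat$up2)
up2 <- c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)
# Name of the summary function, as in the mfun argument
mfun <- "median"

x1 <- values[!up2]  # down-regulated group
x2 <- values[up2]   # up-regulated group

# Summary value for each group, then (up-regulated) - (down-regulated)
m1 <- match.fun(mfun)(x1)
m2 <- match.fun(mfun)(x2)
var.diff <- m2 - m1

# Assumed CLES definition: percentage of (up, down) pairs in which
# the up-regulated value is larger
var.CLES <- 100 * mean(outer(x2, x1, ">"))

# Assumed test: two-sided Wilcoxon rank-sum test
var.p.value <- wilcox.test(x2, x1)$p.value

print(c(n1 = length(x1), n2 = length(x2), diff = var.diff, CLES = var.CLES))
```

Setting mfun <- "mean" in this sketch switches the group summaries to means, which parallels the behavior described for the mfun argument.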

Phylostrata are not compositional metrics, but are retrieved by matching UniProt accession numbers in a data file (see PS). Because phylostratum numbers are discrete values, mean values are calculated regardless of the value of mfun.
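
A minimal sketch of this rule, assuming that phylostratum variables can be recognized by the "PS" prefix of their names: the helper summary_fun is hypothetical and not part of canprot, and the phylostratum numbers are made up.

```r
# Hypothetical helper (not part of canprot): choose the summary function,
# forcing the mean for phylostratum variables (names starting with "PS")
summary_fun <- function(var, mfun = "median") {
  if (startsWith(var, "PS")) "mean" else mfun
}

# Made-up phylostratum numbers for five proteins
PS <- c(1, 1, 1, 2, 6)

# The median of discrete values ignores the spread above the middle value,
# while the mean reflects it
PS.median <- median(PS)
PS.mean <- match.fun(summary_fun("PS_TPPG17"))(PS)
print(c(median = PS.median, mean = PS.mean))
```

In this example the median stays at the most common phylostratum while the mean shifts with the single older protein, which suggests why means are more informative for discrete values.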

Set plot.it to TRUE to make a scatterplot. Open red squares and filled blue circles stand for up-regulated and down-regulated proteins, respectively.
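
The symbol convention can be reproduced with base graphics, as in the sketch below. The data are hypothetical, and the plotting calls are an approximation of the described appearance rather than the code used by get_comptab.

```r
# Hypothetical values of two metrics for six proteins; not real data
ZC   <- c(-0.20, -0.15, -0.10, -0.12, -0.05, 0.00)
nH2O <- c(-0.70, -0.80, -0.75, -0.85, -0.90, -0.95)
# TRUE = up-regulated, FALSE = down-regulated
up2  <- c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)

# pch 0 is an open square and pch 19 is a filled circle in base graphics
pch <- ifelse(up2, 0, 19)
col <- ifelse(up2, "red", "blue")

# Scatterplot of the two variables, one point per protein
plot(ZC, nH2O, pch = pch, col = col, xlab = "ZC", ylab = "nH2O")
legend("topright", legend = c("up-regulated", "down-regulated"),
       pch = c(0, 19), col = c("red", "blue"))
```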

References

Dick, J. M., LaRowe, D. E. and Helgeson, H. C. (2006) Temperature, pressure, and electrochemical constraints on protein speciation: Group additivity calculation of the standard molal thermodynamic properties of ionized unfolded proteins. Biogeosciences 3, 311-336. https://doi.org/10.5194/bg-3-311-2006

Examples

# NOT RUN {
library(canprot)
library(CHNOSZ)
# get the data object for the "JKMF10" dataset
pd <- pdat_colorectal("JKMF10")
# default variables: ZC and nH2O
get_comptab(pd, plot.it = TRUE)
# protein length and per-residue volume
get_comptab(pd, "nAA", "V0", plot.it = TRUE)
# }