Get differentially expressed proteins and amino acid compositions.
pdat_breast(dataset = 2020)
pdat_colorectal(dataset = 2020)
pdat_liver(dataset = 2020)
pdat_lung(dataset = 2020)
pdat_pancreatic(dataset = 2020)
pdat_prostate(dataset = 2020)
pdat_hypoxia(dataset = 2020)
pdat_secreted(dataset = 2020)
pdat_3D(dataset = 2020)
pdat_glucose(dataset = 2020)
pdat_osmotic_bact(dataset = 2020)
pdat_osmotic_euk(dataset = 2020)
pdat_osmotic_halo(dataset = 2020)
.pdat_multi(dataset = 2020)
.pdat_osmotic(dataset = 2017)
dataset
character, dataset name
A list consisting of:
dataset
Name of the dataset
description
Descriptive text for the dataset, used for making the tables in the vignettes (see mkvig)
pcomp
UniProt IDs together with amino acid compositions obtained using protcomp
up2
Logical vector with length equal to the number of proteins; TRUE for up-regulated proteins and FALSE for down-regulated proteins
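As a rough illustration of how the up2 element lines up with the proteins, the sketch below builds a made-up list with the four elements described above. The dataset name, protein IDs, and compositions are all invented, and the internal layout of pcomp (a list with uniprot and aa elements) is an assumption rather than something stated here.

```r
# Hypothetical stand-in for the list returned by a pdat_ function.
# All values are invented; the layout of pcomp is an assumption.
pdat <- list(
  dataset = "ABC20_example",
  description = "made-up description for illustration",
  pcomp = list(
    uniprot = c("ID1", "ID2", "ID3", "ID4"),
    aa = data.frame(Ala = c(10, 12, 8, 15), Gly = c(7, 9, 11, 6))
  ),
  up2 = c(TRUE, FALSE, TRUE, TRUE)
)

# up2 has one element per protein, so it can index the IDs directly
up_ids <- pdat$pcomp$uniprot[pdat$up2]
down_ids <- pdat$pcomp$uniprot[!pdat$up2]

# Count up- and down-regulated proteins
n_up <- sum(pdat$up2)
n_down <- sum(!pdat$up2)
```

Because up2 is a plain logical vector, the same indexing could be applied to the rows of the amino acid compositions.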
The pdat_ functions assemble lists of up- and down-regulated proteins and retrieve their amino acid compositions using protcomp.
The result can be used with get_comptab to make a table of compositional metrics that can then be plotted with diffplot.
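The workflow described above might look like the following minimal sketch. It assumes that get_comptab and diffplot can each be called with only their first argument, leaving all other arguments at their defaults; that is an assumption, not something stated here. The code is guarded so that it does nothing when the package or the pdat_ function is unavailable.

```r
# Minimal sketch: pdat_ function -> get_comptab -> diffplot.
# Calling get_comptab() and diffplot() with defaults only is an assumption.
if (requireNamespace("canprot", quietly = TRUE) &&
    "pdat_colorectal" %in% getNamespaceExports("canprot")) {
  # Differentially expressed proteins and compositions for one dataset
  pdat <- canprot::pdat_colorectal("JKMF10")
  # Table of compositional metrics for that dataset
  comptab <- canprot::get_comptab(pdat)
  # Plot the compositional differences
  canprot::diffplot(comptab)
}
```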
If dataset is 2020 (the default) or 2017, the function returns the names of all datasets in the compilation for the respective year.
Each dataset name starts with a reference key indicating the study (i.e. paper or other publication) where the data were reported. The reference keys are made by combining the first characters of the authors' family names with the 2-digit year of publication.
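The rule for building a reference key can be sketched as below. The helper make_refkey and the family names are hypothetical and serve only to illustrate the rule as stated; keys in the actual compilation may follow additional conventions not covered here.

```r
# Hypothetical helper: first character of each family name + 2-digit year
make_refkey <- function(family_names, year) {
  initials <- paste(substr(family_names, 1, 1), collapse = "")
  paste0(initials, sprintf("%02d", year %% 100))
}

# Invented author names chosen to give a key of the same form as "JKMF10"
key <- make_refkey(c("Jones", "Kim", "Moore", "Fox"), 2010)
```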
If a study has more than one dataset, the reference key is followed by an underscore and an identifier for the particular dataset.
This identifier is saved in the variable named stage in the functions, but can be any descriptive text.
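One way to picture the naming scheme is to split a dataset name at the underscore. The helper split_dataset_name below is hypothetical and is not the code used by the pdat_ functions; it assumes the reference key itself contains no underscore, so the split happens at the first one. The name "ABC20_example" is invented.

```r
# Hypothetical helper: separate the reference key from the dataset identifier
split_dataset_name <- function(name) {
  key <- sub("_.*$", "", name)
  # stage is NA when the name has no identifier after the key
  stage <- if (grepl("_", name, fixed = TRUE)) {
    sub("^[^_]*_", "", name)
  } else {
    NA_character_
  }
  list(key = key, stage = stage)
}

single <- split_dataset_name("JKMF10")
multi <- split_dataset_name("ABC20_example")
```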
To retrieve the data, provide a single dataset name in the dataset argument.
Protein expression data is read from the CSV files stored in extdata/expression/, under the subdirectory corresponding to the name of the pdat_ function.
Some of the functions also read amino acid compositions (e.g. for non-human proteins) from the files in extdata/aa/.
Descriptions for each function:
pdat_colorectal, pdat_pancreatic, pdat_breast, pdat_lung, pdat_prostate, and pdat_liver retrieve data for protein expression in different cancer types.
pdat_hypoxia gets data for cellular extracts in hypoxia and pdat_secreted gets data for secreted proteins (e.g. exosomes) in hypoxia.
pdat_3D retrieves data for 3D (e.g. tumor spheroids and aggregates) compared to 2D (monolayer) cell culture.
.pdat_osmotic retrieves data for hyperosmotic stress, for the 2017 compilation only.
In 2020, this compilation was expanded and split into pdat_osmotic_bact (bacteria), pdat_osmotic_euk (eukaryotic cells) and pdat_osmotic_halo (halophilic bacteria and archaea).
pdat_glucose gets data for high-glucose experiments in eukaryotic cells.
.pdat_multi retrieves data for studies that have multiple types of datasets (e.g. both cellular and secreted proteins in hypoxia), and is used internally by the specific functions (e.g. pdat_hypoxia and pdat_secreted).
# NOT RUN {
# List datasets in the 2017 compilation for colorectal cancer
pdat_colorectal(2017)
# Get proteins and amino acid compositions for one dataset
pdat_colorectal("JKMF10")
# }