The performance_metrics object consists of nine tables (slots) that together
form a relational database of a subset of performance metrics. Each
performance metric is an observation (row) in the performance_metrics
table (the first table).
performance_metrics
A table of PGS Performance Metrics (PPM). Each PPM (row) is uniquely identified by the ppm_id column. Columns:
A PGS Performance Metrics identifier. Example: "PPM000001".
Polygenic Score (PGS) identifier.
The author-reported trait that the PGS has been developed to predict. Example: "Breast Cancer".
Comma-separated list of covariates used in the prediction model to evaluate the PGS.
Any other information relevant to the understanding of the performance metrics.
publications
A table of publications. Each publication (row) is uniquely identified by the column pgp_id. Columns:
A PGS Performance Metrics identifier. Example: "PPM000001".
PGS Publication identifier. Example: "PGP000001".
PubMed identifier. Example: "25855707".
Publication date. Example: "2020-09-28". Note that the class of publication_date is Date.
Abbreviated name of the journal. Example: "Am J Hum Genet".
Publication title.
First author of the publication. Example: "Mavaddat N".
Digital Object Identifier (DOI). This variable is also curated to allow unpublished work (e.g. preprints) to be added to the catalog. Example: "10.1093/jnci/djv036".
sample_sets
A table of sample sets. Each sample set (row) is uniquely identified by the column pss_id. Columns:
A PGS Performance Metrics identifier. Example: "PPM000001".
A PGS Sample Set identifier. Example: "PSS000042".
samples
A table of samples. Each sample (row) is uniquely identified by the combination of values from the columns: ppm_id, pss_id, and sample_id. Columns:
A PGS Performance Metrics identifier. Example: "PPM000001".
A PGS Sample Set identifier. Example: "PSS000042".
Sample identifier. This is a surrogate key to identify each sample.
Sample stage: should always be evaluation ("eval").
Number of individuals included in the sample.
Number of cases.
Number of controls.
Percentage of male participants.
Detailed phenotype description.
Author-reported ancestry, mapped to the best matching ancestry category from the NHGRI-EBI GWAS Catalog framework (see ancestry_categories for possible values).
A more detailed description of sample ancestry that usually matches the most specific description given by the authors (e.g. French, Chinese).
Author-reported countries of recruitment (if available).
Any additional description not captured in the other columns (e.g. founder or genetically isolated populations, or further description of admixed samples).
Associated GWAS Catalog study accession identifier. Example: "GCST002735".
PubMed identifier.
Any additional description about the samples (e.g. sub-cohort information).
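The composite key of the samples table can be illustrated with a small sketch. It uses plain Python dictionaries as stand-ins for table rows (the object itself is not a Python structure), the rows are invented for illustration, and the helper name `index_by_key` is hypothetical. Only the three key columns named above are used.

```python
# Toy stand-ins for rows of the samples table; all values are invented.
samples = [
    {"ppm_id": "PPM000001", "pss_id": "PSS000042", "sample_id": 1},
    {"ppm_id": "PPM000001", "pss_id": "PSS000042", "sample_id": 2},
    {"ppm_id": "PPM000002", "pss_id": "PSS000043", "sample_id": 1},
]


def index_by_key(rows, key_columns):
    """Index rows by a composite key (hypothetical helper).

    Raises ValueError if two rows share the same key, which would mean
    the chosen columns do not identify rows uniquely.
    """
    index = {}
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key in index:
            raise ValueError(f"duplicate key: {key}")
        index[key] = row
    return index


# The combination (ppm_id, pss_id, sample_id) identifies each toy row.
by_key = index_by_key(samples, ("ppm_id", "pss_id", "sample_id"))

# In this toy data sample_id repeats across sample sets, so on its own
# it would not be enough to identify a row.
sample_id_is_unique = len({row["sample_id"] for row in samples}) == len(samples)
```

Looking up `by_key[("PPM000001", "PSS000042", 2)]` then returns the second toy row, which is the kind of lookup needed when matching rows of the demographics or cohorts tables back to their sample.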
demographics
A table of sample demographic variables. Each demographic variable (row) is uniquely identified by the combination of values from the columns: ppm_id, pss_id, sample_id, and variable. Columns:
A PGS Performance Metrics identifier. Example: "PPM000001".
A PGS Sample Set identifier. Example: "PSS000042".
Sample identifier. This is a surrogate key to identify each sample.
Demographics variable. The following columns describe the indicated variable.
Type of statistical estimate for variable.
The variable's statistical value.
Unit of the variable.
Measure of statistical dispersion for variable, e.g. standard error (se) or standard deviation (sd).
The value of the measure of dispersion.
Type of statistical interval for variable: range, iqr (interquartile range), ci (confidence interval).
Interval lower bound.
Interval upper bound.
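Because the demographics table stores one variable per row, with the estimate and interval in separate columns, a row has to be read across several columns to be interpreted. The sketch below shows one way to render such a row as text. Apart from `ppm_id`, `pss_id`, `sample_id` and `variable`, the column names (`estimate_type`, `estimate`, `unit`, `interval_type`, `interval_lower`, `interval_upper`) are assumptions chosen to mirror the descriptions above, the values are invented, and `describe` is a hypothetical helper.

```python
# Toy rows in long format; column names other than the four key columns
# are assumed, and all values are invented for illustration.
demographics = [
    {
        "ppm_id": "PPM000001", "pss_id": "PSS000042", "sample_id": 1,
        "variable": "age", "estimate_type": "mean", "estimate": 55.0,
        "unit": "years", "interval_type": "range",
        "interval_lower": 40.0, "interval_upper": 70.0,
    },
    {
        "ppm_id": "PPM000001", "pss_id": "PSS000042", "sample_id": 1,
        "variable": "follow_up_time", "estimate_type": "median",
        "estimate": 10.5, "unit": "years", "interval_type": None,
        "interval_lower": None, "interval_upper": None,
    },
]


def describe(row):
    """Render one long-format row as a short human-readable string."""
    text = (
        f"{row['variable']}: {row['estimate_type']} = "
        f"{row['estimate']} {row['unit']}"
    )
    # The interval is optional; append it only when one is reported.
    if row["interval_type"] is not None:
        text += (
            f" ({row['interval_type']}: "
            f"{row['interval_lower']} to {row['interval_upper']})"
        )
    return text


descriptions = [describe(row) for row in demographics]
```

Keeping one variable per row means new demographic variables can be added without changing the table layout; the cost is that consumers must filter on `variable` before reading the estimate.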
cohorts
A table of cohorts. Each cohort (row) is uniquely identified by the combination of values from the columns: ppm_id, sample_id and cohort_symbol. Columns:
A PGS Performance Metrics identifier. Example: "PPM000001".
Sample identifier. This is a surrogate key to identify each sample.
Cohort symbol.
Cohort full name.
pgs_effect_sizes
A table of effect sizes per standard deviation change in PGS. Examples include regression coefficients (betas) for continuous traits, odds ratios (OR) and/or hazard ratios (HR) for dichotomous traits depending on the availability of time-to-event data. Each effect size is uniquely identified by the combination of values from the columns: ppm_id and effect_size_id. Columns:
A PGS Performance Metrics identifier. Example: "PPM000001".
Effect size identifier. This is a surrogate key to identify each effect size.
Long notation of the effect size (e.g. Odds Ratio).
Short notation of the effect size (e.g. OR).
The estimate's value.
Unit of the estimate.
Measure of statistical dispersion for the estimate, e.g. standard error (se) or standard deviation (sd).
The value of the measure of dispersion.
Type of statistical interval for the estimate: range, iqr (interquartile range), ci (confidence interval).
Interval lower bound.
Interval upper bound.
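Since each effect size carries an estimate together with optional interval bounds, a simple sanity check is to confirm that the estimate lies between the lower and upper bound whenever both are reported. The sketch below illustrates this on plain Python dictionaries. The column names `estimate`, `interval_lower` and `interval_upper` are assumptions mirroring the descriptions above, the values are invented, and `interval_problems` is a hypothetical helper, not part of any package.

```python
# Toy rows of an effect sizes table; column names other than ppm_id and
# effect_size_id are assumed, and all values are invented.
effect_sizes = [
    {"ppm_id": "PPM000001", "effect_size_id": 1,
     "estimate": 1.5, "interval_lower": 1.2, "interval_upper": 1.9},
    {"ppm_id": "PPM000002", "effect_size_id": 1,
     "estimate": 1.5, "interval_lower": 1.9, "interval_upper": 1.2},
    {"ppm_id": "PPM000003", "effect_size_id": 1,
     "estimate": 0.8, "interval_lower": None, "interval_upper": None},
]


def interval_problems(rows):
    """Return the (ppm_id, effect_size_id) keys of inconsistent rows.

    A row is flagged when both bounds are present and the estimate does
    not lie within [interval_lower, interval_upper].
    """
    problems = []
    for row in rows:
        lower, upper = row["interval_lower"], row["interval_upper"]
        if lower is None or upper is None:
            continue  # no interval reported, nothing to check
        if not (lower <= row["estimate"] <= upper):
            problems.append((row["ppm_id"], row["effect_size_id"]))
    return problems


flagged = interval_problems(effect_sizes)
```

Here the second toy row is flagged because its bounds are swapped. Flagged keys use the same (ppm_id, effect_size_id) combination that identifies rows in the table, so they can be traced back directly.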
pgs_classification_metrics
A table of classification metrics. Examples include the Area Under the Receiver Operating Characteristic curve (AUROC) or Harrell's C-index (Concordance statistic). Columns:
A PGS Performance Metrics identifier. Example: "PPM000001".
Classification metric identifier. This is a surrogate key to identify each classification metric.
Long notation of the classification metric (e.g. Concordance Statistic).
Short notation of the classification metric (e.g. C-index).
The estimate's value.
Unit of the estimate.
Measure of statistical dispersion for the estimate, e.g. standard error (se) or standard deviation (sd).
The value of the measure of dispersion.
Type of statistical interval for the estimate: range, iqr (interquartile range), ci (confidence interval).
Interval lower bound.
Interval upper bound.
pgs_other_metrics
A table of other metrics that are neither effect sizes nor classification metrics. Examples include: R² (proportion of the variance explained), or reclassification metrics. Columns:
A PGS Performance Metrics identifier. Example: "PPM000001".
Other metric identifier. This is a surrogate key to identify each metric.
Long notation of the metric. Example: "Proportion of the variance explained".
Short notation of the metric. Example: "R²".
The estimate's value.
Unit of the estimate.
Measure of statistical dispersion for the estimate, e.g. standard error (se) or standard deviation (sd).
The value of the measure of dispersion.
Type of statistical interval for the estimate: range, iqr (interquartile range), ci (confidence interval).
Interval lower bound.
Interval upper bound.
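The three metric tables (pgs_effect_sizes, pgs_classification_metrics and pgs_other_metrics) share the ppm_id column and a similar column layout, so they can be stacked to list everything reported for one PPM. The sketch below shows the idea with plain Python dictionaries as stand-ins for the tables. The column names `estimate_type` and `estimate` are assumptions, the values are invented, and `stack_metrics` and `metrics_for` are hypothetical helpers.

```python
# Toy stand-ins for the three metric tables; values are invented and the
# columns estimate_type and estimate are assumed names.
tables = {
    "pgs_effect_sizes": [
        {"ppm_id": "PPM000001", "estimate_type": "OR", "estimate": 1.5},
    ],
    "pgs_classification_metrics": [
        {"ppm_id": "PPM000001", "estimate_type": "C-index", "estimate": 0.62},
        {"ppm_id": "PPM000002", "estimate_type": "AUROC", "estimate": 0.70},
    ],
    "pgs_other_metrics": [
        {"ppm_id": "PPM000002", "estimate_type": "R²", "estimate": 0.05},
    ],
}


def stack_metrics(tables):
    """Concatenate the metric tables, recording each row's source table."""
    stacked = []
    for table_name, rows in tables.items():
        for row in rows:
            stacked.append({"table": table_name, **row})
    return stacked


def metrics_for(stacked, ppm_id):
    """Select every stacked row that belongs to the given ppm_id."""
    return [row for row in stacked if row["ppm_id"] == ppm_id]


stacked = stack_metrics(tables)
ppm1 = metrics_for(stacked, "PPM000001")
```

Recording the source table in a separate field keeps the distinction between effect sizes, classification metrics and other metrics after stacking.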