This function extracts Wald or Likelihood Ratio test results from a sleuth object.
sleuth_results(obj, test, test_type = "wt", which_model = "full",
rename_cols = TRUE, show_all = TRUE,
pval_aggregate = obj$pval_aggregate, ...)
obj
: a sleuth object
test
: a character string denoting the test to extract. Possible tests can be found by using models(obj).
test_type
: 'wt' for Wald test or 'lrt' for Likelihood Ratio test.
which_model
: a character string denoting the model. If extracting a Wald test, use the model name. Not used if extracting a likelihood ratio test.
rename_cols
: if TRUE, will rename some columns to be shorter and consistent with the vignette
show_all
: if TRUE, will show all transcripts (not only the ones passing filters). The transcripts that do not pass filters will have NA values in most columns.
pval_aggregate
: if TRUE and both target_mapping and aggregation_column were provided to sleuth_prep, use Lancaster's method to aggregate p-values by the aggregation_column.
...
: advanced options for sleuth_results. See details.
If pval_aggregate is FALSE, returns a data.frame with the following columns:
target_id
: transcript name, e.g. "ENST#####" (dependent on the transcriptome used in kallisto). If gene_mode is TRUE, this will instead be the IDs specified by the obj$gene_column from obj$target_mapping.
...
: if there is a target mapping data frame, all of the annotation columns are added from obj$target_mapping before the other columns.
pval
: p-value of the chosen model
qval
: false discovery rate adjusted p-value, using Benjamini-Hochberg (see p.adjust)
test_stat
(LRT only): Chi-squared test statistic (likelihood ratio test). Only seen with Likelihood Ratio test results.
rss
(LRT only): the residual sum of squares under the "null model". Only seen with Likelihood Ratio test results.
degrees_free
(LRT only): the degrees of freedom (equal to difference between the two models). Only seen with Likelihood Ratio test results.
b
(Wald only): 'beta' value (effect size). Technically a biased estimator of the fold change. Only seen with Wald test results.
se_b
(Wald only): standard error of the beta. Only seen with Wald test results.
mean_obs
: mean of natural log counts of observations
var_obs
: variance of observation
tech_var
: technical variance of observation from the bootstraps (named 'sigma_q_sq' if rename_cols is FALSE)
sigma_sq
: raw estimator of the variance once the technical variance has been removed
smooth_sigma_sq
: smooth regression fit for the shrinkage estimation
final_sigma_sq
: max(sigma_sq, smooth_sigma_sq); used for covariance estimation of beta (named 'smooth_sigma_sq_pmax' if rename_cols is FALSE)
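The definitions of qval and of the max(sigma_sq, smooth_sigma_sq) column can be reproduced with base R. The following is a minimal sketch on made-up vectors (none of the values come from a real sleuth object), intended only to show what the two columns mean:

```r
# Illustrative values only; not taken from a real sleuth object.
pval <- c(0.001, 0.02, 0.03, 0.5)
sigma_sq <- c(0.10, 0.40, 0.05, 0.30)
smooth_sigma_sq <- c(0.20, 0.25, 0.15, 0.30)

# Benjamini-Hochberg adjustment, as referenced for the qval column.
qval <- p.adjust(pval, method = "BH")

# Element-wise maximum of the raw and smoothed variance estimates.
final_sigma_sq <- pmax(sigma_sq, smooth_sigma_sq)
```

Note that pmax works element by element, so each transcript gets the larger of its own two estimates.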
If pval_aggregate is TRUE, returns a data.frame with the following columns:
target_id
: gene ID specified by obj$gene_column, e.g. "ENSG#####" (dependent on the transcriptome used in kallisto).
...
: all of the additional annotation columns (not 'target_id' or obj$gene_column) are added from obj$target_mapping before the other columns.
num_aggregated_transcripts
: the number of transcripts aggregated for a given gene. These only include
filtered transcripts.
sum_mean_obs_counts
: this is the sum of the mean observations across all filtered transcripts
within a gene. Note that the weighting function is applied before summing.
pval
: the aggregated p-value calculated by the Lancaster method. See the aggregation package for details.
qval
: adjusted p-values using the Benjamini-Hochberg method.
The columns returned by this function will depend on a few factors: whether the test is a Wald test or
Likelihood Ratio test, and whether pval_aggregate is TRUE.
The sleuth model is a measurement-error-in-the-response model. It attempts to segregate the variation due to
the inference procedure by kallisto from the variation due to the covariates -- the biological and technical
factors of the experiment (represented by the columns in obj$sample_to_covariates). For the Wald test,
the 'b' column represents the estimate of the selected coefficient. In the default setting, it is analogous to,
but not equivalent to, the fold-change. The transformed values are on the natural-log scale, and so
the estimated coefficient is also on the natural-log scale. This value takes into account the
'inferential variance' estimated from the kallisto bootstraps.
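Because 'b' is on the natural-log scale, it can be exponentiated or rescaled to log2 for reporting. The sketch below uses a hypothetical results table (the data frame, its values, and the added column names are assumptions for illustration), and the resulting ratio should be read as analogous to a fold change rather than as an exact one:

```r
# Hypothetical Wald results; the 'b' and 'se_b' values are made up.
wald_res <- data.frame(
  target_id = c("tx1", "tx2", "tx3"),
  b = c(log(2), 0, -log(4)),
  se_b = c(0.2, 0.1, 0.3)
)

# Multiplicative change implied by a natural-log coefficient.
wald_res$approx_fold_change <- exp(wald_res$b)

# The same coefficient re-expressed on the log2 scale.
wald_res$b_log2 <- wald_res$b / log(2)
```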
If the user wishes to get gene-level results from this function, there are two ways of doing so:
p-value aggregation mode: if the pval_aggregate argument is TRUE, this function will
aggregate the transcript-level p-values to the gene level using the Lancaster method. See below for advanced
options related to this mode. This is the recommended way to do gene-level aggregation. See the paper
count aggregation mode: This is the gene-level aggregation method introduced in sleuth version 0.28.1.
This mode is activated if obj$gene_mode is TRUE. In this mode, the modeling and testing were done
using aggregated counts (or TPMs), and so the results are the same as for the transcript-level results, except the
target IDs are now gene IDs instead of transcript IDs.
An important note if pval_aggregate or the old gene_mode is TRUE: when combining the
gene annotations from obj$target_mapping, all of the columns except for the transcript ID,
obj$target_mapping$target_id, will be included. If there are transcript-level entries for any of the other
columns, this will result in duplicate rows in the results table (usually an undesirable result).
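The duplicate-row effect can be seen on a toy mapping. This is a sketch with a hypothetical target mapping (every column name other than target_id is an assumption), showing why a transcript-level annotation column is best dropped before gene-level results are requested:

```r
# Hypothetical target mapping; column names other than target_id are illustrative.
target_mapping <- data.frame(
  target_id = c("tx1", "tx2", "tx3"),
  gene_id = c("geneA", "geneA", "geneB"),
  gene_name = c("A", "A", "B"),
  transcript_biotype = c("protein_coding", "retained_intron", "protein_coding")
)

# Dropping only the transcript ID leaves a transcript-level column behind,
# so geneA still occupies two distinct rows.
keep <- setdiff(names(target_mapping), "target_id")
with_tx_column <- unique(target_mapping[, keep])

# Keeping only gene-level columns yields one row per gene.
gene_only <- unique(target_mapping[, c("gene_id", "gene_name")])
```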
Here are advanced options for customizing the p-value aggregation procedure:
weight_func
: if pval_aggregate is TRUE, then this is used to weight the p-values for Lancaster's method. This function must take the observed means of the transcripts as the only defined argument. The default is identity.
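To make the role of the weights concrete, here is a base R sketch of weighted Lancaster aggregation for the transcripts of a single gene. The helper lancaster_sketch is hypothetical and is not part of sleuth; the p-values and observed means are made up, and the package's own implementation may differ in details such as the handling of missing or extreme values:

```r
# Weighted Lancaster aggregation written with base R only.
# 'lancaster_sketch' is a hypothetical helper, not a sleuth function.
lancaster_sketch <- function(pvals, weights) {
  # Each p-value maps to the upper quantile of a chi-squared distribution
  # with 'weight' degrees of freedom (a gamma with shape w/2 and scale 2).
  stat <- sum(qgamma(pvals, shape = weights / 2, scale = 2, lower.tail = FALSE))
  # Under the null, the sum is chi-squared with sum(weights) degrees of freedom.
  pchisq(stat, df = sum(weights), lower.tail = FALSE)
}

# Made-up transcript-level p-values and observed means for one gene.
pvals <- c(0.01, 0.20, 0.65)
mean_obs <- c(5.1, 3.2, 1.4)

# The documented default weight function is identity.
weight_func <- identity
gene_pval <- lancaster_sketch(pvals, weight_func(mean_obs))

# With every weight equal to 2, the statistic reduces to Fisher's method.
fisher_pval <- pchisq(-2 * sum(log(pvals)), df = 2 * length(pvals),
                      lower.tail = FALSE)
```

A custom weight function would replace identity here, for example one that takes the square root of the observed means, so that highly expressed transcripts dominate the aggregate less strongly.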
sleuth_wt and sleuth_lrt to compute tests, models to view which models were fit, and
tests to view which tests were performed (and can be extracted)
# NOT RUN {
models(sleuth_obj) # for this example, assume the formula is ~condition,
                   # and a coefficient is IP
results_table <- sleuth_results(sleuth_obj, 'conditionIP')
# }