Computes and extracts feature expressions for features used in a familiarEnsemble object.
extract_feature_expression(
object,
data,
feature_similarity,
sample_similarity,
feature_cluster_method = waiver(),
feature_linkage_method = waiver(),
feature_similarity_metric = waiver(),
sample_cluster_method = waiver(),
sample_linkage_method = waiver(),
sample_similarity_metric = waiver(),
evaluation_times = waiver(),
message_indent = 0L,
verbose = FALSE,
...
)
A list with a data.table containing feature expressions.
A familiarEnsemble object, which is an ensemble of one or more familiarModel objects.
A dataObject object, data.table or data.frame that constitutes the data that are assessed.
Table containing pairwise distances between samples. This is used to determine cluster information and to indicate which samples are similar. The table is created by the extract_sample_similarity method.
The method used to perform clustering. These are the same methods as for the cluster_method configuration parameter: none, hclust, agnes, diana and pam.
none cannot be used when extracting data regarding mutual correlation or feature expressions.
If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.
The method used for agglomerative clustering in hclust and agnes. These are the same methods as for the cluster_linkage_method configuration parameter: average, single, complete, weighted, and ward.
If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.
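To illustrate how the linkage method affects agglomerative clustering of features, the following is a minimal sketch in base R. It is not the package's internal implementation. The toy data, the use of stats::hclust, and the conversion of correlation to distance as one minus the absolute correlation are all assumptions made for illustration. Only the linkage names that stats::hclust shares with the list above (average, single, complete) are used; how weighted and ward map onto a clustering backend is not shown here.

```r
# Toy data (hypothetical): 8 samples in rows, 4 features in columns.
set.seed(1)
x <- matrix(rnorm(32), nrow = 8, dimnames = list(NULL, paste0("feature_", 1:4)))

# Assumed distance between features: one minus the absolute correlation.
feature_distance <- stats::as.dist(1 - abs(stats::cor(x)))

# Agglomerative clustering with three linkage methods.
linkages <- c("average", "single", "complete")
trees <- lapply(linkages, function(m) stats::hclust(feature_distance, method = m))
names(trees) <- linkages

# Cut every tree into two clusters of features.
clusters <- lapply(trees, stats::cutree, k = 2)
print(clusters)
```

Each element of clusters assigns the four features to one of two clusters. Comparing the elements shows whether the choice of linkage changes the grouping for a given data set.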
Metric to determine pairwise similarity between features. Similarity is computed in the same manner as for clustering, and feature_similarity_metric therefore has the same options as cluster_similarity_metric: mcfadden_r2, cox_snell_r2, nagelkerke_r2, spearman, kendall and pearson.
If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.
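The three correlation-based metrics can behave quite differently on the same pair of features. The sketch below uses base R's stats::cor on hypothetical toy data to show this. It illustrates the metrics themselves, not the package's own computation, and the pseudo-R2 metrics (mcfadden_r2, cox_snell_r2, nagelkerke_r2) are not covered.

```r
# Toy data (hypothetical): 10 samples in rows, 4 features in columns.
set.seed(2)
x <- matrix(rnorm(40), nrow = 10, dimnames = list(NULL, paste0("feature_", 1:4)))

# Make feature_2 a monotone, but non-linear, function of feature_1.
x[, "feature_2"] <- exp(x[, "feature_1"])

# Pairwise similarity between features for each correlation-based metric.
metrics <- c("spearman", "kendall", "pearson")
similarity <- lapply(metrics, function(m) stats::cor(x, method = m))
names(similarity) <- metrics

# Similarity between feature_1 and feature_2 under each metric.
pair_similarity <- vapply(
  similarity,
  function(s) s["feature_1", "feature_2"],
  numeric(1)
)
print(pair_similarity)
```

Because the relation between the two features is monotone, the rank-based metrics (spearman, kendall) treat them as perfectly similar, whereas pearson reports a lower value since the relation is not linear.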
The method used to perform clustering based on distance between samples. These are the same methods as for the cluster_method configuration parameter: hclust, agnes, diana and pam.
none cannot be used when extracting data for feature expressions.
If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.
The method used for agglomerative clustering in hclust and agnes. These are the same methods as for the cluster_linkage_method configuration parameter: average, single, complete, weighted, and ward.
If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.
Metric to determine pairwise similarity between samples. Similarity is computed in the same manner as for clustering, but sample_similarity_metric has different options that are better suited to computing distance between samples instead of between features: gower, euclidean.
The underlying feature data are scaled to the \([0, 1]\) range (for numerical features) using the feature values across the samples. The required normalisation parameters can optionally be computed from feature data with the outer 5% (on both sides) of feature values trimmed or winsorised. To do so, append _trim (trimming) or _winsor (winsorising) to the metric name. This reduces the effect of outliers somewhat.
If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects.
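The scaling step described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: scale_unit is a hypothetical helper, the use of stats::quantile to locate the outer 5% is an assumption, and values outside the trimmed or winsorised range are left unclipped here, which may differ from what the package does. Constant features, for which the range is zero, are not handled.

```r
# Hypothetical helper: scale a numeric feature using its minimum and maximum,
# optionally derived from data with the outer fraction trimmed or winsorised.
scale_unit <- function(v, method = c("none", "trim", "winsor"), fraction = 0.05) {
  method <- match.arg(method)
  bounds <- stats::quantile(v, probs = c(fraction, 1 - fraction), names = FALSE)
  reference <- switch(
    method,
    none = v,
    trim = v[v >= bounds[1] & v <= bounds[2]],      # drop the outer values
    winsor = pmin(pmax(v, bounds[1]), bounds[2])    # clamp the outer values
  )
  lower <- min(reference)
  upper <- max(reference)
  (v - lower) / (upper - lower)
}

# Toy data (hypothetical): 20 samples, 3 features, one extreme outlier.
set.seed(3)
x <- matrix(rnorm(60), nrow = 20, dimnames = list(NULL, paste0("feature_", 1:3)))
x[1, 1] <- 50

scaled_plain <- apply(x, 2, scale_unit, method = "none")
scaled_winsor <- apply(x, 2, scale_unit, method = "winsor")

# Euclidean distance between samples, computed on the scaled features.
sample_distance <- stats::dist(scaled_winsor, method = "euclidean")
```

With plain scaling, the outlier in feature_1 defines the upper end of the range and compresses the remaining samples towards zero. With winsorised normalisation parameters the bulk of the samples keeps its spread, and the outlier ends up above 1 in this unclipped sketch.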
One or more time points that are used in the analysis of survival problems when data have to be assessed at a set time, e.g. calibration. If not provided explicitly, this parameter is read from settings used at creation of the underlying familiarModel objects. Only used for survival outcomes.
Number of indentation steps for messages shown during computation and extraction of various data elements.
Flag to indicate whether feedback should be provided on the computation and extraction of various data elements.
Unused arguments.