- df
Dataframe. It may contain non-numerical columns: they will
be filtered out.
- method
Character. Any of: c("pearson", "kendall", "spearman").
- use
Character. Method for computing covariances in the presence
of missing values. Check stats::cor for options.
- pvalue
Boolean. Return a list with the correlations and the statistical
significance (p-value) of each one?
- padjust
Character. NULL to skip, or any of p.adjust.methods to
calculate adjusted p-values for multiple comparisons using p.adjust().
- half
Boolean. Return only half of the matrix? The redundant
symmetrical correlations will be NA.
- dec
Integer. Number of decimals to round correlations and p-values.
- ignore
Vector or character. Which column(s) should be ignored?
- dummy
Boolean. Should One Hot (Smart) Encoding (ohse())
be applied to categorical columns?
- redundant
Boolean. Should we keep redundant columns? I.e., if the
column only has two different values, should we keep both new columns?
If set to NULL, only binary variables will dump redundant columns.
- logs
Boolean. Calculate log(x)+1 for numerical columns?
- limit
Integer. Limit one hot encoding to the n most frequent
values of each column. Set to NA to ignore argument.
- top
Integer. Select top N most relevant variables? Filtered
and sorted by mean of each variable's correlations.
- ...
Additional parameters passed to ohse, corr, and/or cor.test.
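
To make the roles of method, use, pvalue, padjust, half, and dec more concrete, the following is a minimal sketch in base R only. It is not this function's implementation: the toy data frame and all variable names are made up for illustration, and which triangle of the matrix gets blanked for half is an assumption.

```r
# Toy data (hypothetical): three numeric columns, one character column
df <- data.frame(
  a = c(1, 2, 3, 4, 5, NA),
  b = c(2, 4, 5, 4, 5, 7),
  c = c(9, 7, 6, 4, 2, 1),
  label = c("x", "y", "x", "y", "x", "y")
)

# df: keep only the numeric columns
num <- df[vapply(df, is.numeric, logical(1))]

# method / use: both are arguments of stats::cor
cors <- cor(num, method = "pearson", use = "pairwise.complete.obs")

# pvalue: one cor.test() per pair of columns
pairs <- combn(names(num), 2)
pvals <- apply(pairs, 2, function(p) {
  cor.test(num[[p[1]]], num[[p[2]]], method = "pearson")$p.value
})

# padjust: any value listed in p.adjust.methods, e.g. "holm"
padj <- p.adjust(pvals, method = "holm")

# half: blank the redundant symmetrical entries
# (assumption: the upper triangle is the one set to NA)
half <- cors
half[upper.tri(half)] <- NA

# dec: round to a fixed number of decimals
half <- round(half, 2)
print(half)
```

Pairwise deletion ("pairwise.complete.obs") lets each correlation use every row where both columns are present, so the single NA in column a only affects the pairs that involve a.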