Plotting the condensation potential is meant as a decision aid for choosing
which variables to include in an alluvial plot. All variables are first
transformed to categorical variables. The dataframe is then grouped and
summarized by two variables, and the pair that results in the greatest
condensation of the original dataframe is selected. After that, the variable
that offers the greatest condensation potential is added next, and this is
repeated until all variables have been added. The condensation in percent is
then plotted for each step along with the number of groups (flows) in the
dataframe. From experience, it is not advisable to have more than 1500 flows,
because the alluvial plot will then take a long time to render. If there is
a particular variable of interest in the dataframe, it can be chosen as the
starting variable with the first argument.
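
The selection procedure described above can be sketched in base R. This is an illustration under stated assumptions, not the function's actual implementation: the helper names (to_categorical, n_flows, condensation_steps) are hypothetical, numeric columns are assumed to be cut into 5 equal-width bins, and condensation is assumed to mean the percentage reduction in rows after grouping. Unlike the real argument, first is accepted here only as a string.

```r
# Hypothetical helpers; the binning rule and the condensation formula are assumptions.

# Turn every column into a factor; numeric columns with many values are binned.
to_categorical <- function(df, bins = 5) {
  df[] <- lapply(df, function(x) {
    if (is.numeric(x) && length(unique(x)) > bins) cut(x, breaks = bins) else factor(x)
  })
  df
}

# Number of distinct value combinations (flows) for a set of columns.
n_flows <- function(df, vars) {
  nrow(unique(df[, vars, drop = FALSE]))
}

# Greedy ordering: start with the best pair (or with `first`), then keep adding
# the variable that yields the fewest flows.
condensation_steps <- function(df, first = NULL) {
  stopifnot(is.data.frame(df), ncol(df) >= 2)
  df <- to_categorical(df)
  vars <- names(df)

  if (is.null(first)) {
    pairs <- combn(vars, 2, simplify = FALSE)
    pair_flows <- vapply(pairs, function(p) n_flows(df, p), integer(1))
    chosen <- pairs[[which.min(pair_flows)]]
  } else {
    stopifnot(first %in% vars)
    chosen <- first
  }

  while (length(chosen) < length(vars)) {
    rest <- setdiff(vars, chosen)
    cand <- vapply(rest, function(v) n_flows(df, c(chosen, v)), integer(1))
    chosen <- c(chosen, rest[which.min(cand)])
  }

  # One row per step, starting with the first two variables.
  idx <- seq_along(chosen)[-1]
  flows <- vapply(idx, function(i) n_flows(df, chosen[seq_len(i)]), integer(1))
  res <- data.frame(
    n_vars = idx,
    last_added = chosen[idx],
    flows = flows,
    condensation_pct = 100 * (1 - flows / nrow(df))
  )
  attr(res, "order") <- chosen
  res
}

steps <- condensation_steps(mtcars)
print(steps)
```

Filtering the result with steps[steps$flows <= 1500, ] shows how many variables can be included before crossing the flow count suggested above.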
Usage
plot_condensation(df, first = NULL)
Value
A ggplot2 plot showing the condensation in percent and the number of flows
for each step.
Arguments
df
dataframe
first
unquoted expression or string denoting the first variable to be
picked for condensation. Default: NULL
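Examples
A minimal usage sketch based on the signature shown under Usage. It assumes the function is provided by the easyalluvial package, which this page does not state, and uses the built-in mtcars data as a stand-in for a real dataframe; the column disp is just an example of a variable of interest.

```r
# Assumption: plot_condensation() comes from the easyalluvial package.
if (requireNamespace("easyalluvial", quietly = TRUE)) {
  library(easyalluvial)

  # Let the function pick the starting pair itself.
  p <- plot_condensation(mtcars)
  print(p)

  # Start from a variable of interest, given as a string ...
  p_disp <- plot_condensation(mtcars, first = "disp")
  print(p_disp)

  # ... or as an unquoted expression.
  p_disp2 <- plot_condensation(mtcars, first = disp)
}
```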