dlookr (version 0.5.0)

plot_correlate: Visualize correlation plot of numerical data

Description

plot_correlate() draws a correlation plot to help find relationships between pairs of numerical variables.

Usage

plot_correlate(.data, ...)

# S3 method for data.frame
plot_correlate(.data, ..., method = c("pearson", "kendall", "spearman"))

Arguments

.data

a data.frame or a tbl_df.

...

one or more unquoted expressions separated by commas. You can treat variable names as if they were positions. Positive values select variables; negative values drop variables. If the first expression is negative, plot_correlate() automatically starts with all variables. These arguments are automatically quoted and evaluated in a context where column names represent column positions. They support unquoting and splicing.

See vignette("EDA") for an introduction to these concepts.
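The selection rules above resemble those of dplyr::select(). The sketch below uses dplyr::select() on the built-in mtcars data purely as an analogy; that plot_correlate() resolves its ... argument in the same way is an assumption, and the object names (by_name, dropped, by_position) are illustrative only.

```r
library(dplyr)

# Positive (named) selection keeps only the listed columns
by_name <- select(mtcars, mpg, wt)
names(by_name)

# A leading negative starts from all columns and drops the listed ones
dropped <- select(mtcars, -mpg, -wt)
names(dropped)

# A number is treated as a column position
by_position <- select(mtcars, 1)
names(by_position)
```

If the analogy holds, plot_correlate(heartfailure, -creatinine, -sodium) would work on every numerical variable except creatinine and sodium.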

method

a character string indicating which correlation coefficient (or covariance) is to be computed. One of "pearson" (default), "kendall", or "spearman": can be abbreviated.
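The three method names match those accepted by stats::cor(). As a minimal sketch, the code below computes each coefficient for one pair of variables with base R on the built-in mtcars data. It illustrates what the coefficients are, not how dlookr computes them internally. The use of match.arg() to show abbreviation is likewise an assumption about a common R idiom, not a statement about the package's source.

```r
# Base R only; mtcars ships with R.
x <- mtcars$mpg
y <- mtcars$wt

methods <- c("pearson", "kendall", "spearman")

# One coefficient per method for the same pair of variables
coefs <- sapply(methods, function(m) cor(x, y, method = m))
print(round(coefs, 3))

# Partial matching of an abbreviated method name
resolved <- match.arg("spear", methods)
print(resolved)
```

Pearson measures linear association, while Kendall and Spearman are rank-based, so the rank-based options are usually the safer choice for skewed variables or monotonic but non-linear relationships.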

Details

The purpose of the visualization is to provide correlation information. Because a plot is drawn for each variable, specifying more than one variable in the ... argument draws that many plots.

See Also

plot_correlate.tbl_dbi, plot_outlier.data.frame.

Examples

# NOT RUN {
# Visualize correlation plot of all numerical variables
plot_correlate(heartfailure)

# Select the variables to compute
plot_correlate(heartfailure, creatinine, sodium)
plot_correlate(heartfailure, -creatinine, -sodium)
plot_correlate(heartfailure, "creatinine", "sodium")
plot_correlate(heartfailure, 1)
plot_correlate(heartfailure, creatinine, sodium, method = "spearman")

# Using dplyr::grouped_df
library(dplyr)

gdata <- group_by(heartfailure, smoking, death_event)
plot_correlate(gdata, "creatinine")
plot_correlate(gdata)

# Using pipes ---------------------------------
# Visualize correlation plot of all numerical variables
heartfailure %>%
  plot_correlate()
# Positive values select variables
heartfailure %>%
  plot_correlate(creatinine, sodium)
# Negative values to drop variables
heartfailure %>%
  plot_correlate(-creatinine, -sodium)
# Positive positions select variables
heartfailure %>%
  plot_correlate(1)
# Negative positions drop variables
heartfailure %>%
  plot_correlate(-1, -3, -5, -7)

# Using pipes & dplyr -------------------------
# Visualize correlation plot of 'creatinine' variable by 'smoking'
# and 'death_event' variables.
heartfailure %>%
  group_by(smoking, death_event) %>%
  plot_correlate(creatinine)

# Keep only the rows where the 'smoking' variable is "Yes",
# and visualize correlation plot of 'creatinine' variable by 'hblood_pressure'
# and 'death_event' variables.
heartfailure %>%
  filter(smoking == "Yes") %>%
  group_by(hblood_pressure, death_event) %>%
  plot_correlate(creatinine)
 
# }