
Laurae (version 0.0.0.9001)

partial_dep.plot: Partial Dependency, plotting function

Description

Helper function that draws partial dependency plots from a provided grid, using the specified plotting backend.

Usage

partial_dep.plot(grid_data, backend = "tableplot", label_name = "Target",
  comparator_name = "Evolution", ...)

Arguments

grid_data
Type: data.table. A partial_dep grid_exp output.
backend
Type: character. The plotting backend to use; a character vector such as c("ggplot", "boxplot") when working from multiple observations. See Details for the available backends. Defaults to "tableplot".
label_name
Type: character. The label column name in grid_data when using tableplot or lattice.
comparator_name
Type: character. The comparator column name in grid_data when using tableplot or lattice.
...
Other arguments to pass to the plotting backend function.

Value

A plot with the requested backend.

Details

The available plotting backends depend on what you are plotting, that is, on which function produced the grid. If you are using single observations (partial_dep.obs), the following backends are available:
"tableplot"
Tableplot is the best choice for plotting any type of large data. It is the default and the most appropriate backend for 99% of cases; even with millions of data points it remains very fast. Use this unless you have a rational reason to use something else.
"car"
Car is the best choice for in-depth analysis of the output in a scatter plot matrix, but it is extremely slow. Not recommended for more than 10k observations and 5 columns.
"lattice"
Lattice is used with parallel plots. Not recommended for more than 50k observations and 5 columns.
"ggplot"
ggplot is used for scatter plot matrix and correlation measurement. Not recommended for more than 20k observations and 5 columns.
"plotly"
Combines ggplot with Plotly for interactive graphics. Not recommended for more than 1k observations and 5 columns.
"base"
Base ships with R and thus is simple, but is slow. Not recommended for more than 50k observations and 5 columns.
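As a rough illustration of the size recommendations above, the sketch below records each backend's stated observation limit and checks a grid's dimensions against it. The names backend_row_limit and within_recommendation are hypothetical and not part of Laurae. Treating a backend as outside its recommendation when either the observation limit or the 5-column limit is exceeded is an assumption about how to read the guidance.

```r
# Hypothetical lookup (not part of Laurae): observation limits copied from the
# backend descriptions above. "tableplot" has no stated limit.
backend_row_limit <- c(
  tableplot = Inf,
  car       = 10000,
  lattice   = 50000,
  ggplot    = 20000,
  plotly    = 1000,
  base      = 50000
)

# Hypothetical helper: TRUE when a grid with n_obs rows and n_cols columns
# stays within the recommendation stated for `backend`.
# Assumption: exceeding EITHER the row limit OR 5 columns counts as
# "not recommended" for every backend except tableplot.
within_recommendation <- function(backend, n_obs, n_cols) {
  stopifnot(backend %in% names(backend_row_limit))
  if (backend == "tableplot") {
    return(TRUE)
  }
  n_obs <= backend_row_limit[[backend]] && n_cols <= 5
}

within_recommendation("plotly", n_obs = 5000, n_cols = 4)     # FALSE
within_recommendation("lattice", n_obs = 5000, n_cols = 4)    # TRUE
within_recommendation("tableplot", n_obs = 2e6, n_cols = 30)  # TRUE
```

A check like this could run before calling partial_dep.plot, falling back to "tableplot" when the requested backend is outside its recommended range.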
If you are using multiple observations (partial_dep.obs_all), the following backends are available:
c("tableplot")
Tableplot can be used to output the plot, but it is clearly not the recommended way of doing it. It is the default in case you have too many points.
c("ggplot", "boxplot")
ggplot is used to draw boxplots to check the distribution, grouped by feature.
c("ggplot", "point")
ggplot is used to draw points and check for evolution, grouped both by feature and evolution.
c("ggplot", "line")
ggplot is used to draw points and lines and check for evolution, grouped both by feature and evolution.
c("ggplot", "line2")
ggplot is used to draw points and lines and check for evolution, grouped both by feature and evolution, protected against the "2 point only" error.
c("plotly", "boxplot")
plotly + ggplot is used to draw boxplots to check the distribution, grouped by feature, interactively.
c("plotly", "point")
plotly + ggplot is used to draw points and check for evolution, grouped both by feature and evolution, interactively.
c("plotly", "line")
plotly + ggplot is used to draw points and lines and check for evolution, grouped both by feature and evolution, interactively.
c("plotly", "line2")
plotly + ggplot is used to draw points and lines and check for evolution, grouped both by feature and evolution, interactively, protected against the "2 point only" error.
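The multiple-observation backends listed above are character vectors rather than single strings. As a minimal sketch, the code below enumerates them and checks whether a candidate value is one of them. The names multi_obs_backends and is_multi_obs_backend are hypothetical and not part of Laurae; how partial_dep.plot itself validates the backend argument is not documented here.

```r
# Backend vectors listed above for multiple observations (partial_dep.obs_all).
multi_obs_backends <- c(
  list("tableplot"),
  # Every engine paired with every plot type: 2 x 4 = 8 combinations.
  unlist(
    lapply(c("ggplot", "plotly"), function(engine) {
      lapply(c("boxplot", "point", "line", "line2"), function(type) {
        c(engine, type)
      })
    }),
    recursive = FALSE
  )
)

# Hypothetical check (not part of Laurae): TRUE when `backend` exactly
# matches one of the vectors listed above.
is_multi_obs_backend <- function(backend) {
  any(vapply(multi_obs_backends, identical, logical(1), backend))
}

length(multi_obs_backends)                   # 9
is_multi_obs_backend(c("plotly", "line2"))   # TRUE
is_multi_obs_backend("car")                  # FALSE
```

A vector that passes this check is the kind of value passed as the backend argument, for example backend = c("ggplot", "boxplot").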

Examples

## Not run: ------------------------------------
# # Train your supervised machine learning model
# # ...
# 
# # Prepare partial dependence content
# my_grid <- partial_dep.obs(...)
# 
# # Plot partial dependence content
# partial_dep.plot(my_grid$grid_exp, backend = "tableplot")
## ---------------------------------------------

