explore

Simplifies Exploratory Data Analysis:

  • Interactive data exploration: explore()
  • Use AI models (xgboost, random forest, logistic regression, decision tree) to uncover hidden patterns in your data: explain_*()
  • Generate an automated report of your data (or patterns in your data): report()
  • Manual exploration: explore(), describe(), explain_*(), abtest(), ...
  • 18 ready-to-use datasets for teaching & testing: use_data_*(), create_data_*()

# install from CRAN
install.packages("explore")
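
The explain_*() and report() helpers from the feature list can be tried on one of the bundled datasets. The following is a minimal sketch, assuming explore and its dependencies are installed and that pandoc is available for rendering the report. The choice of the iris data with Species as target, and the output file name, are illustrative assumptions rather than anything prescribed by the package.

```r
library(explore)

# Load one of the bundled datasets
iris_data <- use_data_iris()

# Explain the target with a simple decision tree
iris_data |> explain_tree(target = Species)

# Write an automated HTML report into a temporary directory
iris_data |> report(output_file = "report.html", output_dir = tempdir())
```

Writing to tempdir() keeps the working directory clean while experimenting; point output_dir at a project folder to keep the report.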

Examples

# interactive data exploration
library(explore)
beer <- use_data_beer()
beer |> explore()

# describe data
beer |> describe()
# A tibble: 11 × 8
   variable          type     na na_pct unique    min    mean    max
   <chr>             <chr> <int>  <dbl>  <int>  <dbl>   <dbl>  <dbl>
 1 name              chr       0    0      161   NA     NA      NA  
 2 brand             chr       0    0       29   NA     NA      NA  
 3 country           chr       0    0        3   NA     NA      NA  
 4 year              dbl       0    0        1 2023   2023    2023  
 5 type              chr       0    0        3   NA     NA      NA  
 6 color_dark        dbl       0    0        2    0      0.09    1  
 7 alcohol_vol_pct   dbl       2    1.2     35    0      4.32    8.4
 8 original_wort     dbl       5    3.1     54    5.1   11.3    18.3
 9 energy_kcal_100ml dbl      11    6.8     34   20     39.9    62  
10 carb_g_100ml      dbl      16    9.9     44    1.5    3.53    6.7
11 sugar_g_100ml     dbl      16    9.9     26    0      0.72    4.6

# explore data manually
beer |> explore(type)
beer |> explore(energy_kcal_100ml)
beer |> explore(energy_kcal_100ml, target = type)
beer |> explore(alcohol_vol_pct, energy_kcal_100ml, target = type)

# explore manually with color and interactive
beer |> 
  explore(sugar_g_100ml, color = "gold") |> 
  interact()
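
The na, na_pct and unique columns in the describe() output above can be reproduced by hand, which may help when reading the table. The following is a minimal base R sketch of that arithmetic, not the package's implementation: the helper name describe_sketch and the toy data frame are made up for illustration, and whether the package counts NA as a distinct value is not verified here (this sketch does).

```r
# Hypothetical helper: a base R approximation of some describe() columns.
describe_sketch <- function(data) {
  data.frame(
    variable = names(data),
    # number of missing values per column
    na       = vapply(data, function(x) sum(is.na(x)), integer(1)),
    # share of missing values in percent, rounded to one decimal
    na_pct   = vapply(data, function(x) round(100 * mean(is.na(x)), 1), numeric(1)),
    # number of distinct values (NA counts as one value here)
    unique   = vapply(data, function(x) length(unique(x)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Made-up example data (not the beer dataset)
toy <- data.frame(
  brand   = c("a", "b", "b", "c"),
  alcohol = c(4.5, NA, 5.2, 0)
)

describe_sketch(toy)
```

In the toy data, alcohol has one missing value out of four rows, so its na_pct is 25.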

Monthly Downloads

2,507

Version

1.3.4

License

MIT + file LICENSE

Maintainer

Roland Krasser

Last Published

March 30th, 2025

Functions in explore (1.3.4)

create_data_buy

Create data buy
create_data_empty

Create an empty dataset
clean_var

Clean variable
decrypt

Decrypt text
create_data_churn

Create data churn
count_pct

Adds percentage to dplyr::count()
describe

Describe a dataset or variable
cut_vec_num_avg

Cut a variable
create_data_unfair

Create data unfair
describe_num

Describe numerical variable
create_notebook_explore

Generate a notebook
describe_tbl

Describe table
create_data_abtest

Create data of A/B testing
drop_var_not_numeric

Drop all non-numeric variables
explain_forest

Explain a target using Random Forest.
explain_logreg

Explain a binary target using a logistic regression (glm). Model chosen by AIC in a Stepwise Algorithm (MASS::stepAIC()).
drop_var_no_variance

Drop all variables with no variance
create_data_newsletter

Create data newsletter
create_data_esoteric

Create data esoteric
drop_obs_with_na

Drop all observations with NA-values
drop_obs_if

Drop all observations where expression is true
explore-package

explore: Simplifies Exploratory Data Analysis
data_dict_md

Create a data dictionary Markdown file
describe_cat

Describe categorical variable
describe_all

Describe all variables of a dataset
explore

Explore a dataset or variable
create_data_person

Create data person
explore_shiny

Explore a dataset interactively
log_info_if

Log conditional
explore_targetpct

Explore variable + binary target (values 0/1)
explore_bar

Explore categorical variable using bar charts
balance_target

Balance target variable
explore_all

Explore all variables
explore_col

Explore data without aggregation (label + value)
explore_cor

Explore the correlation between two variables
format_target

Format target
mix_color

Mix colors
rescale01

Rescales a numeric variable into values between 0 and 1
format_type

Format type description
explore_count

Explore count data (categories + frequency)
use_data_mpg

Use the mpg data set
drop_var_with_na

Drop all variables with NA-values
show_color

Show color vector as ggplot
use_data_iris

Use the iris flower data set
drop_var_by_names

Drop variables by name
drop_var_low_variance

Drop all variables with low variance
create_data_random

Create data random
explain_tree

Explain a target using a simple decision tree (classification or regression)
use_data_mtcars

Use the mtcars data set
plot_legend_targetpct

Plots a legend that can be used for explore_all with a binary target
encrypt

Encrypt text
use_data_diamonds

Use the diamonds data set
use_data_beer

Use the beer data set
plot_text

Plot a text
use_data_penguins

Use the penguins data set
format_num_kMB

Format number as character string (kMB)
format_num_auto

Format number as character string (auto)
explain_xgboost

Explain a binary target using xgboost
get_var_buckets

Put variables into "buckets" to create a set of plots instead of one large plot
explore_density

Explore density of variable
get_type

Return type of variable
explore_tbl

Explore table
simplify_text

Simplifies a text string
target_explore_cat

Explore categorical variable + target
format_num_space

Format number as character string (space as big.mark)
get_nrow

Get number of rows for a grid plot
get_color

Get predefined colors
plot_var_info

Plot a variable info
predict_target

Predict target using a trained model.
use_data_titanic

Use the titanic data set
use_data_starwars

Use the starwars data set
guess_cat_num

Return if variable is categorical or numerical
use_data_wordle

Use the wordle data set
replace_na_with

Replace NA
interact

Make an explore plot interactive
target_explore_num

Explore numerical variable + target
report

Generate a report of all variables
total_fig_height

Get fig.height for an RMarkdown chunk using explore_all()
weight_target

Weight target variable
yyyymm_calc

Calculate with periods (format yyyymm)
add_var_random_int

Add a random integer variable to dataset
add_var_random_01

Add a random 0/1 variable to dataset
add_var_random_cat

Add a random categorical variable to dataset
add_var_id

Add a variable id at first column in dataset
abtest

A/B testing
abtest_shiny

Interactive A/B testing
add_var_random_moon

Add a random moon variable to dataset
abtest_targetpct

A/B testing comparing percent per group
add_var_random_dbl

Add a random double variable to dataset
abtest_targetnum

A/B testing comparing two means
add_var_random_starsign

Add a random starsign variable to dataset
create_data_app

Create data app
check_vec_low_variance

Check vector for low variance
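
As an illustration of one entry in the list above, rescale01 is described as rescaling a numeric variable into values between 0 and 1. A common way to do this is min-max scaling, sketched below in base R. This is an assumption about the general technique, not the package's source code, and the helper name rescale01_sketch is hypothetical.

```r
# Hypothetical helper illustrating min-max scaling to the range [0, 1].
rescale01_sketch <- function(x) {
  rng <- range(x, na.rm = TRUE)      # smallest and largest observed value
  (x - rng[1]) / (rng[2] - rng[1])   # shift to 0, then divide by the span
}

rescale01_sketch(c(2, 4, 6, 10))
# → 0.00 0.25 0.50 1.00
```

Note that a constant vector has a span of zero, so this sketch returns NaN for it; real code would need to decide how to handle that case.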