sherlock

The {sherlock} R package provides graphical displays and statistical tools to aid structured problem solving and diagnosis. Its functions are especially useful for applying the process of elimination as a problem diagnosis technique. {sherlock} was designed to work seamlessly with the tidyverse set of packages.

More specifically, {sherlock} features functionality to

  • create a project folder/sub-folder structure and .Rproj file for your problem solving project with one function call
  • read in tabular data from various sources
  • facilitate reading in and cleaning many files all at once (for example raw data from data loggers or sensors)
  • create powerful and highly customizable visual displays, some of which are not available even in popular statistical packages
  • use a custom {sherlock} ggplot2 theme, which offers a clean-looking visual appearance
  • use sample datasets to test plotting functions
  • save data and plots into an Excel file
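
The "reading in and cleaning many files all at once" item can be pictured with a small base R sketch. This is not {sherlock}'s own implementation or API: the file names, the folder and the clean_one() helper are hypothetical and exist only for illustration. It writes three small example files, cleans each one as it is read, and stacks the results into a single data frame.

```r
# Base R sketch of the "read many files, clean each, combine" pattern.
# Assumption: this is NOT sherlock's load_files(); the file names and
# clean_one() are hypothetical and used only for illustration.

# Create three small example logger files in a temporary folder
data_dir <- file.path(tempdir(), "logger_files")
dir.create(data_dir, showWarnings = FALSE)
for (unit in 1:3) {
  raw <- data.frame(time = 1:4, Force = c(10.1, NA, 10.4, 10.2) + unit)
  write.csv(raw, file.path(data_dir, sprintf("unit_%d.csv", unit)),
            row.names = FALSE)
}

# Hypothetical cleaning step: lower-case the column names and
# drop rows that contain missing values
clean_one <- function(df) {
  names(df) <- tolower(names(df))
  df[complete.cases(df), , drop = FALSE]
}

# Read one file, clean it, and record which file each row came from
read_one <- function(path) {
  df <- clean_one(read.csv(path))
  df$source_file <- basename(path)
  df
}

# Apply the same steps to every matching file and stack the results
files <- list.files(data_dir, pattern = "^unit_[0-9]+\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, read_one))

head(combined)
```

Because every file passes through the same cleaning function before being combined, the files need to share the same variables, which is the situation the list item describes.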

“That is to say, nature’s laws are causal; they reveal themselves by comparison and difference, and they operate at every multi-variate space-time point” - Edward Tufte

I would love to hear your feedback on sherlock. You can report issues and bugs, or request new features, in the package's GitHub repository (gaborszabo11/sherlock).

sherlock 0.6.0 is now released. In addition to fixing a few bugs and making enhancements to already-existing functionality, new plotting, statistical analysis and helper functions have been added, such as:

  • plot_tukey_duckworth_test() and plot_tukey_duckworth_paired_test(): New functions that plot and run the Tukey-Duckworth test, a statistical test used in problem diagnosis.
  • select_low_high_units() and select_low_high_units_manual(): Automatically or manually select low-high units in a tibble and assign them to groups.
  • load_files(): Reads in and cleans multiple files. It is particularly useful when the files share the same variables, for example when data from an experiment was logged and saved separately for each individual unit. A custom data cleaning function can be integrated.
  • create_project_folder(): A helper function to quickly create a project folder with a clever sub-folder structure for your project.
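
For readers unfamiliar with the Tukey-Duckworth test mentioned in the list above, the following base R sketch computes its test statistic, the "end count", for two independent samples. It is a simplified illustration and not the code behind plot_tukey_duckworth_test(): the end_count() function and the two example vectors are hypothetical, and ties between the samples are not given any special treatment. The thresholds in the comments (7, 10 and 13) are the commonly quoted critical values for Tukey's quick test with samples of similar size.

```r
# Simplified sketch of the Tukey-Duckworth (Tukey quick test) end count.
# Assumption: end_count() and the data below are hypothetical and are not
# part of sherlock; ties between the two samples are not handled specially.
end_count <- function(x, y) {
  # The test applies only when one sample holds the overall maximum
  # and the other sample holds the overall minimum
  if (max(x) > max(y) && min(x) > min(y)) {
    high <- x
    low  <- y
  } else if (max(y) > max(x) && min(y) > min(x)) {
    high <- y
    low  <- x
  } else {
    return(0)  # one sample spans the other, so the end count is zero
  }
  # Values of the high sample above everything in the low sample, plus
  # values of the low sample below everything in the high sample
  sum(high > max(low)) + sum(low < min(high))
}

# Hypothetical measurements from a group of low and a group of high units
low_units  <- c(10.1, 10.3, 10.2, 10.4, 10.0, 10.6)
high_units <- c(10.5, 10.9, 11.0, 10.8, 11.2, 10.7)

ec <- end_count(low_units, high_units)
ec

# Commonly quoted two-sided critical values for samples of similar size:
# end count >= 7 (5% level), >= 10 (1% level), >= 13 (0.1% level)
ec >= 7
```

In this example five high-unit values exceed the largest low-unit value and five low-unit values fall below the smallest high-unit value, giving an end count of 10.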

Installation

sherlock is available on CRAN and can be installed with:

install.packages("sherlock")

You can also install the development version of sherlock from GitHub with:

# install.packages("devtools")
devtools::install_github("gaborszabo11/sherlock")

Functions

Plotting functions

draw_multivari_plot()

draw_categorical_scatterplot()

draw_youden_plot()

draw_small_multiples_line_plot()

draw_cartesian_small_multiples()

draw_polar_small_multiples()

draw_interaction_plot()

draw_pareto_chart()

draw_process_behavior_chart()

draw_timeseries_scatterplot()

plot_tukey_duckworth_test()

plot_tukey_duckworth_paired_test()

Helper functions

load_file()

load_files()

create_project_folder()

save_analysis()

normalize_observations()

theme_sherlock()

scale_color_sherlock()

scale_fill_sherlock()

draw_horizontal_reference_line()

draw_vertical_reference_line()

select_low_high_units()

select_low_high_units_manual()

Examples

Here are a few examples:

# Loading libraries
library(sherlock)
library(ggh4x)
#> Loading required package: ggplot2
#> Warning: package 'ggplot2' was built under R version 4.2.2

# Multi-vari plot with three grouping variables
multi_vari_data %>% 
  draw_multivari_plot(y_var = force, 
                      grouping_var_1 = cycle, 
                      grouping_var_2 = fixture, 
                      grouping_var_3 = line)

# Multi-vari plot with two grouping variables and group means
library(sherlock)
library(ggh4x)

multi_vari_data_2 %>% 
  draw_multivari_plot(y_var = Length, 
                      grouping_var_1 = Part, 
                      grouping_var_2 = Operator, 
                      plot_means     = TRUE)

# Polar small multiples plot for two mold cavities
library(sherlock)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

polar_small_multiples_data %>% 
  filter(Mold_Cavity_Number %in% c(4, 6)) %>% 
  rename(Radius = "ID_2") %>% 
  draw_polar_small_multiples(angular_axis   = ID_Measurement_Angle, 
                             x_y_coord_axis = Radius, 
                             grouping_var   = Tip_Bottom, 
                             faceting_var_1 = Mold_Cavity_Number,
                             point_size     = 0.5, 
                             connect_with_lines = TRUE, 
                             label_text_size = 7) +
  scale_y_continuous(limits = c(0.09, 0.115))
#> Scale for y is already present.
#> Adding another scale for y, which will replace the existing scale.

# Multi-vari plot of normalized observations with a horizontal reference line at 0
library(sherlock)
library(dplyr)
library(ggh4x)

polar_small_multiples_data %>%
  filter(ID_Measurement_Angle %in% c(0, 45, 90, 135)) %>%
  normalize_observations(y_var        = ID,
                         grouping_var = Tip_Bottom,
                         ref_values   = c(0.2075, 0.2225)) %>%
  draw_multivari_plot(y_var             = ID_normalized,
                      grouping_var_1    = ID_Measurement_Angle,
                      grouping_var_2    = Mold_Cavity_Number,
                      grouping_var_3    = Tip_Bottom,
                      x_axis_text = 6) +
  draw_horizontal_reference_line(reference_line = 0)
#> Joining, by = "Tip_Bottom"

# Youden plot with a median line
youden_plot_data_2 %>% 
  draw_youden_plot(x_axis_var  = gage_1, 
                   y_axis_var  = gage_2, 
                   median_line = TRUE)
#> Smoothing formula not specified. Using: y ~ x

# Youden plot with a grouping variable and custom axis labels
youden_plot_data %>% 
  draw_youden_plot(x_axis_var   = measurement_1, 
                   y_axis_var   = measurement_2, 
                   grouping_var = location, 
                   x_axis_label = "Trial 1", 
                   y_axis_label = "Trial 2")

# Timeseries scatterplot with faceting and limits
timeseries_scatterplot_data %>%
  draw_timeseries_scatterplot(y_var = y, 
                              grouping_var_1 = date, 
                              grouping_var_2 = cavity, 
                              faceting       = TRUE, 
                              limits         = TRUE, 
                              alpha          = 0.15,
                              line_size      = 0.5, 
                              x_axis_text    = 7,
                              interactive    = FALSE)
#> Joining, by = c("date", "cavity")
#> Warning: Removed 6 rows containing missing values (`geom_point()`).

References

Diagnosing Performance and Reliability, David Hartshorne and The New Science of Fixing Things, 2019

Statistical Engineering - An Algorithm for Reducing Variation in Manufacturing Processes, Stefan H. Steiner and Jock MacKay, 2005

Package details

Version: 0.7.0
License: MIT + file LICENSE
Maintainer: Gabor Szabo
Last Published: June 11th, 2023

Functions in sherlock (0.7.0)

  • create_project_folder: Create Project Folder
  • draw_process_behavior_chart: Draw Process Behavior Chart
  • scale_color_sherlock: Sherlock Color Palettes
  • load_files: Load Files
  • multi_vari_data: Multivari Plot Sample Dataset 1
  • draw_pareto_chart: Draw Pareto Chart
  • save_analysis: Save Analysis
  • load_file: Load File
  • draw_youden_plot: Draw Youden Plot
  • plot_tukey_duckworth_paired_test: Plot Tukey-Duckworth Paired Test
  • draw_interaction_plot: Draw Interaction Plot
  • draw_multivari_plot: Draw Multivari Plot
  • scale_fill_sherlock: Sherlock Fill Color Palettes
  • draw_categorical_scatterplot: Draw Categorical Scatter Plot
  • draw_horizontal_reference_line: Draw horizontal reference line
  • draw_small_multiples_line_plot: Draw Small Multiples Line Plot
  • %>%: Pipe operator
  • plot_tukey_duckworth_test: Plot Tukey-Duckworth Test
  • polar_small_multiples_data: Polar Small Multiples Sample Dataset
  • select_low_high_units: Select Low-High Units
  • normalize_observations: Normalize observations
  • select_low_high_units_manual: Select Low-High Units
  • multi_vari_data_2: Multi-Vari Plot Sample Dataset 2
  • theme_sherlock: Theme Sherlock
  • small_multiples_data: Small Multiples Sample Dataset
  • timeseries_scatterplot_data: Timeseries Scatterplot Sample Dataset
  • youden_plot_data: Youden Plot Sample Dataset
  • tidyeval: Tidy eval helpers
  • youden_plot_data_2: Youden Plot Sample Dataset 2
  • draw_cartesian_small_multiples: Draw Cartesian Small Multiples Plot
  • draw_multivari_plot_count: Draw Multivari Plot for Counts
  • draw_pareto_chart_grouped: Draw Grouped Pareto Chart
  • draw_polar_small_multiples: Draw Polar Small Multiples
  • draw_vertical_reference_line: Draw vertical reference line
  • draw_timeseries_scatterplot: Draw Timeseries Scatterplot