janitor (version 2.1.0)

get_dupes: Get rows of a data.frame with identical values for the specified variables.

Description

Use this function to hunt for duplicate records during data cleaning. Specify the data.frame and the combination of variables to check, and it returns the rows whose values for those variables are duplicated.

Usage

get_dupes(dat, ...)

Value

Returns a data.frame with the full records where the specified variables have duplicated values, as well as a variable dupe_count showing the number of rows sharing that combination of duplicated values. If the input data.frame was of class tbl_df, the output is as well.
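
To make the shape of the return value concrete, here is a minimal base-R sketch of the same idea. It is an illustration only, not janitor's implementation: the helper name find_dupes_sketch and its character-vector vars argument are hypothetical, and the column order and sorting are assumptions chosen for readability.

```r
# Hypothetical helper, for illustration only (not part of janitor).
# `vars` is a character vector of column names, unlike get_dupes(),
# which takes unquoted names.
find_dupes_sketch <- function(dat, vars) {
  # Build one key per row from the chosen columns
  key <- do.call(paste, c(unname(as.list(dat[vars])), sep = "\r"))
  # For every row, count how many rows share its key
  counts <- ave(seq_along(key), key, FUN = length)
  # Put the key columns first, then dupe_count, then the remaining columns
  out <- cbind(
    dat[vars],
    dupe_count = counts,
    dat[setdiff(names(dat), vars)]
  )
  # Keep only the rows whose key occurs more than once
  out <- out[counts > 1, , drop = FALSE]
  # Sort by the key columns so duplicates sit next to each other
  out[do.call(order, unname(as.list(out[vars]))), , drop = FALSE]
}

df <- data.frame(
  id  = c(1, 2, 2, 3, 3, 3),
  grp = c("a", "b", "b", "c", "c", "d"),
  val = 1:6
)
res <- find_dupes_sketch(df, c("id", "grp"))
res
```

In this toy data the combinations (2, "b") and (3, "c") each appear twice, so four full rows come back, each with a dupe_count of 2. The rows with (1, "a") and (3, "d") are unique and are dropped.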

Arguments

dat

The input data.frame.

...

Unquoted names of the variables to check for duplicates. This accepts a tidyselect specification. If no variables are given, all columns are used.

Examples

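library(janitor)
library(dplyr)  # supplies %>% and starts_with()

# Rows of mtcars that share the same mpg and hp values:
get_dupes(mtcars, mpg, hp)

# Or call it with the magrittr pipe %>%:
mtcars %>% get_dupes(wt)

# You can use tidyselect helpers to specify variables:
mtcars %>% get_dupes(-c(wt, qsec))
mtcars %>% get_dupes(starts_with("cy"))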
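
The mtcars calls above give results that are hard to check by eye, so a small hand-made data frame may be easier to follow. The orders data below is invented purely for illustration; only get_dupes() and the dupe_count column come from the package.

```r
library(janitor)

# Made-up example data: order 102 is entered twice and order 104 three times
orders <- data.frame(
  order_id = c(101, 102, 102, 103, 104, 104, 104),
  item     = c("pen", "ink", "ink", "pad", "cap", "cap", "cap"),
  qty      = c(1, 2, 2, 5, 1, 1, 3)
)

# Full records whose order_id/item combination occurs more than once
dupes <- get_dupes(orders, order_id, item)

# Inspect the key columns together with the duplicate counts
dupes[, c("order_id", "item", "dupe_count")]
```

Because qty is not among the specified variables, the third row for order 104 still counts as a duplicate even though its quantity differs. Adding qty to the call would narrow the match to fully identical records.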