
rchemo (version 0.1-3)

checkdupl: Duplicated rows in datasets

Description

Finding and removing duplicated row observations in datasets.

Usage

checkdupl(X, Y = NULL, digits = NULL)

Value

A data frame giving the row numbers of the identical rows in the first and second datasets, together with the values of the variables.

Arguments

X

A dataset.

Y

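A dataset compared to X. Defaults to NULL, in which case the rows of X are compared with each other (see the checkdupl(X1) call in the examples).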

digits

The number of digits used to round the data before the duplication test. Defaults to NULL (no rounding).
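
To illustrate why rounding before the duplication test matters, here is a minimal sketch in base R. It uses round() and duplicated() as stand-ins and is not taken from the checkdupl source; the matrix X is made-up data, and how checkdupl applies digits internally is an assumption here.

```r
# Made-up data: rows 1 and 2 differ only beyond the third decimal place
X <- rbind(c(1.0001, 2.0002),
           c(1.0004, 2.0001),
           c(3, 4))

# Without rounding, no row is flagged as a duplicate of an earlier row
dup_raw <- duplicated(X)

# After rounding to 3 digits, row 2 becomes identical to row 1
dup_rounded <- duplicated(round(X, digits = 3))

dup_raw
dup_rounded
```

With measured data, a small digits value can therefore flag near-identical rows that an exact comparison would miss.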

Examples


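## Two small datasets sharing some identical rows
X1 <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE)
dimnames(X1) <- list(1:3, c("v1", "v2", "v3", "v4", "v5"))

X2 <- matrix(c(6:10, 1:5, c(1, 2, 7, 6, 12)), nrow = 3, byrow = TRUE)
dimnames(X2) <- list(1:3, c("v1", "v2", "v3", "v4", "v5"))

X1
X2

## Rows of X1 that also occur in X2 (rows 1 and 2 of X1 equal row 2 of X2)
checkdupl(X1, X2)

## Duplicated rows within X1 (rows 1 and 2 are identical)
checkdupl(X1)

## Random data: no duplicated rows are expected
checkdupl(matrix(rnorm(20), nrow = 5))

## Remove the duplicated rows from X1, using the row numbers in res$rownum2
res <- checkdupl(X1)
s <- unique(res$rownum2)
zX1 <- X1[-s, ]
zX1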
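
One caveat about the last step above: in R, negative indexing with an empty vector selects zero rows rather than all of them, so X1[-s, ] would return an empty matrix if no duplicates had been found. The sketch below guards against that; drop_rows is a hypothetical helper written for illustration, not a function from rchemo.

```r
X1 <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE)

# Hypothetical helper (not part of rchemo): drop the rows indexed by s,
# returning X unchanged when s is empty
drop_rows <- function(X, s) {
  if (length(s) > 0) X[-s, , drop = FALSE] else X
}

s_empty <- integer(0)

n_naive   <- nrow(X1[-s_empty, , drop = FALSE])  # naive indexing drops every row
n_guarded <- nrow(drop_rows(X1, s_empty))        # guarded version keeps all rows
n_dropped <- nrow(drop_rows(X1, 2L))             # removes row 2 only
```

The drop = FALSE argument keeps the result a matrix even when a single row remains.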
