mergeV: Verbose Merge

Description

A verbose wrapper for the merge function from the base package.

Usage

mergeV(x, y, by = intersect(names(x), names(y)), by.x = by, by.y = by, all = FALSE, all.x = all, all.y = all, verbose = TRUE, ...)

Arguments

x, y

data frames, or objects to be coerced to one.

by, by.x, by.y

specifications of the columns used for merging.

all

logical; all = L is shorthand for all.x = L and all.y = L, where L is either TRUE or FALSE.

all.x

logical; if TRUE, then extra rows will be added to the output, one for each row in x that has no matching row in y. These rows will have NAs in those columns that are usually filled with values from y. The default is FALSE, so that only rows with data from both x and y are included in the output.

all.y

logical; analogous to all.x.

verbose

logical; if TRUE (the default), information about the merge is printed. See 'Details'.

...

additional arguments passed to merge.
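
As a rough illustration of how by, all, all.x and all.y shape the result, the sketch below uses base merge only (the function that mergeV wraps), so it runs without mergeV installed. The data frames x and y are made up for this example and do not come from the package.

```r
# Toy inputs (hypothetical): ids 2 and 3 are present in both tables
x <- data.frame(id = c(1, 2, 3), a = c("p", "q", "r"))
y <- data.frame(id = c(2, 3, 4), b = c("u", "v", "w"))

inner <- merge(x, y, by = "id")                # only ids found in both x and y
left  <- merge(x, y, by = "id", all.x = TRUE)  # every row of x; b is NA for id 1
right <- merge(x, y, by = "id", all.y = TRUE)  # every row of y; a is NA for id 4
full  <- merge(x, y, by = "id", all = TRUE)    # same as all.x = TRUE, all.y = TRUE
cross <- merge(x, y, by = NULL)                # no by column: Cartesian product

sapply(list(inner = inner, left = left, right = right,
            full = full, cross = cross), nrow)
```

Since mergeV is documented as returning the same value as merge, the same calls with mergeV should give the same data frames, with the summary printed in addition.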

Value

Same value as merge.

Details

This function is just a wrapper for merge; its behavior and return value are the same.

Calculating the printed information is not computationally free; mergeV takes significantly longer to run than the non-verbose version. Cross joins have almost no overhead, and inner joins have less overhead than the other types of joins.

When verbose is FALSE, this computational overhead is removed.

The printed information has three parts. The first gives the number of rows in each input table, X and Y, that have matches, and the number of rows in the output, R. The second gives the type of join (inner, outer, left, right, or cross). The third appears only when there is a by variable, and gives the number of nXn matches on the by variables.
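
The three parts described above can be approximated in base R. This is a minimal sketch, not mergeV's actual implementation: merge_summary is a hypothetical helper written for this illustration, it handles a single key column, and it assumes the keys contain no NA values. The format mergeV really prints may differ.

```r
# Hypothetical helper (not part of any package): summarises a merge on one
# key column per table. Assumes the key columns contain no NA values.
merge_summary <- function(x, y, by.x, by.y = by.x,
                          all.x = FALSE, all.y = FALSE) {
  kx <- as.character(x[[by.x]])
  ky <- as.character(y[[by.y]])

  # Part 1: rows of each input that have a match, and rows in the result
  r <- merge(x, y, by.x = by.x, by.y = by.y, all.x = all.x, all.y = all.y)

  # Part 2: join type implied by all.x and all.y
  type <- if (all.x && all.y) {
    "outer"
  } else if (all.x) {
    "left"
  } else if (all.y) {
    "right"
  } else {
    "inner"
  }

  # Part 3: for each key present in both tables, how many rows carry it
  # on each side, tabulated as "<rows in x>x<rows in y>"
  common <- intersect(kx, ky)
  nx <- table(kx)[common]
  ny <- table(ky)[common]

  list(x_matched   = sum(kx %in% ky),
       y_matched   = sum(ky %in% kx),
       result_rows = nrow(r),
       type        = type,
       cardinality = table(paste0(nx, "x", ny)))
}

# Small made-up inputs in the spirit of the authors/books example
authors <- data.frame(surname = c("Tukey", "Venables", "Ripley"),
                      nationality = c("US", "Australia", "UK"))
books <- data.frame(name = c("Tukey", "Ripley", "Ripley", "R Core"),
                    title = c("Exploratory Data Analysis", "Spatial Statistics",
                              "Stochastic Simulation", "An Introduction to R"))

s <- merge_summary(authors, books, by.x = "surname", by.y = "name")
str(s)
```

Here two rows of authors and three rows of books have a match, and the inner join returns three rows: one one-to-one match (Tukey) and one one-to-two match (Ripley). Computing these counts requires extra passes over the key columns, which is consistent with the overhead noted above.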

Examples

# classical merge example
authors <- data.frame(
    surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
    nationality = c("US", "Australia", "US", "UK", "Australia"),
    deceased = c("yes", rep("no", 4)))
books <- data.frame(
    name = I(c("Tukey", "Venables", "Tierney",
             "Ripley", "Ripley", "McNeil", "R Core")),
    title = c("Exploratory Data Analysis",
              "Modern Applied Statistics ...",
              "LISP-STAT",
              "Spatial Statistics", "Stochastic Simulation",
              "Interactive Data Analysis",
              "An Introduction to R"),
    other.author = c(NA, "Ripley", NA, NA, NA, NA,
                     "Venables & Smith"))

mergeV(authors, books, by.x = "surname", by.y = "name")
mergeV(authors, books, by.x = "surname", by.y = "name", all = TRUE)