gower (version 1.0.1)

gower_topn: Find the top-n matches

Description

Find the top-n matches in y for each record in x.

Usage

gower_topn(
  x,
  y,
  pair_x = NULL,
  pair_y = NULL,
  n = 5,
  eps = 1e-08,
  weights = NULL,
  ignore_case = FALSE,
  nthread = getOption("gd_num_thread")
)

Value

A list with two array elements: index and distance. Both have dimensions n x nrow(x). The i-th column holds the top-n best matches in y for the i-th row of x: index contains row numbers of y, and distance contains the corresponding Gower distances. When there are no columns to compare, a message is printed and both distance and index will be empty matrices; the list is then returned invisibly.
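As a minimal sketch of the layout described above, the following inspects the returned list for three records matched against ten lookup rows. The variable names (res, best_rows) are illustrative, and the gower package is assumed to be installed.

```r
library(gower)

x <- iris[1:3, ]
lookup <- iris[1:10, ]

res <- gower_topn(x = x, y = lookup, n = 4)

# Both elements have n rows and nrow(x) columns.
dim(res$index)
dim(res$distance)

# Column 1 describes the matches for the first record of x:
# res$index[, 1] are row numbers in `lookup`,
# res$distance[, 1] are the corresponding distances.
best_rows <- lookup[res$index[, 1], ]
best_rows
```

Because the first three rows of x also occur in lookup, each record should find itself among its matches at distance zero.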

Arguments

x

[data.frame]

y

[data.frame]

pair_x

[numeric|character] (optional) Columns in x used for comparison. See Details below.

pair_y

[numeric|character] (optional) Columns in y used for comparison. See Details below.

n

The number of best matches (indices and distances) to return for each record in x.

eps

[numeric] (optional) Computed numbers (variable ranges) smaller than eps are treated as zero.

weights

[numeric] (optional) A vector of weights of length ncol(x) that defines the weight applied to each component of the gower distance.

ignore_case

[logical] Ignore case when matching column names. Applies only when neither pair_x nor pair_y is user-defined.

nthread

Number of threads to use for parallelization. By default, 2 threads are used on a dual-core machine; on any other machine, all but one core are used so that the machine does not freeze during a large computation. The maximum number of threads is determined using omp_get_max_threads at the C level.
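The optional arguments can be combined in one call. The sketch below is illustrative rather than authoritative: the renamed lookup columns and the weight values are hypothetical, and it assumes that pair_x[i] is compared with pair_y[i] when both are supplied.

```r
library(gower)

x <- iris[1:3, ]
lookup <- iris[1:10, ]

# Hypothetical scenario: the lookup table uses different column names,
# so columns cannot be paired by name.
names(lookup) <- c("sl", "sw", "pl", "pw", "sp")

res <- gower_topn(
  x, lookup,
  pair_x  = names(x),          # columns of x, in order
  pair_y  = names(lookup),     # columns of y assumed to be compared with them
  n       = 2,
  weights = c(2, 2, 1, 1, 1),  # illustrative weights, one per column of x
  nthread = 1                  # run on a single thread
)

dim(res$index)
```

Setting nthread = 1 can be useful when the call already runs inside a parallel job; the default is taken from getOption("gd_num_thread").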

See Also

gower_dist

Examples

# Find the 4 best matches for the first three iris records
# among the first ten iris records.
x <- iris[1:3, ]
lookup <- iris[1:10, ]
gower_topn(x = x, y = lookup, n = 4)
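
For downstream use, the two matrices can be flattened into one long table with a row per (record, match) pair. This is a sketch only; the data frame and its column names (matches, x_row, slot, y_row) are hypothetical and not part of the package.

```r
library(gower)

x <- iris[1:3, ]
lookup <- iris[1:10, ]
res <- gower_topn(x = x, y = lookup, n = 4)

n_matches <- nrow(res$index)

# as.vector() unrolls a matrix column by column, so the first n_matches
# entries belong to row 1 of x, the next n_matches to row 2, and so on.
matches <- data.frame(
  x_row    = rep(seq_len(nrow(x)), each = n_matches),
  slot     = rep(seq_len(n_matches), times = nrow(x)),
  y_row    = as.vector(res$index),
  distance = as.vector(res$distance)
)

matches
```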
