A fast and accurate algorithm for finding approximate k-nearest neighbors.
randomProjectionTreeSearch(x, K = 150, n_trees = 50,
tree_threshold = max(10, nrow(x)), max_iter = 1,
distance_method = "Euclidean", seed = NULL, threads = NULL,
verbose = getOption("verbose", TRUE))
# S3 method for matrix
randomProjectionTreeSearch(x, K = 150, n_trees = 50,
tree_threshold = max(10, nrow(x)), max_iter = 1,
distance_method = "Euclidean", seed = NULL, threads = NULL,
verbose = getOption("verbose", TRUE))
# S3 method for CsparseMatrix
randomProjectionTreeSearch(x, K = 150, n_trees = 50,
tree_threshold = max(10, nrow(x)), max_iter = 1,
distance_method = "Euclidean", seed = NULL, threads = NULL,
verbose = getOption("verbose", TRUE))
# S3 method for TsparseMatrix
randomProjectionTreeSearch(x, K = 150, n_trees = 50,
tree_threshold = max(10, nrow(x)), max_iter = 1,
distance_method = "Euclidean", seed = NULL, threads = NULL,
verbose = getOption("verbose", TRUE))
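A call might look like the sketch below. It assumes the function is exported by the largeVis package (the package name is not stated on this page) and uses made-up random data laid out as the arguments describe, with features in rows and examples in columns. The call is guarded so the snippet still runs when the package is not installed.

```r
# Toy input: 20 features (rows) x 500 examples (columns).
set.seed(1)
x <- matrix(rnorm(20 * 500), nrow = 20)

# Assumption: the function lives in the 'largeVis' package.
if (requireNamespace("largeVis", quietly = TRUE)) {
  neighbors <- largeVis::randomProjectionTreeSearch(
    x,
    K = 10,                            # neighbors sought per example
    n_trees = 20,                      # number of random projection trees
    tree_threshold = max(10, nrow(x)), # same expression as the documented default
    max_iter = 1,                      # neighborhood exploration iterations
    distance_method = "Euclidean",
    seed = 1,                          # a non-NULL seed limits threading in some phases
    verbose = FALSE
  )
  print(dim(neighbors))
}
```

The values of K and n_trees here are smaller than the defaults only to keep the toy run quick; they are not recommendations.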
A (potentially sparse) matrix, where examples are columns and features are rows.
How many nearest neighbors to seek for each node.
The number of trees to build.
The threshold for creating a new branch. The paper authors suggest using a value equivalent to the number of features in the input set.
Number of iterations in the neighborhood exploration phase.
One of "Euclidean" or "Cosine".
Random seed passed to the C++ functions. If seed is not NULL (NULL is the default), the maximum number of threads will be set to 1 in phases that would otherwise be non-deterministic.
The maximum number of threads to spawn. Determined automatically if NULL (the default).
Whether to print verbose logging using the progress package.
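To make the two distance_method options concrete, the base R sketch below computes both measures between columns of a small matrix. The helper names are hypothetical, and cosine distance is taken here as 1 minus cosine similarity, which is an assumption; the package's C++ code may define or scale it differently.

```r
# Hypothetical helpers for illustration; not part of the package.
euclidean_distance <- function(a, b) sqrt(sum((a - b)^2))

# Assumption: cosine distance = 1 - cosine similarity.
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Three examples stored as columns, two features each.
x <- cbind(c(1, 0), c(0, 1), c(2, 0))

d_euc <- euclidean_distance(x[, 1], x[, 3])  # same direction, different length
d_cos_same <- cosine_distance(x[, 1], x[, 3])
d_cos_orth <- cosine_distance(x[, 1], x[, 2])  # orthogonal directions
```

Columns 1 and 3 point in the same direction, so their cosine distance is 0 while their Euclidean distance is 1. This is why the choice of measure can change which neighbors are returned.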
A [K, N] matrix of the approximate K nearest neighbors for each vertex.
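The shape of the return value can be illustrated with an exact, brute-force search in base R. This is a sketch for small inputs and for sanity-checking approximate results, not the algorithm the function uses. The helper name exact_knn is hypothetical, and the indices it returns are 1-based; how the function itself numbers vertices is not stated here and should be checked before comparing.

```r
# Hypothetical helper: exact K nearest neighbors by Euclidean distance.
# x has features in rows and examples in columns; the result is K x N.
exact_knn <- function(x, K) {
  n <- ncol(x)
  stopifnot(K >= 1, K < n)
  d <- as.matrix(dist(t(x)))  # N x N distances between examples
  diag(d) <- Inf              # an example is not its own neighbor
  # Column j holds the indices of the K examples closest to example j.
  matrix(apply(d, 2, function(col) order(col)[seq_len(K)]), nrow = K)
}

set.seed(42)
x <- matrix(rnorm(5 * 30), nrow = 5)  # 5 features, 30 examples
nn <- exact_knn(x, K = 3)
dim(nn)  # 3 30
```

The brute-force version computes all pairwise distances, so it is only practical for small N.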
Note that the algorithm does not guarantee that it will find K neighbors for each node. A warning will be issued if it finds fewer neighbors than requested. If the input data contains distinct partitionable clusters, try increasing the tree_threshold to increase the number of returned neighbors.
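The effect of tree_threshold can be seen in a generic random projection tree, sketched below in base R. This is an illustration of the general technique (split on the hyperplane between two randomly chosen points and recurse until a node holds at most tree_threshold points), not the package's C++ implementation, and build_rp_tree is a hypothetical name. Assuming candidate neighbors are drawn from the leaf a point falls in, a larger threshold produces larger leaves and therefore more candidates per tree.

```r
# Hypothetical sketch of one random projection tree; returns a list of leaves,
# each leaf being a vector of column indices of x.
build_rp_tree <- function(x, idx, tree_threshold) {
  if (length(idx) <= tree_threshold) return(list(idx))
  pair <- sample(idx, 2)                         # two distinct examples
  normal <- x[, pair[1]] - x[, pair[2]]          # hyperplane normal
  midpoint <- (x[, pair[1]] + x[, pair[2]]) / 2  # hyperplane passes through here
  # Signed side of the hyperplane for every example in this node.
  side <- as.vector(crossprod(normal, x[, idx, drop = FALSE] - midpoint)) > 0
  # Duplicate points cannot be separated; stop splitting in that case.
  if (all(side) || !any(side)) return(list(idx))
  c(build_rp_tree(x, idx[side], tree_threshold),
    build_rp_tree(x, idx[!side], tree_threshold))
}

# Two well-separated clusters of 40 examples each, 2 features.
set.seed(7)
x <- cbind(matrix(rnorm(2 * 40), nrow = 2),
           matrix(rnorm(2 * 40, mean = 50), nrow = 2))

small_leaves <- build_rp_tree(x, seq_len(ncol(x)), tree_threshold = 10)
large_leaves <- build_rp_tree(x, seq_len(ncol(x)), tree_threshold = 40)
leaf_sizes_small <- lengths(small_leaves)
leaf_sizes_large <- lengths(large_leaves)
```

With a threshold of 10, no leaf can hold more than 10 examples, so a single tree offers fewer than 10 candidate neighbors for any point. Raising the threshold allows leaves of up to 40 examples.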