Learn R Programming

pcaPP (version 2.0-5)

cor.fk: Fast estimation of Kendall's tau rank correlation coefficient

Description

Calculates Kendall's tau rank correlation coefficient in O (n log (n)) rather than O (n^2) as in the current R implementation.

Usage


cor.fk (x, y = NULL)

Value

The estimated correlation coefficient.

Arguments

x

A vector, a matrix or a data frame of data.

y

A vector of data.

Author

David Simcha, Heinrich Fritz, Christophe Croux, Peter Filzmoser <P.Filzmoser@tuwien.ac.at>

Details

The code of this implementation of the fast Kendall's tau correlation algorithm has originally been published by David Simcha. Due to it's runtime (O(n log n)) it's essentially faster than the current R implementation (O (n\^2)), especially for large numbers of observations. The algorithm goes back to Knight (1966) and has been described more detailed by Abrevaya (1999) and Christensen (2005).

References

Knight, W. R. (1966). A Computer Method for Calculating Kendall's Tau with Ungrouped Data. Journal of the American Statistical Association, 314(61) Part 1, 436-439.
Christensen D. (2005). Fast algorithms for the calculation of Kendall's Tau. Journal of Computational Statistics 20, 51-62.
Abrevaya J. (1999). Computation of the Maximum Rank Correlation Estimator. Economic Letters 62, 279-285.

See Also

Examples

Run this code

  set.seed (100)		## creating test data
  n <- 1000
  x <- rnorm (n)
  y <- x+  rnorm (n)

  tim <- proc.time ()[1]	## applying cor.fk
  cor.fk (x, y)
  cat ("cor.fk runtime [s]:", proc.time ()[1] - tim, "(n =", length (x), ")\n")

  tim <- proc.time ()[1]	## applying cor (standard R implementation)
  cor (x, y, method = "kendall")
  cat ("cor runtime [s]:", proc.time ()[1] - tim, "(n =", length (x), ")\n")

		##	applying cor and cor.fk on data containing				

  Xt <- cbind (c (x, as.integer (x)), c (y, as.integer (y)))

  tim <- proc.time ()[1]	## applying cor.fk
  cor.fk (Xt)
  cat ("cor.fk runtime [s]:", proc.time ()[1] - tim, "(n =", nrow (Xt), ")\n")

  tim <- proc.time ()[1]	## applying cor (standard R implementation)
  cor (Xt, method = "kendall")
  cat ("cor runtime [s]:", proc.time ()[1] - tim, "(n =", nrow (Xt), ")\n")

Run the code above in your browser using DataLab