snpar (version 1.0)

KS.test: Kolmogorov-Smirnov Test

Description

Perform a one-sample or two-sample Kolmogorov-Smirnov test using a kernel method.

Usage

KS.test(x, y, ..., kernel = c("epan", "unif", "tria", 
        "quar", "triw", "tric", "gaus", "cos"), hx, hy, 
        alternative = c("two.sided", "less", "greater"))

Arguments

x
a numeric vector of data values.
y
either a numeric vector of data values, or a character string naming a cumulative distribution function or an actual cumulative distribution function such as "pnorm". Only continuous CDFs are valid.
...
parameters of the distribution specified (as a character string) by y.
kernel
a character string which determines the smoothing kernel function. This must be one of "unif" (uniform), "tria" (triangular), "epan" (Epanechnikov), "quar" (quartic), "triw" (triweight), "tric" (tricube), "gaus" (Gaussian) or "cos" (cosine).
hx
the smoothing bandwidth for x. See 'Details' for the default bandwidth.
hy
the smoothing bandwidth for y. See 'Details' for the default bandwidth.
alternative
indicates the alternative hypothesis and must be one of "two.sided" (default), "less", or "greater".

Value

A list with class "htest" containing the following components:

  • data.name: a character string giving the name(s) of the data.
  • statistic: the value of the test statistic.
  • p.value: the p-value of the test.
  • method: a character string indicating what type of test was performed.
  • alternative: a character string describing the alternative hypothesis.

Warning

Choosing the smoothing bandwidth is always a critical issue in non-parametric statistics. The default smoothing bandwidth suggested by Wang, Cheng and Yang (2013) may not perform well, and in some cases it serves only as an initial value. It is recommended that you supply a bandwidth obtained by another method.
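As a rough illustration of supplying your own bandwidths, the sketch below passes values for hx and hy explicitly. Using bw.nrd0 from the stats package is an assumption made only for this example: it is a rule-of-thumb bandwidth for density estimation, not one recommended by this package for CDF estimation. The call to KS.test is guarded so that the snippet still runs when snpar is not installed.

```r
set.seed(1)
x <- rnorm(100, 2, 3)
y <- rgamma(100, 1, 6)

# Assumption: a rule-of-thumb density bandwidth stands in for a
# bandwidth "obtained by another method"; substitute your own choice.
hx <- bw.nrd0(x)
hy <- bw.nrd0(y)

# Two-sample test with user-supplied bandwidths (only if snpar is available)
if (requireNamespace("snpar", quietly = TRUE)) {
  res <- snpar::KS.test(x, y, kernel = "gaus", hx = hx, hy = hy)
  print(res)
}
```

Comparing the result across a few plausible bandwidths is a simple way to see how sensitive the conclusion is to this choice.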

Details

The traditional Kolmogorov-Smirnov test is based on the empirical cumulative distribution function (CDF), which is not continuous and may not estimate the true CDF well. The CDF estimated by the kernel method overcomes this shortcoming and is generally closer to the true CDF than the empirical CDF. Using the kernel CDF in the Kolmogorov-Smirnov test is therefore more reasonable than using the empirical CDF.

The test statistic is defined as the maximum difference between the CDFs being compared, and its exact form depends on the alternative hypothesis. When the sample size is large, the test statistic has the following Kolmogorov-Smirnov distribution function: $$K(x) = \sum_{j=-\infty}^{\infty} (-1)^{j} \exp\{-2 j^{2} x^{2}\}, \quad x \ge 0,$$ and $K(x) = 0$ for $x < 0$. See Conover, W. J. (1999) for more details.

The default smoothing bandwidth is the plug-in optimal bandwidth used in Wang, Cheng and Yang (2013). Missing values are removed.
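The ideas above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names (epan_cdf, kernel_cdf, kolmogorov_cdf) are hypothetical, the bandwidth is picked by hand, the supremum is approximated on a grid, and the statistic is scaled by sqrt(n) as in the classical one-sample test.

```r
# Integrated Epanechnikov kernel: integral of 0.75 * (1 - s^2) from -1 to u
epan_cdf <- function(u) {
  u <- pmin(pmax(u, -1), 1)   # clamp so the result is 0 below -1 and 1 above 1
  0.5 + 0.75 * u - 0.25 * u^3
}

# Kernel CDF estimate at points t, for data x and bandwidth h
kernel_cdf <- function(t, x, h) {
  sapply(t, function(ti) mean(epan_cdf((ti - x) / h)))
}

# Kolmogorov distribution K(x); the j and -j terms are equal, so the
# two-sided series is 1 + 2 * sum over j >= 1. Truncated at jmax terms,
# which is inaccurate only for x very close to 0.
kolmogorov_cdf <- function(x, jmax = 100) {
  if (x <= 0) return(0)
  j <- seq_len(jmax)
  1 + 2 * sum((-1)^j * exp(-2 * j^2 * x^2))
}

set.seed(1)
x <- rnorm(100, 2, 3)
h <- 1                        # hypothetical bandwidth, chosen by hand
grid <- seq(min(x) - h, max(x) + h, length.out = 2000)

# Two-sided statistic: largest gap between kernel CDF and hypothesised CDF
D <- max(abs(kernel_cdf(grid, x, h) - pnorm(grid, 2, 3)))

# Large-sample p-value under the assumed sqrt(n) scaling
p_approx <- 1 - kolmogorov_cdf(sqrt(length(x)) * D)
```

Because the kernel CDF is smooth, the gap to the hypothesised CDF varies continuously in t, which is why evaluating it on a fine grid is a reasonable approximation here.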

References

Conover, W. J. (1999). Practical Nonparametric Statistics (3rd ed.). Wiley. pp. 396-406.

Wang, J., Cheng, F. and Yang, L. (2013). Smooth simultaneous confidence bands for cumulative distribution functions. Journal of Nonparametric Statistics, 25, 395-407.

See Also

ks.test

Examples

library(snpar)

# one-sample Kolmogorov-Smirnov test against N(mean = 2, sd = 3)
x <- rnorm(100, 2, 3)
KS.test(x, "pnorm", 2, 3)

# two-sample Kolmogorov-Smirnov test
y <- rgamma(100, 1, 6)
KS.test(x, y)