# Two-sample Kolmogorov-Smirnov test in R

The two-sample Kolmogorov-Smirnov test is used for comparing two independent samples to determine whether they come from the same distribution.

The two-sample Kolmogorov-Smirnov test checks the null hypothesis that two independent samples come from the same continuous probability distribution against the alternative hypothesis that they do not.
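To see what the test measures, the statistic D reported by ks.test() can be reproduced by hand as the largest vertical gap between the two empirical distribution functions. The following is a minimal sketch; the seed and the variable names (Fx, Fy, d_manual, d_ks) are illustrative choices, not part of ks.test().

```r
# Illustrative data; the seed is an arbitrary choice for reproducibility
set.seed(1)
x <- rnorm(50)
y <- rnorm(50)

# Empirical cumulative distribution functions of each sample
Fx <- ecdf(x)
Fy <- ecdf(y)

# The largest gap between the two step functions occurs at an observed value,
# so it is enough to evaluate both ECDFs at the pooled data points
pooled <- c(x, y)
d_manual <- max(abs(Fx(pooled) - Fy(pooled)))

# Statistic reported by ks.test() for the same data
d_ks <- unname(ks.test(x, y)$statistic)

# TRUE if the two values agree up to numerical tolerance
all.equal(d_manual, d_ks)
```

A small D means the two empirical distribution functions stay close together; a large D means they separate somewhere along the range of the data.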

In R, you can perform the two-sample Kolmogorov-Smirnov test using the built-in ks.test() function.

The general syntax of ks.test() looks like this:

# two-sample Kolmogorov-Smirnov test
ks.test(x, y)

where x and y are numeric vectors holding the two independent samples.
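Besides x and y, ks.test() also accepts optional arguments such as alternative ("two.sided" by default, or "less" / "greater" for one-sided tests) and exact (whether an exact p-value should be computed). A short sketch with made-up data follows; the seed, sample sizes, and result names are arbitrary assumptions for illustration.

```r
# Illustrative data; seed and sample sizes are arbitrary
set.seed(7)
x <- rnorm(30)
y <- rnorm(30, mean = 1)

# Two-sided test, stated explicitly (this is also the default)
res_default <- ks.test(x, y, alternative = "two.sided")

# Same test, but requesting the asymptotic rather than the exact p-value
res_asymptotic <- ks.test(x, y, exact = FALSE)

# The statistic D does not depend on how the p-value is computed
res_default$statistic
res_asymptotic$statistic

# The two p-values may differ slightly
res_default$p.value
res_asymptotic$p.value
```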

Note: The Kolmogorov-Smirnov test assumes continuous distributions. With discrete data or tied values, the reported p-value may only be approximate.
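Since continuous data should not contain repeated values, one simple sanity check is to look for ties in the pooled sample before running the test. The sketch below uses small hand-written vectors and a hypothetical flag name (has_ties); how ks.test() reacts to ties may depend on your R version, so treat this only as a pre-check.

```r
# Hand-written example vectors; the value 2.5 appears more than once
x <- c(1.2, 2.5, 2.5, 3.1, 4.8)
y <- c(0.7, 2.5, 3.3, 5.0, 6.1)

# anyDuplicated() returns 0 when all values are distinct,
# otherwise the index of the first repeated value
has_ties <- anyDuplicated(c(x, y)) > 0
has_ties
```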

The following examples demonstrate how to perform the two-sample Kolmogorov-Smirnov test in R.

## Example 1

Suppose we have two datasets that both follow a standard normal distribution (the values are random, so your results will differ from those shown here):

# generate random dataset
x = rnorm(50)
y = rnorm(50)

Now, check whether datasets x and y come from the same distribution using the two-sample Kolmogorov-Smirnov test.

# two-sample Kolmogorov-Smirnov test
ks.test(x, y)

# output
Exact two-sample Kolmogorov-Smirnov test

data:  x and y
D = 0.12, p-value = 0.8693
alternative hypothesis: two-sided

As the p-value (D = 0.12, p = 0.8693) obtained from the two-sample Kolmogorov-Smirnov test is greater than the significance level (0.05), we fail to reject the null hypothesis. There is not sufficient evidence to conclude that the two datasets come from different distributions.
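A large p-value from a single run is not proof that the distributions are identical; it only means this pair of samples showed no detectable difference. As a rough illustration, the sketch below repeats the comparison many times on samples that really do share a distribution and records how often p falls below 0.05 purely by chance. The seed, the number of repetitions, and the variable names are arbitrary assumptions.

```r
# Arbitrary seed so the simulation is reproducible
set.seed(123)

# Repeat the test on 1000 pairs of samples drawn from the same normal distribution
p_values <- replicate(1000, ks.test(rnorm(50), rnorm(50))$p.value)

# Share of runs in which the null hypothesis would be (wrongly) rejected at 0.05
rejection_rate <- mean(p_values < 0.05)
rejection_rate
```

The rejection rate should stay near or below the chosen significance level, which is what a valid test is expected to do when the null hypothesis is true.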

## Example 2

Suppose we have two datasets that come from different distributions, one standard normal and one uniform on the interval from 0 to 1:

# generate random dataset from normal distribution
x = rnorm(50)
# generate random dataset from uniform distribution
y = runif(50)

Now, check whether these two datasets come from the same distribution using the two-sample Kolmogorov-Smirnov test.

# two-sample Kolmogorov-Smirnov test
ks.test(x, y)

# output
Exact two-sample Kolmogorov-Smirnov test

data:  x and y
D = 0.54, p-value = 4.929e-07
alternative hypothesis: two-sided

As the p-value (D = 0.54, p < 0.05) obtained from the two-sample Kolmogorov-Smirnov test is less than the significance level (0.05), we reject the null hypothesis and conclude that the two datasets do not come from the same distribution.
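The decision rule used in both examples can also be applied programmatically, because ks.test() returns an object whose statistic and p.value components can be read directly. The helper below, ks_decision(), is a hypothetical convenience function written for this illustration (it is not part of base R), and the seed and default alpha are assumptions.

```r
# Hypothetical helper: runs the two-sample test and applies the decision rule
ks_decision <- function(x, y, alpha = 0.05) {
  res <- ks.test(x, y)
  list(
    D = unname(res$statistic),          # maximum gap between the two ECDFs
    p_value = res$p.value,              # p-value of the two-sided test
    reject_null = res$p.value < alpha   # TRUE when p is below the significance level
  )
}

# Arbitrary seed; normal versus uniform samples, as in Example 2
set.seed(2024)
out <- ks_decision(rnorm(50), runif(50))

out$D
out$p_value
out$reject_null
```

Returning a list keeps the statistic, the p-value, and the decision together, which is convenient when the same comparison has to be run on many pairs of datasets.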