How to Perform Mann-Whitney U test in R

Renesh Bedre

The Mann-Whitney U test (also known as the Wilcoxon rank-sum test) is a non-parametric statistical test used to determine whether two independent groups differ significantly from each other.

The Mann-Whitney U test is recommended when the data for the two independent groups do not follow a normal distribution.

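The Mann-Whitney U test assumes that the observations are independent within and between groups, that the two groups have roughly equal variance (more generally, similarly shaped distributions, which is what allows a significant result to be read as a difference in medians), and that the data are ordinal or continuous. Normality is not required.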

In R, the Mann-Whitney U test is performed using the wilcox.test() function. The general syntax depends on how the input data are arranged:

# when the data are in two separate vectors
wilcox.test(group1, group2)

# when the data are in a single stacked table (long format)
wilcox.test(response ~ groups, data = df)

Here, response is the variable holding the outcome values and groups is the variable that identifies the two independent groups.
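As a minimal sketch of the two call forms, the snippet below runs both on the same made-up numbers. The vectors group1 and group2 and the data frame df_long are hypothetical and are not the dataset used later in this article.

```r
# Hypothetical example data (not the article's dataset)
group1 <- c(12.1, 14.3, 11.8, 15.2, 13.7, 12.9)
group2 <- c(16.4, 18.0, 15.9, 17.3, 19.1, 16.8)

# Form 1: two separate vectors
res_vec <- wilcox.test(group1, group2)

# Form 2: one stacked data frame with a response column and a grouping column
df_long <- data.frame(
  response = c(group1, group2),
  groups   = factor(rep(c("g1", "g2"), each = 6))
)
res_formula <- wilcox.test(response ~ groups, data = df_long)

# Both calls describe the same test, so the statistic and p-value should match
res_vec$statistic
res_formula$statistic
res_vec$p.value
res_formula$p.value
```

The formula form is usually more convenient when the data already sit in a long-format table, which is the layout used in the example below.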

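Note: The Mann-Whitney U test is the non-parametric counterpart of the independent two-sample t-test. When the data really are normally distributed, it is slightly less powerful (higher Type II error rate) than the t-test; for skewed or heavy-tailed data it can be more powerful.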

Example of Mann-Whitney U test in R

The following example shows how to perform the Mann-Whitney U test in R.

Suppose there are two plant genotypes (A and B) that differ in height. We would like to check whether the heights of the two genotypes differ significantly from each other.

Sample size: The Mann-Whitney U test can be applied to small samples (5-20), and the power of the test increases as the sample size increases.
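For small samples it can help to know how the p-value is computed. According to the wilcox.test() documentation, an exact p-value is computed by default when both samples have fewer than 50 values and there are no ties; otherwise a normal approximation is used. The sketch below compares the two on hypothetical vectors x and y (made-up values, not the article's data).

```r
# Hypothetical small samples without ties
x <- c(1.1, 2.3, 3.8, 4.2, 5.9)
y <- c(6.4, 7.7, 8.1, 9.5, 10.2)

# Default: exact p-value, because n < 50 and there are no ties
exact_res <- wilcox.test(x, y)

# Explicitly request the normal approximation (with continuity correction)
approx_res <- wilcox.test(x, y, exact = FALSE)

exact_res$p.value
approx_res$p.value
```

With very small groups the two p-values can differ noticeably, so it is worth reporting which one was used.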

Load the dataset and check the normality of each group using the Shapiro-Wilk normality test:

# import dataset
df = read.csv("https://reneshbedre.github.io/assets/posts/mann_whitney/genotype_height.csv")

# view first few data
head(df)
    genotype height
1 genotype_A     25
2 genotype_A     30
3 genotype_A     30
4 genotype_A     25
5 genotype_A     25
6 genotype_A     20

# Shapiro-Wilk normality test
genotype_A = df$height[df$genotype == "genotype_A"]
genotype_B = df$height[df$genotype == "genotype_B"]

shapiro.test(genotype_A)

# output
data:  genotype_A
W = 0.88481, p-value = 0.0104

shapiro.test(genotype_B)

# output
data:  genotype_B
W = 0.86168, p-value = 0.005501

The p value obtained from the Shapiro-Wilk test is less than the significance level of 0.05 for both groups. Hence, we conclude that the data in each group are not normally distributed.
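The per-group normality check can also be written in one step with tapply(). This is only a sketch: df_demo is a small hypothetical data frame that reuses the column names genotype and height, and its values are made up for illustration.

```r
# Hypothetical stacked data with the same column names as the article's dataset
df_demo <- data.frame(
  genotype = rep(c("genotype_A", "genotype_B"), each = 8),
  height   = c(25, 30, 30, 25, 25, 20, 35, 28,
               10, 12, 11, 15,  9, 13, 11, 14)
)

# Run shapiro.test() on each genotype and keep only the p-value
shapiro_p <- tapply(df_demo$height, df_demo$genotype,
                    function(x) shapiro.test(x)$p.value)
shapiro_p
```

The result is a named vector with one p-value per group, which avoids creating a separate vector for each genotype.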

Now, perform the Mann-Whitney U test using the wilcox.test() function:

wilcox.test(height ~ genotype, data = df)

# output
data:  height by genotype
W = 520.5, p-value = 1.414e-08
alternative hypothesis: true location shift is not equal to 0

As the p value from the Mann-Whitney U test is less than the significance level of 0.05 (W = 520.5, p = 1.414e-08), we can conclude that there is a significant difference in height between genotype_A and genotype_B.
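To see where the W statistic comes from, it can be reproduced by hand: rank all observations together, sum the ranks of the first group, and subtract n1(n1 + 1)/2. The sketch below does this for two hypothetical vectors x and y (made-up values, not the article's data) and compares the result with wilcox.test().

```r
# Hypothetical heights for two groups (no ties)
x <- c(25, 30, 28, 22, 35)
y <- c(12, 15, 27, 10, 18)
n_x <- length(x)

# Rank the pooled data, then sum the ranks belonging to the first group
pooled_ranks <- rank(c(x, y))
w_manual <- sum(pooled_ranks[seq_len(n_x)]) - n_x * (n_x + 1) / 2

# Equivalent view: W counts the (x, y) pairs in which x is larger than y
w_pairs <- sum(outer(x, y, ">"))

# Statistic reported by wilcox.test()
w_test <- unname(wilcox.test(x, y)$statistic)

c(manual = w_manual, pairs = w_pairs, wilcox = w_test)  # all three are 23
```

Here 23 of the 25 possible pairs have the x value above the y value, which is why W is close to its maximum of n1 * n2 = 25.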

Note: By default, wilcox.test() performs a two-sided test. A one-sided Mann-Whitney U test can be performed by passing the alternative argument to wilcox.test().

For instance, to test whether the height of genotype_A is greater than that of genotype_B, perform the one-sided Mann-Whitney U test with alternative = "greater". With the formula interface, "greater" means that the first level of the grouping variable (here genotype_A, since levels are sorted alphabetically by default) tends to have larger values than the second.

wilcox.test(height ~ genotype, data = df, alternative = "greater")

# output
data:  height by genotype
W = 520.5, p-value = 7.068e-09
alternative hypothesis: true location shift is greater than 0

As the p value from the one-sided Mann-Whitney U test is less than the significance level of 0.05 (W = 520.5, p = 7.068e-09), we can conclude that the height of genotype_A (median = 25) is significantly greater than that of genotype_B (median = 11.5).
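Group medians and the direction of a one-sided test can be checked together. The following sketch uses a small hypothetical data frame df_demo (made-up values that only reuse the column names genotype and height) to show that the result of alternative = "greater" depends on which factor level comes first.

```r
# Hypothetical stacked data (not the article's dataset)
df_demo <- data.frame(
  genotype = factor(rep(c("genotype_A", "genotype_B"), each = 5)),
  height   = c(25, 30, 28, 22, 35,
               12, 15, 27, 10, 18)
)

# The first level is the group that "greater" and "less" refer to
levels(df_demo$genotype)

# Median height per genotype
group_medians <- tapply(df_demo$height, df_demo$genotype, median)
group_medians

# One-sided tests in both directions
p_greater <- wilcox.test(height ~ genotype, data = df_demo,
                         alternative = "greater")$p.value
p_less    <- wilcox.test(height ~ genotype, data = df_demo,
                         alternative = "less")$p.value
c(greater = p_greater, less = p_less)
```

If the group of interest is not the first level, either switch to alternative = "less" or reorder the levels with relevel() before running the test.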

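Related non-parametric tests include the Kruskal-Wallis test (more than two independent groups), the Wilcoxon signed-rank test (two paired groups), and the Friedman test (more than two repeated or matched measurements).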

This work is licensed under a Creative Commons Attribution 4.0 International License
