Kruskal-Wallis (KW) test
- The Kruskal-Wallis test (also known as the Kruskal-Wallis H test or Kruskal-Wallis ANOVA) is a non-parametric (distribution-free) alternative to the one-way ANOVA.
- The Kruskal-Wallis test is useful when the ANOVA assumptions are not met or are substantially violated. If the data meet the ANOVA assumptions, ANOVA is preferable because it is slightly more powerful than the non-parametric alternative.
- The Kruskal-Wallis test is used to compare two or more independent groups. It is an extension of the Mann-Whitney U test, which compares two groups. It compares the mean ranks of the groups; this can be read as a comparison of medians only when the group distributions have the same shape.
- The Kruskal-Wallis test does not assume any specific distribution (such as a normal distribution of samples) for calculating the test statistic and p value.
- The Kruskal-Wallis test compares sample mean ranks (or medians, under the same-shape condition), whereas ANOVA compares sample means. Ranks and medians are less sensitive to outliers than means.
Kruskal-Wallis test assumptions
- The independent variable should have two or more independent groups
- The observations from the independent groups should be randomly selected from the target populations
- Observations are sampled independently of each other (no relationship between observations within or between groups), i.e., each subject should contribute only one response
- The dependent variable should be continuous or ordinal (e.g. Likert item data)
Kruskal-Wallis test hypotheses
If the group distributions do not have the same shape,
Null hypothesis: All group mean ranks are equal
Alternative hypothesis: At least one group mean rank differs from the others
In terms of medians (when the group distributions have the same shape),
Null hypothesis: Population medians are equal
Alternative hypothesis: At least one population median differs from the others
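To make "mean rank" concrete, here is a minimal R sketch on made-up numbers. The vector `value` and the labels `g1` to `g3` are hypothetical and are not the plant dataset used later. All observations are ranked together, and the ranks are then averaged within each group; under the null hypothesis these averages should be similar.

```r
# Hypothetical toy data, for illustration only
value <- c(12, 15, 11, 30, 28, 35, 14, 13, 16)
group <- factor(rep(c("g1", "g2", "g3"), each = 3))

# Rank all observations together (ties would receive average ranks)
r <- rank(value)

# Mean rank per group
mean_ranks <- tapply(r, group, mean)
print(mean_ranks)
```

In this toy example, group `g2` holds the three largest values, so it ends up with the highest mean rank.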
Kruskal-Wallis test formula
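The Kruskal-Wallis test statistic (H) is given as,
$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$
where k is the number of groups, n_i is the number of observations in group i, R_i is the sum of ranks in group i, and N is the total number of observations. When there are tied values, H is additionally divided by a tie-correction factor.
H is approximately chi-squared distributed with k-1 degrees of freedom.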
The p value is the upper-tail probability of the chi-squared distribution with k-1 degrees of freedom at the observed H. Equivalently, H can be compared with the critical value at the chosen significance level: if H >= critical value, we reject the null hypothesis; otherwise, we fail to reject it.
Because the test relies on the chi-squared approximation, the sample size in each group should be at least five.
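As a minimal sketch of the calculation, the R code below computes H = 12/(N(N+1)) * sum(R_i^2/n_i) - 3(N+1) by hand, where R_i is the rank sum and n_i the size of group i, and then derives the p value and the 5% critical value from the chi-squared distribution. The `toy` data frame is hypothetical (made-up values without ties) and is not the plant dataset. With tied values, `kruskal.test()` also applies a tie correction, so the hand calculation would differ slightly.

```r
# Hypothetical toy data (no ties), for illustration only
toy <- data.frame(
  value = c(27, 2, 4, 18, 7, 20, 8, 14, 36, 21, 34, 31, 3, 23, 30),
  group = factor(rep(c("g1", "g2", "g3"), each = 5))
)

N <- nrow(toy)            # total number of observations
k <- nlevels(toy$group)   # number of groups

# Rank all observations together, then get rank sums and group sizes
r <- rank(toy$value)
rank_sums <- tapply(r, toy$group, sum)
n_i <- tapply(r, toy$group, length)

# H statistic (no tie correction)
H <- 12 / (N * (N + 1)) * sum(rank_sums^2 / n_i) - 3 * (N + 1)

# Upper-tail p value and 5% critical value with k - 1 degrees of freedom
p_value <- pchisq(H, df = k - 1, lower.tail = FALSE)
critical <- qchisq(0.95, df = k - 1)

# Compare with the built-in test
built_in <- kruskal.test(value ~ group, data = toy)
print(c(H = H, p_value = p_value, critical = critical))
print(built_in)
```

Because the toy data contain no ties, the hand-computed H and p value should agree with the `kruskal.test()` output.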
Perform Kruskal-Wallis test in R
Get example dataset
Assume we have three plant varieties with yield measurements for each. We want to test whether yield differs significantly among the three plant varieties in this dataset.
The Kruskal-Wallis test can be performed on a dataset with unequal sample sizes across groups.
Load and visualize dataset,
    library(tidyverse)
    df = read.csv("https://reneshbedre.github.io/assets/posts/mann_whitney/genotype_kw.csv")
    head(df, 2)
    # output
      plant_var yield
    1         A    70
    2         A    20

    # summary statistics by plant variety
    df %>% group_by(plant_var) %>% summarise(n = n(), mean = mean(yield), sd = sd(yield))
    # output
      plant_var     n  mean    sd
      <chr>     <int> <dbl> <dbl>
    1 A             5    35  27.8
    2 B             5    90   0
    3 C             5    20   0
Generate boxplot to check data spread
    ggplot(df, aes(x = plant_var, y = yield, col = plant_var)) +
      geom_boxplot(outlier.shape = NA) +
      geom_jitter(width = 0.2) +
      theme(legend.position = "top")
Check data distribution
Check the data distribution and the normality assumption using the Shapiro-Wilk test and a histogram,
    shapiro.test(df$yield)
    # output
        Shapiro-Wilk normality test

    data:  df$yield
    W = 0.75495, p-value = 0.001027

    # plot histogram
    # after_stat(density) and linewidth replace the deprecated ..density.. and size
    # (linewidth requires ggplot2 >= 3.4.0)
    ggplot(df, aes(x = yield)) +
      geom_histogram(aes(y = after_stat(density)), colour = "black", fill = "skyblue", binwidth = 10) +
      geom_density(color = "red", linewidth = 1)
Check homogeneity of variances assumption,
    library(car)
    leveneTest(yield ~ plant_var, data = df)
    # output
    Levene's Test for Homogeneity of Variance (center = median)
          Df F value Pr(>F)
    group  2  4.3663 0.0376 *
          12
As the p values obtained from the Shapiro-Wilk test and Levene’s test are significant (p < 0.05), we conclude that the data are not normally distributed and that the group variances are not equal. Further, the shape of the histogram does not look normal. Therefore, the parametric ANOVA may not be appropriate here, and the Kruskal-Wallis test is more appropriate for analyzing differences among the three plant varieties.
Perform Kruskal-Wallis test
    kruskal.test(yield ~ plant_var, data = df)
    # output
        Kruskal-Wallis rank sum test

    data:  yield by plant_var
    Kruskal-Wallis chi-squared = 10.396, df = 2, p-value = 0.005527

    # calculate effect size
    library(rcompanion)
    epsilonSquared(x = df$yield, g = df$plant_var)
    # output
    epsilon.squared
              0.743
As the p value obtained from the Kruskal-Wallis test is significant (H(2) = 10.40, p = 0.0055), we conclude that there are significant differences in yield among the plant varieties.
For the Kruskal-Wallis test, epsilon-squared is a commonly used effect size measure. Here, epsilon-squared is 0.74, which suggests a very strong effect of plant variety on yield.
An epsilon-squared > 0.64 suggests a very strong effect.
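For intuition, epsilon-squared can also be obtained directly from H. The sketch below assumes the commonly used definition, epsilon-squared = H / ((N^2 - 1) / (N + 1)), which simplifies to H / (N - 1). It plugs in the H statistic and the total sample size reported for the plant dataset. The variable names are illustrative.

```r
# Values reported for the plant dataset
H <- 10.396   # Kruskal-Wallis chi-squared
N <- 15       # total observations (3 varieties x 5 plants)

# Epsilon-squared under the assumed definition H / ((N^2 - 1) / (N + 1))
eps_sq <- H / ((N^2 - 1) / (N + 1))

# The same value via the simplified form H / (N - 1)
eps_sq_simple <- H / (N - 1)

print(round(eps_sq, 3))   # 0.743, matching the epsilonSquared() output
```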
The Kruskal-Wallis test is an omnibus test: it indicates that there are significant differences in yield among the plant varieties, but it does not tell which plant varieties differ from each other.
To find which plant varieties differ significantly from each other, we will perform Dunn’s test as a post-hoc test following the significant Kruskal-Wallis test. As there are multiple comparisons, we will adjust the p values using the Benjamini-Hochberg FDR method at a 5% cut-off.
Other tests that can be used for post-hoc analysis include the Conover and Nemenyi tests.
Dunn’s test is a more appropriate post-hoc test than pairwise Mann-Whitney U tests after a significant Kruskal-Wallis test because it retains the rank sums of the Kruskal-Wallis test.
    library(FSA)
    dunnTest(yield ~ plant_var, data = df, method = "bh")
    # output
    Dunn (1964) Kruskal-Wallis multiple comparison
      p-values adjusted with the Benjamini-Hochberg method.

      Comparison         Z     P.unadj       P.adj
    1      A - B -2.792316 0.005233219 0.007849829
    2      A - C  0.000000 1.000000000 1.000000000
    3      B - C  2.792316 0.005233219 0.015699657
Let’s check the results with the Conover and Nemenyi post-hoc tests,
    library(PMCMRplus)
    df$plant_var = as.factor(df$plant_var)

    # Conover test
    # Tukey's p-adjustment (single-step method)
    kwAllPairsConoverTest(yield ~ plant_var, data = df)
    # output
        Pairwise comparisons using Conover all-pairs test

    data: yield by plant_var

      A       B
    B 0.00071 -
    C 1.00000 0.00071

    P value adjustment method: single-step

    # Nemenyi test
    # Tukey's p-adjustment (single-step method)
    kwAllPairsNemenyiTest(yield ~ plant_var, data = df)
    # output
        Pairwise comparisons using Tukey-Kramer-Nemenyi all-pairs test with Tukey-Dist approximation

    data: yield by plant_var

      A     B
    B 0.022 -
    C 1.000 0.022

    P value adjustment method: single-step
The post-hoc Dunn’s test indicates significant differences in yield between the A and B (adjusted p < 0.01) and the B and C (adjusted p < 0.05) plant varieties, and no difference between A and C. The Conover and Nemenyi tests lead to the same conclusion. It appears that the B plant variety has a significantly higher yield than the A and C plant varieties.
This work is licensed under a Creative Commons Attribution 4.0 International License