The one-way ANOVA (Analysis of Variance) is used to determine whether there are statistically significant differences among more than two groups by comparing their group means.
The one-way ANOVA is also known as one-factor ANOVA as there is only one independent variable (factor or group variable) to analyze.
A one-way ANOVA tests the null hypothesis that all group means are equal against the alternative hypothesis that they are not all equal (i.e. at least one group mean differs from the others).
You can use the following code to perform a one-way ANOVA in R:
# model
model <- aov(y ~ x, data = df)
# view ANOVA summary
summary(model)
|y|Response variable (should be a continuous variable)|
|x|Group variable (the factor that defines the groups)|
|df|Data frame containing the group and response variables|
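As a quick illustration of this template, here is a minimal sketch that uses `PlantGrowth`, a small dataset bundled with R, as a stand-in for `df`. The choice of dataset is an assumption made only so the snippet runs without downloading anything; its columns `weight` and `group` play the roles of `y` and `x`.

```r
# PlantGrowth ships with R (datasets package):
# weight = continuous response, group = treatment group
data(PlantGrowth)

# the group variable should be a factor, otherwise aov() would
# treat a numeric group code as a continuous predictor
PlantGrowth$group <- as.factor(PlantGrowth$group)

# fit the one-way ANOVA model (y ~ x, data = df)
model <- aov(weight ~ group, data = PlantGrowth)

# view ANOVA summary
print(summary(model))
```

The same pattern applies to any data frame with one continuous column and one grouping column.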
The following example illustrates how to use one-way ANOVA for analyzing the group differences.
How to Perform One-Way ANOVA in R
For example, a researcher wants to analyze whether plant height differs among plant genotypes. The researcher collects plant height data for four plant genotypes.
The researcher has the following null and alternative hypotheses:
Null hypothesis: Mean plant height is equal across all plant genotypes
Alternative hypothesis: Mean plant height is not equal across all plant genotypes, i.e. at least one genotype has a different mean plant height
Here, the alternative hypothesis is two-sided (non-directional), as plant height can be lower or higher in one genotype than in another.
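In symbols, the two hypotheses can be written as follows. The notation is added here for illustration: it assumes the four genotypes are labelled A to D (as in the dataset used in this example) and that \mu denotes the population mean plant height of a genotype.

```latex
\begin{aligned}
H_0 &: \mu_A = \mu_B = \mu_C = \mu_D \\
H_1 &: \mu_i \neq \mu_j \quad \text{for at least one pair of genotypes } i \neq j
\end{aligned}
```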
The following ANOVA code shows how to perform one-way ANOVA in R:
Load and view the dataset:
# load dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/anova/one_way_anova.csv")
# view the first six rows of the data frame
head(df)
  genotype height
1        A      5
2        A      6
3        A      7
4        A      8
5        A      8
6        B     12
Check descriptive statistics (mean and variance) for each plant genotype:
# load package
library(dplyr)
# get descriptive statistics
df %>% group_by(genotype) %>% summarise(mean = mean(height), var = var(height))
# A tibble: 4 × 3
  genotype  mean   var
  <fct>    <dbl> <dbl>
1 A          6.8   1.7
2 B         13.6   2.3
3 C          7     3.5
4 D          7.2   1.7
From the descriptive statistics, we can see that the mean plant height is highest for genotype B and lowest for genotype A. The variance is roughly similar for all genotypes.
Now, we will perform a one-way ANOVA to check whether these differences in plant height are statistically significant.
Perform a one-way ANOVA and summarise the results:
# fit model
model <- aov(height ~ genotype, data = df)
# summary statistics
summary(model)
            Df Sum Sq Mean Sq F value   Pr(>F)
genotype     3  163.8   54.58   23.73 3.93e-06 ***
Residuals   16   36.8    2.30
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
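To see where the numbers in this table come from, the following sketch rebuilds the sums of squares and the F value from the group means and variances reported in the descriptive statistics. It assumes a balanced design with 5 plants per genotype; that figure is not stated in the text but is inferred from the 20 total observations (16 residual degrees of freedom plus 4 groups). The variable names are illustrative.

```r
# group summaries as reported in the descriptive statistics
group_means <- c(A = 6.8, B = 13.6, C = 7.0, D = 7.2)
group_vars  <- c(A = 1.7, B = 2.3, C = 3.5, D = 1.7)
n_per_group <- 5  # assumption: balanced design, 20 plants / 4 genotypes

k <- length(group_means)          # number of groups
n_total <- k * n_per_group        # total number of observations
grand_mean <- mean(group_means)   # valid because group sizes are equal

# between-group and within-group (residual) sums of squares
ss_between <- n_per_group * sum((group_means - grand_mean)^2)
ss_within  <- (n_per_group - 1) * sum(group_vars)

# degrees of freedom
df_between <- k - 1
df_within  <- n_total - k

# F is the ratio of the between-group to the within-group mean square
f_value <- (ss_between / df_between) / (ss_within / df_within)

# upper-tail probability of the F distribution gives the p value
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

print(c(ss_between = ss_between, ss_within = ss_within, F = f_value, p = p_value))
```

Under that assumption, the sums of squares and the F value agree with the `summary(model)` output after rounding, which shows that the F value is simply the between-group mean square divided by the residual mean square.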
The one-way ANOVA reports the following important statistics for interpretation:
|F value|23.73|
|Degrees of freedom|3 and 16|
|p value|3.93e-06|
According to the one-way ANOVA results, the difference among genotypes is statistically significant [F(3, 16) = 23.73, p < 0.05]. Hence, we reject the null hypothesis and conclude that mean plant height differs significantly among genotypes, i.e. at least one genotype differs from the others.
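The reported statistics can also be pulled out of the fitted model programmatically rather than read off the printed table. The sketch below is one possible way to do this. `report_anova` is a hypothetical helper written for this example (not part of R), and it is demonstrated on the built-in `PlantGrowth` dataset so that it runs without downloading the genotype data.

```r
# hypothetical helper: format a one-way ANOVA result and the test decision
report_anova <- function(model, alpha = 0.05) {
  # summary() of an aov fit is a list; its first element is the ANOVA table
  tab <- summary(model)[[1]]
  df1 <- as.integer(tab[["Df"]][1])   # between-group degrees of freedom
  df2 <- as.integer(tab[["Df"]][2])   # residual degrees of freedom
  f_value <- tab[["F value"]][1]
  p_value <- tab[["Pr(>F)"]][1]
  decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
  sprintf("F(%d, %d) = %.2f, p = %.3g: %s", df1, df2, f_value, p_value, decision)
}

# demonstration on a dataset bundled with R (stand-in for the genotype data)
demo_model <- aov(weight ~ group, data = PlantGrowth)
result <- report_anova(demo_model)
print(result)
```

Calling `report_anova(model)` on the genotype model fitted above would format its F value, degrees of freedom and p value in the same way.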
This work is licensed under a Creative Commons Attribution 4.0 International License