How to Perform One-Way ANOVA in R (With Example Dataset)

Renesh Bedre 3 minute read

The one-way ANOVA (Analysis of Variance) tests whether there are statistically significant differences among three or more groups by comparing their group means.

The one-way ANOVA is also known as one-factor ANOVA as there is only one independent variable (factor or group variable) to analyze.

A one-way ANOVA tests the null hypothesis that all group means are equal against the alternative hypothesis that they are not all equal (i.e. at least one group mean differs significantly from the others).

You can use the following code to perform a one-way ANOVA in R:

# model
model <- aov(y ~ x, data = df)

# view ANOVA summary
summary(model)

Where,

Parameter	Description
`y`	Response variable (should be a continuous variable)
`x`	Group variable
`df`	Data frame containing the group and response variables
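
As a minimal sketch of this template on real data, the block below uses `PlantGrowth`, a dataset bundled with base R. It is not part of this article and is assumed here only so that the code runs without downloading anything. Its `weight` column plays the role of `y` and its `group` column the role of `x`.

```r
# PlantGrowth ships with R (datasets package); it is NOT the article's dataset
data(PlantGrowth)

# y = weight (continuous response), x = group (factor with 3 levels)
example_model <- aov(weight ~ group, data = PlantGrowth)

# ANOVA table: one row for the group factor, one for the residuals
example_summary <- summary(example_model)
print(example_summary)

# degrees of freedom: (groups - 1) and (observations - groups)
example_df <- example_summary[[1]][["Df"]]
print(example_df)
```

With 3 groups and 30 observations, the degrees of freedom come out as 2 and 27.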

The following example illustrates how to use one-way ANOVA for analyzing the group differences.

How to Perform One-Way ANOVA in R

For example, a researcher wants to analyze whether plant height differs among plant genotypes. The researcher collects plant height data for four plant genotypes.

The researcher has the following null and alternative hypotheses:

Null hypothesis: The mean plant height is equal among all plant genotypes
Alternative hypothesis: The mean plant height is not equal among all plant genotypes, i.e. at least one genotype's mean height differs from the others

Here, the alternative hypothesis is non-directional (two-sided), as plant height can be lower or higher in one genotype than in another.

The following code shows how to perform a one-way ANOVA in R:

Load and view the dataset,

# load dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/anova/one_way_anova.csv")

# view first six rows of the data frame
head(df)

  genotype height
1        A      5
2        A      6
3        A      7
4        A      8
5        A      8
6        B     12

Check descriptive statistics (mean and variance) for each plant genotype,

# load package
library(dplyr)

# get descriptive statistics
df %>% group_by(genotype) %>% summarise(mean = mean(height), var = var(height))

# A tibble: 4 × 3
  genotype  mean   var
  <fct>    <dbl> <dbl>
1 A          6.8   1.7
2 B         13.6   2.3
3 C          7     3.5
4 D          7.2   1.7
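
As a quick sanity check, the genotype A row can be reproduced with base R from the five genotype A heights shown in the `head(df)` output earlier. This assumes genotype A has exactly those five observations, which the matching mean and variance support.

```r
# heights of genotype A, copied from the head(df) output earlier
height_a <- c(5, 6, 7, 8, 8)

mean_a <- mean(height_a)  # sum of heights divided by number of plants
var_a  <- var(height_a)   # sample variance (divides by n - 1)

# prints mean 6.8 and var 1.7, matching the genotype A row
print(c(mean = mean_a, var = var_a))
```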

From the descriptive statistics, we can see that mean plant height is highest for genotype B and lowest for genotype A. The variance is roughly similar across genotypes.
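
Rather than judging the variances by eye, the equal-variance assumption can also be checked formally. The sketch below uses `bartlett.test()` from base R. It runs on the built-in `PlantGrowth` data (an assumption, not the article's dataset) so that the block is self-contained.

```r
# PlantGrowth is bundled with R; used here only so the example runs offline
data(PlantGrowth)

# Bartlett's test: the null hypothesis is that all group variances are equal
bt <- bartlett.test(weight ~ group, data = PlantGrowth)
print(bt)

# a p value above the chosen significance level (e.g. 0.05) gives no
# evidence against equal variances
print(bt$p.value)

# for the article's data the analogous call would be:
# bartlett.test(height ~ genotype, data = df)
```

Note that Bartlett's test is sensitive to departures from normality, so its result is best read alongside the descriptive statistics rather than in place of them.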

Now, we will perform a one-way ANOVA to check whether these differences in plant height are statistically significant.

Perform a one-way ANOVA and summarise the results using the summary() function,

# fit model
model <- aov(height ~ genotype, data = df)

# summary statistics
summary(model)

            Df Sum Sq Mean Sq F value   Pr(>F)    
genotype     3  163.8   54.58   23.73 3.93e-06 ***
Residuals   16   36.8    2.30                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The one-way ANOVA analysis reports the following important statistics for interpretation,

Parameter	Value
F	23.73
p value	3.93e-06
Degrees of freedom	3 and 16
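
To see where these numbers come from, the F statistic and p value can be recomputed from the sums of squares and degrees of freedom in the ANOVA table. This is a sketch that uses the rounded values printed by `summary(model)`, so the result agrees with the table only up to rounding. The variable names are illustrative.

```r
# values copied from the ANOVA table (rounded as printed)
ss_between <- 163.8  # Sum Sq for genotype
df_between <- 3      # number of genotypes - 1
ss_within  <- 36.8   # Sum Sq for residuals
df_within  <- 16     # number of observations - number of genotypes

# mean squares = sum of squares / degrees of freedom
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within

# F = between-group mean square / within-group mean square (about 23.7)
f_value <- ms_between / ms_within

# p value = upper tail of the F distribution with (3, 16) degrees of freedom
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

print(c(F = f_value, p = p_value))
```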

According to the one-way ANOVA results, the difference among genotypes is statistically significant [F(3, 16) = 23.73, p < 0.05]. Hence, we reject the null hypothesis and conclude that mean plant height differs significantly among genotypes, i.e. at least one genotype differs from the others.
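
A significant ANOVA result does not say which groups differ from each other. One common follow-up is a post-hoc pairwise comparison such as Tukey's HSD, available in base R as `TukeyHSD()`. The sketch below is not part of the original analysis. It uses the built-in `PlantGrowth` data (an assumption, not the article's dataset) so that it runs on its own.

```r
# PlantGrowth is bundled with R; used here only so the example runs offline
data(PlantGrowth)

# fit the one-way ANOVA model first
tukey_model <- aov(weight ~ group, data = PlantGrowth)

# Tukey's HSD: all pairwise differences in group means with adjusted p values
tukey <- TukeyHSD(tukey_model)
print(tukey)

# one row per pair of groups; 3 groups give 3 pairwise comparisons
comparisons <- rownames(tukey$group)
print(comparisons)

# for the article's data the analogous call would be:
# TukeyHSD(aov(height ~ genotype, data = df))
```

Pairs whose adjusted p value (the `p adj` column) falls below the chosen significance level are the ones that differ.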

Relevant article

How to perform one-way ANOVA in Python

This work is licensed under a Creative Commons Attribution 4.0 International License
