`summary()` Function in R: How to Use (With 6 Examples)

`summary()` is a base R function that returns a statistical summary of a fitted model (ANOVA, regression, etc.), data frame, vector, matrix, or factor.

For example, for a fitted regression model, the `summary()` function returns the model call, residual quantiles, regression coefficients, *F* statistic, *p* value, and R-squared.

The basic syntax for the `summary()` function is:

```
summary(object)
```

In the above syntax, `object` can be a fitted model, data frame, data frame column, matrix, or vector.
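
As a rough illustration of how one call handles different kinds of objects, the sketch below runs `summary()` on four inputs and records the class of each result. The built-in `cars` dataset and the names `objects` and `result_classes` are assumptions chosen for this illustration only.

```r
# Four different kinds of input (all names here are illustrative)
objects <- list(
  numeric_vector = c(2, 4, 6, 8),
  factor         = factor(c("a", "b", "a")),
  data_frame     = data.frame(x = 1:3, y = c(2.5, 3.5, 4.5)),
  linear_model   = lm(dist ~ speed, data = cars)  # built-in dataset
)

# summary() picks a method based on the class of its argument,
# so the class of the returned object differs per input
result_classes <- sapply(objects, function(obj) class(summary(obj))[1])
print(result_classes)
```

The different result classes are why the printed output looks so different from one example to the next.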

The following six examples illustrate how to use the `summary()` function to summarise the results for various objects.

## 1. Summary statistics for the regression model

The `summary()` function is widely used for summarising the statistical results from a fitted regression model.

The following example shows how to use the `lm()` function to fit the linear regression model and the `summary()` function to summarise the statistical results.

```
# load blood pressure example dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/reg/bp.csv")

# fit simple linear regression
model <- lm(BP ~ Age, data = df)

# get summary statistics
summary(model)

Call:
lm(formula = BP ~ Age, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-6.7104 -2.9217  0.4276  2.3973  7.8586

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  44.4545    18.7277   2.374  0.02894 *
Age           1.4310     0.3849   3.718  0.00157 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.195 on 18 degrees of freedom
Multiple R-squared:  0.4344,    Adjusted R-squared:  0.403
F-statistic: 13.82 on 1 and 18 DF,  p-value: 0.001574
```

For a regression model, the `summary()` function returns the residuals, regression coefficients, a performance metric (R-squared), and the statistical significance of the regression (the *F* statistic and *p* value).

Because `model` is an `lm` object, `summary()` dispatches to the `summary.lm()` method, so calling `summary.lm(model)` directly gives the same result.
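
The printed regression summary is also an ordinary R object, so individual values can be pulled out of it. The following is a minimal sketch, assuming the built-in `mtcars` dataset as a stand-in for the blood pressure data; the variable names are illustrative.

```r
# Built-in mtcars data used as a stand-in (assumption; not the article's dataset)
model <- lm(mpg ~ wt, data = mtcars)
s <- summary(model)

# Coefficient table: estimate, standard error, t value, and p value per term
coef_table <- coef(s)
print(coef_table)

# R-squared and adjusted R-squared
r2     <- s$r.squared
adj_r2 <- s$adj.r.squared

# F statistic with its numerator and denominator degrees of freedom
f <- s$fstatistic

# p value of the overall F test, computed from the F distribution
p_value <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
print(c(r2 = r2, adj_r2 = adj_r2, p_value = p_value))
```

The overall *p* value is shown in the printed summary but is not stored as its own element, which is why the sketch recomputes it with `pf()`.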

## 2. Summary statistics for the ANOVA model

When you run an ANOVA in R, the `summary()` function summarises the statistical results from the ANOVA model.

The following example shows how to use the `aov()` function to fit the ANOVA model and the `summary()` function to summarise the statistical results.

```
# load dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/anova/anova.csv")

# fit one-way ANOVA
model <- aov(response ~ treatment, data = df)

# get summary statistics
summary(model)

            Df Sum Sq Mean Sq F value   Pr(>F)
treatment    3   3011  1003.6   17.49 2.64e-05 ***
Residuals   16    918    57.4
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

For an ANOVA model, the `summary()` function returns an ANOVA table that contains the degrees of freedom, sums of squares, and mean squares for the treatment and the residuals (experimental error), along with the statistical significance of the ANOVA (the *F* statistic and *p* value).

In addition to `summary()`, you can also use `summary.lm()` on the ANOVA model, which returns a regression-style summary with a coefficient estimate for each treatment group relative to the reference group.
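
To make the difference between the two summaries concrete, here is a small sketch that assumes the built-in `PlantGrowth` dataset (columns `weight` and `group`) in place of the article's data. It extracts the *F* statistic and *p* value from the ANOVA table and then prints the coefficient table from `summary.lm()`.

```r
# Built-in PlantGrowth data used as a stand-in (assumption; not the article's dataset)
model <- aov(weight ~ group, data = PlantGrowth)

# summary() on an aov model returns a list; its first element is the ANOVA table
anova_table <- summary(model)[[1]]

# Row 1 is the treatment term, row 2 is the residuals
f_value <- anova_table[1, "F value"]
p_value <- anova_table[1, "Pr(>F)"]
print(c(F = f_value, p = p_value))

# summary.lm() treats the same model as a regression:
# one row for the intercept (reference group) and one per remaining group
reg_summary <- summary.lm(model)
print(coef(reg_summary))
```

The rows are indexed by position rather than by name because the row names of the ANOVA table are padded with spaces for printing.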

## 3. Summary statistics for data frame

The `summary()` function can be used to get descriptive statistics such as the mean, median, and quartiles for all or specific columns of an R data frame.

If you want additional descriptive statistics such as the standard error (se), standard deviation (sd), sample count, or trimmed mean, you should use the `describe()` function from the psych package.

Get descriptive statistics for all columns:

```
# load dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/anova/anova.csv")

# get summary statistics
summary(df)

  treatment            response
 Length:20          Min.   :25.00
 Class :character   1st Qu.:29.00
 Mode  :character   Median :36.50
                    Mean   :41.45
                    3rd Qu.:54.25
                    Max.   :73.00
```

For a numeric variable, the `summary()` function returns the minimum, first quartile (25th percentile), median, mean, third quartile (75th percentile), and maximum.

Now let’s check how to get descriptive statistics for a specific column:

```
# load dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/anova/anova.csv")

# get summary statistics for response variable
summary(df$response)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  25.00   29.00   36.50   41.45   54.25   73.00
```
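
The same idea extends to a subset of several columns. The sketch below assumes the built-in `mtcars` dataset and two of its numeric columns (`mpg` and `wt`) purely for illustration, and adds the standard deviation with base R's `sd()`, since `summary()` does not report it.

```r
# Built-in mtcars data used as a stand-in (assumption; not the article's dataset)
cols <- c("mpg", "wt")

# summary() on a data frame subset summarises only the selected columns
col_summary <- summary(mtcars[, cols])
print(col_summary)

# standard deviation of each selected column, computed with base R
col_sd <- sapply(mtcars[, cols], sd)
print(col_sd)
```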

## 4. Summary statistics for factor

The `summary()` function can be used to get the frequency of each value of a character variable. The character variable must first be converted to a factor.

Get a summary from a character variable:

```
# load dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/anova/anova.csv")
# get summary of character variable
summary(as.factor(df$treatment))
A B C D
5 5 5 5
```

For a factor, the `summary()` function returns the frequency of each factor level (group).
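
The effect of the factor conversion can be seen by summarising the same values both ways. This is a minimal sketch with a made-up `treatment` vector; `table()` is shown only as a familiar cross-check.

```r
# Hypothetical character vector for illustration
treatment <- c("A", "B", "A", "C", "A", "B")

# Without conversion, summary() describes the vector itself rather than its values
print(summary(treatment))

# After conversion to a factor, summary() counts the observations per level
counts <- summary(factor(treatment))
print(counts)

# table() gives the same counts
print(table(treatment))
```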

## 5. Summary statistics for vector

For a numeric vector, the `summary()` function returns the descriptive statistical summary.

```
# create a numeric vector
x <- c(1, 0.5, 3, 4.5, 3, 2)

# summary
summary(x)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.500   1.250   2.500   2.333   3.000   4.500
```

Note: For a numeric vector, the `summary()` function drops `NA` values before computing the statistics and reports how many were dropped in an extra `NA's` field.
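
The handling of missing values can be checked with a short sketch. The vector below is made up for illustration and contains two `NA` values.

```r
# Hypothetical numeric vector with two missing values
x <- c(1, 0.5, NA, 4.5, 3, NA)

# summary() computes the statistics on the four non-missing values
s <- summary(x)
print(s)

# the number of missing values is stored under the name "NA's"
na_count <- s[["NA's"]]
print(na_count)

# the mean matches mean() with na.rm = TRUE
print(mean(x, na.rm = TRUE))
```

By contrast, calling `mean(x)` without `na.rm = TRUE` returns `NA`, so `summary()` is a quick way to see both the statistics and the amount of missing data.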

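For a character vector, the `summary()` function returns the frequency of each value once the vector has been converted to a factor. Without that conversion, it reports only the length, class, and mode, as the `treatment` column showed in the data frame example.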

```
# create a character vector
x <- c("A", "B", "A", "C", "A", "B")

# summary
summary(as.factor(x))

A B C
3 2 1
```

## 6. Summary statistics for matrix

Similar to a data frame, the `summary()` function returns descriptive summary statistics for each column of a matrix.

If you convert a data frame to a matrix with `data.matrix()`, the character and factor columns are converted to integer codes.

```
# load dataset
df <- read.csv("https://reneshbedre.github.io/assets/posts/anova/anova.csv")

# convert to matrix
df_mat <- data.matrix(df)

# get summary statistics
summary(df_mat)

   treatment       response
 Min.   :1.00   Min.   :25.00
 1st Qu.:1.75   1st Qu.:29.00
 Median :2.50   Median :36.50
 Mean   :2.50   Mean   :41.45
 3rd Qu.:3.25   3rd Qu.:54.25
 Max.   :4.00   Max.   :73.00
```
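
The integer coding explains why the `treatment` column above has a numeric summary. A minimal sketch with a small made-up data frame shows the conversion directly; the data values are hypothetical.

```r
# Hypothetical data frame with one character and one numeric column
df <- data.frame(
  treatment = c("A", "B", "C", "A"),
  response  = c(25, 30, 41, 28)
)

# data.matrix() turns the character column into integer codes
# (levels in sorted order: A = 1, B = 2, C = 3)
m <- data.matrix(df)
print(m)

# summary() now treats both columns as numeric
print(summary(m))
```

The summary of the coded column describes the codes, not the groups, so a frequency count from a factor is usually more informative for such a column.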

This work is licensed under a Creative Commons Attribution 4.0 International License