Quantiles and percentiles are statistical terms that are often confused with each other.
In data analysis, quantiles and percentiles are used for describing the distribution of data, as well as determining spread, relative position, and central tendency.
The key differences between quantiles and percentiles are:
- Quantiles divide the dataset into any number of equal parts.
- Quartiles and percentiles are special cases of quantiles.
- For example, quartiles and percentiles split the data into 4 and 100 equal parts, respectively.
- Quantiles are typically expressed as decimal values and range from 0 to 1 (e.g., 0.25, 0.5).
- The 0.25 quantile is the value below which 25% of the data falls.
In Python, quantiles can be calculated using the quantile() function from the NumPy package.
The following example shows how to calculate the quantiles in Python.
    # import package
    import numpy as np

    # dataset
    x = [15, 10, 15, 25, 25, 30, 35, 45, 45, 50, 55, 65]

    # calculate 0.5 quantile
    np.quantile(x, [0.5])

    # output
    # array([32.5])
The value of the 0.5 quantile is 32.5. This indicates that 50% of the data falls below 32.5.
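Since quartiles are simply the 0.25, 0.5, and 0.75 quantiles, the same function can return all three at once. The snippet below is a minimal sketch using the same dataset and assuming NumPy's default (linear interpolation) method; the variable names `quartiles` and `shares` are illustrative only.

```python
import numpy as np

# same dataset as in the example above
x = [15, 10, 15, 25, 25, 30, 35, 45, 45, 50, 55, 65]

# quartiles are the 0.25, 0.5, and 0.75 quantiles
quartiles = np.quantile(x, [0.25, 0.5, 0.75])
print(quartiles)  # values: 22.5, 32.5, 46.25

# fraction of observations strictly below each quartile
data = np.array(x)
shares = [float(np.mean(data < q)) for q in quartiles]
print(shares)  # values: 0.25, 0.5, 0.75
```

For this particular dataset, exactly 25%, 50%, and 75% of the observations fall below the three quartiles. With other datasets (for example, with many tied values) the shares may only be approximate.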
- Percentiles divide the data into 100 equal parts.
- Percentiles are typically expressed as whole numbers and range from 0 to 100.
- The 25th percentile is equivalent to the 0.25 quantile. Similarly, the 75th percentile is equivalent to the 0.75 quantile.
- The 50th percentile is the value below which 50% of the data falls.
In Python, percentiles can be calculated using the percentile() function from the NumPy package.
The following example shows how to calculate the percentiles in Python.
    # import package
    import numpy as np

    # dataset
    x = [15, 10, 15, 25, 25, 30, 35, 45, 45, 50, 55, 65]

    # calculate 95th percentile
    np.percentile(x, [95])

    # output
    # array([59.5])
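The value of the 95th percentile is 59.5. It means that roughly 95% of the values in the dataset fall below 59.5. With only 12 observations, this value is interpolated between the two largest data points, 55 and 65.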
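To see where 59.5 comes from, the calculation can be reproduced by hand. The sketch below assumes NumPy's default linear interpolation method; `percentile_linear` is a hypothetical helper written for illustration, not a NumPy function.

```python
import numpy as np

x = [15, 10, 15, 25, 25, 30, 35, 45, 45, 50, 55, 65]

def percentile_linear(data, p):
    """Hypothetical helper: percentile via linear interpolation."""
    s = sorted(data)
    # fractional position of the percentile in the sorted data (0-based)
    pos = (p / 100) * (len(s) - 1)
    lower = int(pos)                    # index of the value just below
    upper = min(lower + 1, len(s) - 1)  # index of the value just above
    frac = pos - lower                  # how far between the two values
    return s[lower] + frac * (s[upper] - s[lower])

manual = percentile_linear(x, 95)
print(manual)                 # approximately 59.5
print(np.percentile(x, 95))   # NumPy's result for comparison
```

Here the position is 0.95 × 11 = 10.45, so the result lies 45% of the way from the 11th sorted value (55) to the 12th (65), which gives 55 + 0.45 × 10 = 59.5.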
It’s important to note that np.quantile() and np.percentile() return the same value for a given quantile or percentile. However, there is a difference between the two. The np.quantile() function requires values between 0 and 1 as its second argument, while np.percentile() requires values between 0 and 100 for its second argument.
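The equivalence between the two functions can be checked directly. The following is a small sketch on the same dataset; the lists `percents` and `fractions` are illustrative names chosen here.

```python
import numpy as np

x = [15, 10, 15, 25, 25, 30, 35, 45, 45, 50, 55, 65]

# percentiles on the 0-100 scale and the matching quantiles on the 0-1 scale
percents = [25, 50, 75, 95]
fractions = [p / 100 for p in percents]

from_percentile = np.percentile(x, percents)
from_quantile = np.quantile(x, fractions)

# both calls describe the same cut points, so the results should agree
print(np.allclose(from_percentile, from_quantile))
```

The only thing that changes between the two calls is the scale of the second argument, so dividing a percentile by 100 gives the corresponding quantile.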
This work is licensed under a Creative Commons Attribution 4.0 International License