Find probabilities using discrete and continuous probability distributions

Renesh Bedre

What are probability distributions?

  • Probability distributions represent the probabilities associated with all outcomes of a random variable.
  • Depending on the type of random variable (discrete or continuous), probability distributions are classified as discrete or continuous probability distributions.

Discrete probability distribution

  • Discrete probability distributions give the probabilities associated with each possible outcome of a discrete random variable, that is, a countable quantity that takes values such as 0, 1, 2, and so on rather than fractions (e.g. number of apples).
  • The probability of each outcome of a discrete random variable lies between 0 and 1, and the probabilities of all outcomes sum to 1.
  • The binomial and Poisson distributions are examples of discrete probability distributions.
  • For example, a restaurant sells 10 to 20 pizzas during lunch hour, and Table 1 gives the discrete probability distribution of pizza sales. The random variable X takes every integer value from 10 to 20, and p(X=x), or p(x), is the probability that exactly x pizzas are sold.
x     10    11    12    13    14    15    16    17    18    19    20
p(x)  0.07  0.09  0.11  0.12  0.16  0.14  0.10  0.09  0.06  0.03  0.03

Table 1: Probability distribution of pizza sales
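The two conditions stated above (each probability lies between 0 and 1, and all probabilities sum to 1) can be checked directly for Table 1. The following is a minimal sketch; the names `pmf`, `all_in_range`, and `total` are illustrative and not part of the original post.

```python
# Table 1 as a mapping from the number of pizzas sold (x) to its probability p(x)
pmf = {
    10: 0.07, 11: 0.09, 12: 0.11, 13: 0.12, 14: 0.16, 15: 0.14,
    16: 0.10, 17: 0.09, 18: 0.06, 19: 0.03, 20: 0.03,
}

# Condition 1: every probability lies between 0 and 1
all_in_range = all(0 <= p <= 1 for p in pmf.values())

# Condition 2: the probabilities sum to 1 (up to floating-point rounding)
total = sum(pmf.values())

print(all_in_range)     # True
print(round(total, 2))  # 1.0
```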

Graphically, this distribution can be shown as follows:

Figure 1: Probability distribution of pizza sales

Probability mass function (PMF) and cumulative distribution function (CDF)

  • The probability mass function (PMF) gives the probability of each possible value x of X. For example, p(X=12) is 0.11, which is the PMF of X evaluated at 12.
  • The cumulative distribution function (CDF) gives the probability that X takes a value less than or equal to x. For example, p(X<=12) is 0.27, which is the sum of p(X=10), p(X=11), and p(X=12).
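The PMF and CDF values quoted above can be reproduced from Table 1 with a running sum. This is a small sketch using only the Python standard library; the variable names are illustrative.

```python
from itertools import accumulate

# Values of X and their probabilities, copied from Table 1
xs = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
ps = [0.07, 0.09, 0.11, 0.12, 0.16, 0.14, 0.10, 0.09, 0.06, 0.03, 0.03]

# PMF: probability of each individual value x
pmf = dict(zip(xs, ps))

# CDF: running total of the PMF, i.e. p(X <= x) for each x
cdf = dict(zip(xs, accumulate(ps)))

print(pmf[12])            # 0.11
print(round(cdf[12], 2))  # 0.27
```

Rounding is applied only when printing, because sums of decimal fractions are not exact in floating-point arithmetic.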

Continuous probability distribution

  • Continuous probability distributions describe the probabilities associated with the outcomes of a continuous random variable, which can take any of the uncountably many values in a specified range (e.g. time spent reading a blog page).
  • The probability that a continuous random variable lies between two values a and b is the area under the curve between a and b (see the shaded area in Figure 2).
  • For a continuous random variable, a probability density function (PDF) is used to calculate the probability of an interval between two values a and b of X. The probability p(a ≤ X ≤ b) equals the area under the PDF curve between a and b. The total area under the curve always equals 1.
  • Probabilities are generally calculated for intervals in continuous probability distributions because the probability that X takes any single exact value is always zero.
  • As in the discrete case, the cumulative distribution function (CDF) gives the probability that X is less than or equal to some value x, i.e. p(X ≤ x).
  • The normal distribution, exponential distribution, and uniform distribution are examples of continuous probability distributions.
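The relationship between the PDF, the CDF, and the area under the curve can be illustrated numerically. The sketch below assumes a standard normal distribution (mean 0, standard deviation 1), chosen only for illustration, and uses `statistics.NormalDist` from the Python standard library rather than the `scipy.stats.norm` object listed in the references. The helper `area_under_pdf` is a hypothetical name for a simple trapezoidal-rule approximation.

```python
from statistics import NormalDist

def area_under_pdf(dist, a, b, steps=10_000):
    """Approximate the area under dist.pdf between a and b (trapezoidal rule)."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width

# Standard normal distribution, an assumption made only for this illustration
dist = NormalDist(mu=0.0, sigma=1.0)

# p(a <= X <= b) as an area under the PDF, and the same probability from the CDF
interval_area = area_under_pdf(dist, -1.0, 1.0)
interval_cdf = dist.cdf(1.0) - dist.cdf(-1.0)

# Area over a very wide range, which covers practically the whole curve
total_area = area_under_pdf(dist, -10.0, 10.0)

# A zero-width interval: the probability of a single exact value
point_prob = dist.cdf(1.0) - dist.cdf(1.0)

print(round(interval_area, 4), round(interval_cdf, 4))
print(round(total_area, 4))
print(point_prob)
```

The two interval results should agree closely, the total area should be very close to 1, and the single-value probability is exactly zero.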

As an example, suppose the daily time spent reading a blog page is approximately normally distributed with a mean of 3 minutes and a standard deviation of 0.5 minutes.

The shaded area in Figure 2 represents the probability that the time spent reading a blog page is between 3 and 4 minutes, i.e. p(3 ≤ X ≤ 4).

Figure 2: Normal distribution of time spent reading a blog page
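The shaded probability for the blog-reading example can be computed as the difference of two CDF values, p(X ≤ 4) minus p(X ≤ 3). The sketch below uses `statistics.NormalDist` from the Python standard library as a stand-in for `scipy.stats.norm`; the variable names are illustrative.

```python
from statistics import NormalDist

# Reading time: normal distribution with mean 3 minutes and standard deviation 0.5 minutes
reading_time = NormalDist(mu=3.0, sigma=0.5)

# p(3 <= X <= 4) = CDF(4) - CDF(3)
prob_3_to_4 = reading_time.cdf(4.0) - reading_time.cdf(3.0)

print(round(prob_3_to_4, 4))  # 0.4772
```

Under these assumptions, roughly 48% of readers would spend between 3 and 4 minutes on the page.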

References

  • https://tinyurl.com/yydhju4g
  • https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html
  • https://amsi.org.au/ESA_Senior_Years/PDF/ContProbDist4e.pdf

This work is licensed under a Creative Commons Attribution 4.0 International License