Perform Durbin-Watson (DW) test in Python

Renesh Bedre

Durbin-Watson (DW) test

In regression analysis, the Durbin-Watson (DW) test checks for first-order autocorrelation (serial correlation) in the residuals, that is, whether each residual is correlated with the one immediately before it in time. The autocorrelation coefficient ranges from -1 (perfect negative autocorrelation) to 1 (perfect positive autocorrelation).

The Durbin-Watson test evaluates the following hypotheses:

Null hypothesis (H0): Residuals from the regression are not autocorrelated (autocorrelation coefficient, ρ = 0)
Alternative hypothesis (Ha): Residuals from the regression are autocorrelated (autocorrelation coefficient, ρ > 0 for the usual one-sided test of positive autocorrelation; ρ < 0 or ρ ≠ 0 for the other variants)


The Durbin-Watson test statistic (d) always lies between 0 and 4. A value near 2 indicates no first-order autocorrelation. A value towards 0 indicates positive autocorrelation, and a value towards 4 indicates negative autocorrelation.
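To make these three regimes concrete, the statistic can be computed by hand as the sum of squared differences between successive residuals divided by the sum of squared residuals. The following is a minimal sketch; the function name durbin_watson_manual and the two residual patterns are hypothetical and chosen only for illustration.

```python
import numpy as np

def durbin_watson_manual(residuals):
    """Return d = sum((e_t - e_{t-1})^2) / sum(e_t^2) for a 1-D residual series."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

# Hypothetical residual patterns (not from the stock price dataset)
alternating = [1, -1] * 5                   # sign flips every step: negative autocorrelation
persistent = [1, 1, 1, 1, -1, -1, -1, -1]   # long runs of one sign: positive autocorrelation

d_alternating = durbin_watson_manual(alternating)
d_persistent = durbin_watson_manual(persistent)
print(d_alternating)  # 3.6, towards 4
print(d_persistent)   # 0.5, towards 0
```

Residuals that flip sign at every step push d towards 4, while residuals that keep the same sign for long runs push it towards 0.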

Perform Durbin-Watson test in Python

We will use the statsmodels package to perform the Durbin-Watson test.

Get dataset

Suppose we have a hypothetical time-series dataset of stock prices recorded over 12 months.

import pandas as pd
df = pd.read_csv("https://reneshbedre.github.io/assets/posts/reg/stock_price.csv")
df.head(2)
# output
   months  stock_price
0       1          122
1       2          129

Fit the regression model

To perform the Durbin-Watson test, we first need the regression residuals. Fit the regression model with months as the independent variable and stock_price as the dependent variable:

import statsmodels.api as sm
X = df['months']   # independent variable
y = df['stock_price']   # dependent variable
# to get intercept
X = sm.add_constant(X)
# fit the regression model
reg = sm.OLS(y, X).fit()
reg.summary()

# output
                            OLS Regression Results
==============================================================================
Dep. Variable:            stock_price   R-squared:                       0.892
Model:                            OLS   Adj. R-squared:                  0.881
Method:                 Least Squares   F-statistic:                     82.42
Date:                Fri, 17 Jun 2022   Prob (F-statistic):           3.83e-06
Time:                        18:05:58   Log-Likelihood:                -40.579
No. Observations:                  12   AIC:                             85.16
Df Residuals:                      10   BIC:                             86.13
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        114.6061      4.799     23.881      0.000     103.913     125.299
months         5.9196      0.652      9.078      0.000       4.467       7.372
==============================================================================
Omnibus:                       10.544   Durbin-Watson:                   2.585
Prob(Omnibus):                  0.005   Jarque-Bera (JB):                5.798
Skew:                          -1.582   Prob(JB):                       0.0551
Kurtosis:                       4.259   Cond. No.                         15.9
==============================================================================
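The Durbin-Watson value reported in the summary can be related to the lag-1 autocorrelation of the residuals through the large-sample approximation d ≈ 2(1 - r). The sketch below simply inverts that relation; the helper name approx_lag1_autocorrelation is hypothetical, and with only 12 observations the result should be read as a rough indication rather than an estimate.

```python
def approx_lag1_autocorrelation(d):
    """Invert the large-sample approximation d ≈ 2 * (1 - r) to get r ≈ 1 - d / 2."""
    return 1.0 - d / 2.0

# 2.585 is the Durbin-Watson value shown in the OLS summary table
r_hat = approx_lag1_autocorrelation(2.585)
print(round(r_hat, 4))  # -0.2925
```

A value of d above 2 therefore corresponds to a mildly negative lag-1 autocorrelation in the residuals.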


Calculate Durbin-Watson test in Python

We will use the durbin_watson() function from the statsmodels package:

from statsmodels.stats.stattools import durbin_watson as dwtest
import numpy as np

# reg.resid holds the OLS residuals; durbin_watson() returns only the statistic d
dwtest(resids=np.array(reg.resid))
# output
2.5848268

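As the Durbin-Watson statistic (d = 2.585) is reasonably close to 2, there is no strong evidence of first-order autocorrelation in the residuals, and we fail to reject the null hypothesis. Note that durbin_watson() returns only the statistic and not a p-value, so a formal decision requires comparing d with the tabulated lower and upper critical bounds for the given sample size and number of predictors.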
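The comparison against tabulated bounds can be sketched as a small helper. This is only an illustration of the conventional decision rule: the function name dw_decision is hypothetical, and the bounds used below are placeholder values, not critical values taken from a table. Real bounds (dL and dU) depend on the sample size, the number of predictors, and the significance level, and should be looked up in a published Durbin-Watson table.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic using lower (dL) and upper (dU) bounds."""
    if not 0 <= d <= 4:
        raise ValueError("d must lie between 0 and 4")
    if d < d_lower:
        return "positive autocorrelation"
    if d > 4 - d_lower:
        return "negative autocorrelation"
    if d_upper <= d <= 4 - d_upper:
        return "no evidence of autocorrelation"
    # d falls between dL and dU (or their mirror images around 2)
    return "inconclusive"

# Placeholder bounds for illustration only; replace with values from a table
D_LOWER, D_UPPER = 1.0, 1.3
result = dw_decision(2.585, D_LOWER, D_UPPER)
print(result)  # no evidence of autocorrelation
```

The checks are mirrored around 2 because negative autocorrelation is assessed by comparing 4 - d against the same bounds.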




This work is licensed under a Creative Commons Attribution 4.0 International License
