# Perform Durbin-Watson (DW) test in Python

## Durbin-Watson (DW) test

In regression analysis, the Durbin-Watson (DW) test checks the residuals for first-order autocorrelation (serial correlation), i.e. whether each residual is correlated with the one immediately before it in time. The autocorrelation coefficient (ρ) ranges from -1 (perfect negative autocorrelation) to 1 (perfect positive autocorrelation).

The Durbin-Watson test evaluates the following hypotheses:

Null hypothesis (H0): Residuals from the regression are not autocorrelated (autocorrelation coefficient, ρ = 0)

Alternative hypothesis (Ha): Residuals from the regression are autocorrelated (autocorrelation coefficient, ρ ≠ 0; a one-sided alternative, ρ > 0, is also commonly tested)

The Durbin-Watson test statistic (d) always lies between 0 and 4. A value near 2 indicates no first-order autocorrelation, a value toward 0 indicates positive autocorrelation, and a value toward 4 indicates negative autocorrelation.
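To make the 0 to 4 scale concrete, here is a minimal, dependency-free sketch of the standard definition, d = Σ(e_t − e_{t−1})² / Σ e_t², which is approximately 2(1 − ρ̂) for the lag-1 residual autocorrelation ρ̂. The function name `durbin_watson_stat` and the two residual series are hypothetical and made up purely for illustration; they are not taken from the stock price data used later.

```python
def durbin_watson_stat(residuals):
    """Return d = sum((e_t - e_{t-1})^2) / sum(e_t^2) for time-ordered residuals."""
    e = [float(v) for v in residuals]
    if len(e) < 2:
        raise ValueError("need at least two residuals")
    denominator = sum(v * v for v in e)
    if denominator == 0:
        raise ValueError("all residuals are zero; d is undefined")
    # squared differences between each residual and the one before it
    numerator = sum((curr - prev) ** 2 for prev, curr in zip(e, e[1:]))
    return numerator / denominator


# Illustrative residuals (invented for this sketch)
drifting = [-3, -2, -1, 1, 2, 3]      # neighbours are similar: positive autocorrelation
alternating = [1, -1, 1, -1, 1, -1]   # neighbours flip sign: negative autocorrelation

d_drifting = durbin_watson_stat(drifting)        # 8 / 28, about 0.29 (toward 0)
d_alternating = durbin_watson_stat(alternating)  # 20 / 6, about 3.33 (toward 4)

print(round(d_drifting, 2), round(d_alternating, 2))
```

Residuals that drift slowly push d toward 0, while residuals that flip sign at every step push it toward 4. Values near 2 sit between these two extremes.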

## Perform Durbin-Watson test in Python

We will use the `statsmodels` package to perform the Durbin-Watson test.

### Get dataset

Suppose we have a hypothetical time-series dataset of stock prices recorded over 12 months.

``````
import pandas as pd

# output (first two rows of `df`)
   months  stock_price
0       1          122
1       2          129
``````

### Fit the regression model

To perform the Durbin-Watson test, we first need the regression residuals. Fit the regression model with `months` as the independent variable and `stock_price` as the dependent variable:

``````
import statsmodels.api as sm

X = df['months']   # independent variable
y = df['stock_price']   # dependent variable
# add a constant column so the model estimates an intercept
X = sm.add_constant(X)
# fit the regression model
reg = sm.OLS(y, X).fit()
print(reg.summary())

# output
OLS Regression Results
==============================================================================
Dep. Variable:            stock_price   R-squared:                       0.892
Method:                 Least Squares   F-statistic:                     82.42
Date:                Fri, 17 Jun 2022   Prob (F-statistic):           3.83e-06
Time:                        18:05:58   Log-Likelihood:                -40.579
No. Observations:                  12   AIC:                             85.16
Df Residuals:                      10   BIC:                             86.13
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        114.6061      4.799     23.881      0.000     103.913     125.299
months         5.9196      0.652      9.078      0.000       4.467       7.372
==============================================================================
Omnibus:                       10.544   Durbin-Watson:                   2.585
Prob(Omnibus):                  0.005   Jarque-Bera (JB):                5.798
Skew:                          -1.582   Prob(JB):                       0.0551
Kurtosis:                       4.259   Cond. No.                         15.9
==============================================================================
``````

### Calculate Durbin-Watson test in Python

We will use the `durbin_watson()` function from the `statsmodels` package and pass it the residuals of the fitted model:

``````
from statsmodels.stats.stattools import durbin_watson as dwtest
import numpy as np

# durbin_watson() returns only the test statistic d (no p-value)
dwtest(resids=np.array(reg.resid))
# output
2.5848268
``````

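The Durbin-Watson statistic (d = 2.585) matches the value reported in the regression summary and is reasonably close to 2, corresponding to a lag-1 residual autocorrelation of roughly 1 - d/2 ≈ -0.29. Note that `durbin_watson()` returns only the statistic, not a p-value, so a formal decision requires comparing d with tabulated critical values. Treating d as close enough to 2, we fail to reject the null hypothesis: there is no strong evidence that the residuals are autocorrelated.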
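One way to make "close to 2" explicit is the classical bounds rule, which compares d with a lower and an upper critical value (dL, dU) taken from a Durbin-Watson table. The sketch below is a hedged illustration, not part of `statsmodels`: the function names are hypothetical, and the bounds 0.971 and 1.331 are placeholders assumed here for n = 12 and one regressor at the 5% level, so check them against a published table before relying on them.

```python
def implied_lag1_autocorrelation(d):
    """Approximate lag-1 residual autocorrelation, using d ≈ 2 * (1 - rho)."""
    return 1 - d / 2


def dw_bounds_decision(d, d_lower, d_upper):
    """Classify d with the Durbin-Watson bounds rule (two-sided reading).

    d_lower and d_upper are the tabulated critical values dL and dU for the
    given sample size and number of regressors.
    """
    if not 0 <= d <= 4:
        raise ValueError("d must lie between 0 and 4")
    if d_lower > d_upper:
        raise ValueError("d_lower must not exceed d_upper")
    if d < d_lower:
        return "positive autocorrelation"
    if d > 4 - d_lower:
        return "negative autocorrelation"
    if d_upper < d < 4 - d_upper:
        return "no evidence of autocorrelation"
    return "inconclusive"


d = 2.5848268                    # statistic reported above
D_LOWER, D_UPPER = 0.971, 1.331  # ASSUMED table values (n = 12, one regressor, 5%)

print(round(implied_lag1_autocorrelation(d), 3))
print(dw_bounds_decision(d, D_LOWER, D_UPPER))
```

Under these assumed bounds, d falls between dU and 4 - dU, which is the region where the rule finds no evidence of autocorrelation. A value between dL and dU (or between 4 - dU and 4 - dL) would leave the test inconclusive.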
