Perform Durbin-Watson (DW) test in Python
Durbin-Watson (DW) test
In regression analysis, the Durbin-Watson (DW) test checks the residuals for first-order autocorrelation (serial correlation), i.e., whether each residual is correlated with the residual from the preceding time point. The autocorrelation coefficient ranges from -1 (perfect negative autocorrelation) to 1 (perfect positive autocorrelation).
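To make the idea of first-order autocorrelation concrete, here is a minimal sketch in plain Python that computes the sample lag-1 autocorrelation of a series. The helper name lag1_autocorrelation and both input series are hypothetical and chosen only for illustration.

```python
def lag1_autocorrelation(values):
    """Sample lag-1 autocorrelation: sum of products of consecutive
    deviations from the mean, divided by the sum of squared deviations."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    num = sum(dev[t] * dev[t - 1] for t in range(1, n))
    den = sum(d * d for d in dev)
    return num / den

# hypothetical series that drifts steadily (neighbours are similar)
smooth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
# hypothetical series that flips sign at every step (neighbours are opposite)
zigzag = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

r_smooth = lag1_autocorrelation(smooth)
r_zigzag = lag1_autocorrelation(zigzag)
print(r_smooth)  # 0.625  (positive autocorrelation)
print(r_zigzag)  # -0.875 (negative autocorrelation)
```

Values that tend to follow similar values give a positive coefficient, while values that tend to alternate give a negative one.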
The Durbin-Watson test evaluates the following hypotheses,
Null hypothesis (H0): Residuals from the regression are not autocorrelated (autocorrelation coefficient, ρ = 0)
Alternative hypothesis (Ha): Residuals from the regression are autocorrelated (autocorrelation coefficient, ρ ≠ 0)
The Durbin-Watson test statistic (d) always ranges between 0 and 4. A value near 2 indicates no evidence of first-order autocorrelation. A value towards 0 indicates positive autocorrelation, and a value towards 4 indicates negative autocorrelation.
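The statistic is defined as the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The following minimal sketch implements that definition in plain Python. The helper name durbin_watson_stat and the two residual series are hypothetical and serve only to show how d moves towards 4 or 0.

```python
def durbin_watson_stat(residuals):
    """d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2"""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# hypothetical residuals that flip sign each step (negative autocorrelation)
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
# hypothetical residuals that change slowly (positive autocorrelation)
trending = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]

d_alt = durbin_watson_stat(alternating)
d_trend = durbin_watson_stat(trending)
print(round(d_alt, 3))    # 3.333 (towards 4)
print(round(d_trend, 3))  # 0.286 (towards 0)
```

For large samples, d is approximately 2(1 - ρ), which is why ρ = 0 corresponds to d near 2.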
Perform Durbin-Watson test in Python
We will use the statsmodels package to perform the Durbin-Watson test.
Suppose we have a hypothetical time-series dataset of stock prices recorded over 12 months.
```python
import pandas as pd

df = pd.read_csv("https://reneshbedre.github.io/assets/posts/reg/stock_price.csv")
df.head(2)
# output
   months  stock_price
0       1          122
1       2          129
```
Fit the regression model
To perform the Durbin-Watson test, we first need the regression residuals. Fit the regression model with months as the independent variable and stock_price as the dependent variable,
```python
import statsmodels.api as sm

X = df['months']  # independent variable
y = df['stock_price']  # dependent variable

# to get intercept
X = sm.add_constant(X)

# fit the regression model
reg = sm.OLS(y, X).fit()
reg.summary()
# output
                            OLS Regression Results
==============================================================================
Dep. Variable:            stock_price   R-squared:                       0.892
Model:                            OLS   Adj. R-squared:                  0.881
Method:                 Least Squares   F-statistic:                     82.42
Date:                Fri, 17 Jun 2022   Prob (F-statistic):           3.83e-06
Time:                        18:05:58   Log-Likelihood:                -40.579
No. Observations:                  12   AIC:                             85.16
Df Residuals:                      10   BIC:                             86.13
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        114.6061      4.799     23.881      0.000     103.913     125.299
months         5.9196      0.652      9.078      0.000       4.467       7.372
==============================================================================
Omnibus:                       10.544   Durbin-Watson:                   2.585
Prob(Omnibus):                  0.005   Jarque-Bera (JB):                5.798
Skew:                          -1.582   Prob(JB):                       0.0551
Kurtosis:                       4.259   Cond. No.                         15.9
==============================================================================
```
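For intuition about what the residuals are, here is a minimal sketch that fits a simple linear regression by hand in plain Python and returns the residuals (observed minus fitted values). The helper name simple_ols_residuals and the data are hypothetical and are not the values from stock_price.csv. In the statsmodels workflow, reg.resid holds the same quantity for the fitted model.

```python
def simple_ols_residuals(x, y):
    """Least-squares fit of y = intercept + slope * x; returns the
    coefficients and the residuals (observed minus fitted)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals

# hypothetical data, for illustration only
months = [1, 2, 3, 4, 5, 6]
prices = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]

intercept, slope, residuals = simple_ols_residuals(months, prices)
print(intercept, slope)
print(residuals)
```

When the model includes an intercept, the residuals sum to zero (up to floating-point error), so the Durbin-Watson test looks only at how they are ordered in time, not at their overall level.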
Calculate Durbin-Watson test in Python
We will use the durbin_watson() function from statsmodels,
```python
from statsmodels.stats.stattools import durbin_watson as dwtest
import numpy as np

dwtest(resids=np.array(reg.resid))
# output
2.5848268
# alternative hypothesis: true autocorrelation is not 0
```
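As the Durbin-Watson statistic (d = 2.585, the same value reported in the regression summary) is reasonably close to 2, we fail to reject the null hypothesis. Hence, there is no evidence that the residuals are autocorrelated. Note that durbin_watson() returns only the statistic and no p value, so this is an informal reading. A formal decision compares d against the tabulated lower and upper critical bounds for the given sample size and number of predictors.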
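Because d is approximately 2(1 - ρ) in large samples, the statistic can be turned into a rough estimate of the first-order autocorrelation. The sketch below applies that approximation to the value obtained above. With only 12 observations it should be read as a rough indication, not a precise estimate.

```python
# Durbin-Watson statistic reported above
d = 2.5848268

# large-sample approximation d ≈ 2 * (1 - rho), rearranged for rho
rho_hat = 1 - d / 2
print(round(rho_hat, 3))  # -0.292
```

The estimate is mildly negative, consistent with d lying somewhat above 2.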
This work is licensed under a Creative Commons Attribution 4.0 International License