Heteroscedasticity in Regression Analysis

Simple Linear Regression:

(1)  $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

Multiple Linear Regression:

(2)  $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i$

Assumptions:
1. Errors have zero mean.
2. Errors have constant variance.
3. Errors are uncorrelated.
4. Errors are normally distributed.
The second assumption, constant error variance, is known as homoscedasticity; when it is violated, the errors are said to be heteroscedastic.

Homoscedasticity vs Heteroscedasticity:

Under homoscedasticity, the residuals scatter around zero with roughly constant spread across the range of fitted values; under heteroscedasticity, the spread changes, typically fanning out as the fitted values grow.
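In symbols, writing $\varepsilon_i$ for the error of the $i$-th observation and conditioning on the regressors, the two cases are:

    $\operatorname{Var}(\varepsilon_i \mid x_i) = \sigma^2$ for all $i$  (homoscedasticity)
    $\operatorname{Var}(\varepsilon_i \mid x_i) = \sigma_i^2$, varying with $i$  (heteroscedasticity)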

Possible causes of heteroscedasticity:

  1. It often occurs in data sets that have a wide range between the largest and smallest observed values, i.e. when outliers are present.
  2. When the model is incorrectly specified.
  3. When observations measured on different scales are mixed together.
  4. When an incorrect transformation of the data is used in the regression.
  5. Skewness in the distribution of one or more regressors, among other sources.

Effects of Heteroscedasticity:

  • As mentioned above, one of the assumptions of linear regression (assumption 2) is that there is no heteroscedasticity. When this assumption is violated, the OLS (Ordinary Least Squares) estimators are no longer BLUE (Best Linear Unbiased Estimators): their variance is no longer the lowest among all unbiased linear estimators.
  • The estimators remain unbiased but are no longer best/efficient.
  • Hypothesis tests (such as the t-test and F-test) are no longer valid, because the covariance matrix of the estimated regression coefficients is estimated inconsistently.
Identifying Heteroscedasticity with residual plots:

Plotting the residuals against the fitted values (or against a regressor) is a quick visual check: under homoscedasticity the points form a horizontal band of constant width, while a funnel or fan shape signals heteroscedasticity.
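A minimal sketch of such a plot with statsmodels and matplotlib, using synthetic data that is heteroscedastic by construction (the variable names and data-generating rule are purely illustrative):

    import numpy as np
    import matplotlib.pyplot as plt
    import statsmodels.api as sm

    # Synthetic data: the noise standard deviation grows with x,
    # so the errors are heteroscedastic by construction.
    rng = np.random.default_rng(0)
    x = rng.uniform(1, 10, 200)
    y = 2 + 3 * x + rng.normal(0, x)

    X = sm.add_constant(x)        # design matrix with an intercept column
    fit = sm.OLS(y, X).fit()      # ordinary least squares fit

    # Residuals vs fitted values: a fan shape indicates non-constant variance.
    plt.scatter(fit.fittedvalues, fit.resid, s=10)
    plt.axhline(0, color="red", linewidth=1)
    plt.xlabel("Fitted values")
    plt.ylabel("Residuals")
    plt.title("Residuals vs fitted values")
    plt.show()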
Identifying Heteroscedasticity Through Statistical Tests:
  1. The Breusch-Pagan test: it tests whether the variance of the errors from the regression depends on the values of the independent variables; if it does, heteroscedasticity is present.
  2. White test: the White test establishes whether the variance of the errors in a regression model is constant. It runs an auxiliary regression of the squared residuals from the original model on a set of regressors containing the original regressors together with their squares and cross-products. (Both tests are sketched in code after this list.)
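Both tests are available in statsmodels. A minimal sketch, reusing the fit object from the residual-plot example above:

    from statsmodels.stats.diagnostic import het_breuschpagan, het_white

    # Breusch-Pagan: auxiliary regression of squared residuals on the regressors.
    bp_lm, bp_pvalue, _, _ = het_breuschpagan(fit.resid, fit.model.exog)

    # White: auxiliary regression on the regressors, their squares
    # and cross-products.
    w_lm, w_pvalue, _, _ = het_white(fit.resid, fit.model.exog)

    # Small p-values reject the null hypothesis of constant variance.
    print(f"Breusch-Pagan: LM = {bp_lm:.2f}, p = {bp_pvalue:.4f}")
    print(f"White:         LM = {w_lm:.2f}, p = {w_pvalue:.4f}")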

Corrections for heteroscedasticity:

  1. Use a different specification for the model, for example by transforming the dependent variable (a log transform often stabilizes the variance).
  2. Weighted Least Squares (WLS) is one of the most common remedies. It generalizes ordinary least squares by allowing the covariance matrix of the errors to differ from (a multiple of) the identity matrix; see the sketch after this list.
  3. Use MINQUE: the theory of Minimum Norm Quadratic Unbiased Estimation (MINQUE) involves three stages: first, defining a general class of potential estimators as quadratic functions of the observed data, where the estimators relate to a vector of model parameters; second, specifying constraints on the desired properties of the estimators, such as unbiasedness; and third, choosing the optimal estimator by minimizing a "norm" that measures the size of the covariance matrix of the estimators.
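As a sketch of the WLS correction in statsmodels: if the error standard deviation is believed to be proportional to a regressor, each observation can be weighted by the inverse of its assumed error variance. The proportionality rule below is an illustrative assumption that matches the synthetic data generated earlier, where the noise standard deviation was proportional to x:

    # Reusing x, y, X from the residual-plot example; since the noise sd
    # was proportional to x, Var(eps_i) is proportional to x**2 and the
    # WLS weights are its inverse.
    wls_fit = sm.WLS(y, X, weights=1.0 / x**2).fit()
    print(wls_fit.summary())

With correctly specified weights, WLS restores efficiency and valid standard errors; with misspecified weights, the estimates stay unbiased but the efficiency gain is lost.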
Reference: https://en.wikipedia.org/wiki/Heteroscedasticity
