Visualization
Now let us visualize the studentized residuals. With matplotlib, we can plot the predictor variable values against the corresponding studentized residuals.
Example:
Python3
# Python program to plot studentized residuals

# Importing necessary packages
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols
import matplotlib.pyplot as plt

# Creating dataframe
dataframe = pd.DataFrame({
    'Score': [80, 95, 80, 78, 84, 96, 86, 75, 97, 89],
    'Benchmark': [27, 28, 18, 18, 29, 30, 25, 25, 24, 29]})

# Building simple linear regression model
simple_regression_model = ols('Score ~ Benchmark', data=dataframe).fit()

# Producing studentized residuals
result = simple_regression_model.outlier_test()

# Defining predictor variable values and studentized residuals
# (Benchmark is the predictor in 'Score ~ Benchmark')
x = dataframe['Benchmark']
y = result['student_resid']

# Creating a scatterplot of predictor variable
# vs studentized residuals
plt.scatter(x, y)
plt.axhline(y=0, color='black', linestyle='--')
plt.xlabel('Benchmark')
plt.ylabel('Studentized Residuals')

# Save the plot
plt.savefig("Plot.png")
Output:
Plot.png: scatter plot of the studentized residuals, with a dashed horizontal reference line at 0.
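To make the rule-of-thumb cutoff of 3 visible on the plot, one option is to draw horizontal lines at +3 and -3 and list any observations that fall outside them. The following is a minimal sketch under assumptions: the `threshold` variable, the dotted red lines, and the file name `Plot_with_thresholds.png` are illustrative choices, not part of the original example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd
from statsmodels.formula.api import ols

# Same data and model as in the example above
dataframe = pd.DataFrame({
    'Score': [80, 95, 80, 78, 84, 96, 86, 75, 97, 89],
    'Benchmark': [27, 28, 18, 18, 29, 30, 25, 25, 24, 29]})
model = ols('Score ~ Benchmark', data=dataframe).fit()
result = model.outlier_test()

# Rule-of-thumb cutoff (assumption: |studentized residual| > 3 means outlier)
threshold = 3
flagged = result[result['student_resid'].abs() > threshold]

fig, ax = plt.subplots()
ax.scatter(dataframe['Benchmark'], result['student_resid'])
ax.axhline(y=0, color='black', linestyle='--')
# Dotted lines marking the cutoff on both sides of zero
for level in (-threshold, threshold):
    ax.axhline(y=level, color='red', linestyle=':')
ax.set_xlabel('Benchmark')
ax.set_ylabel('Studentized Residuals')
fig.savefig("Plot_with_thresholds.png")  # hypothetical file name

# Rows of the outlier test table that exceed the cutoff (may be empty)
print(flagged)
```

Any point plotted above the upper line or below the lower line would also appear in `flagged`.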
How to Calculate Studentized Residuals in Python?
A studentized residual is the quotient obtained by dividing a residual by its estimated standard deviation. Studentized residuals are an important tool for detecting outliers. As a rule of thumb, any observation in a dataset whose studentized residual is greater than 3 in absolute value can be treated as an outlier.
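To make the definition concrete, the sketch below computes studentized residuals by hand for a simple linear regression, using only the standard library. It is an illustration under assumptions, not the statsmodels implementation: the function name `studentized_residuals` is hypothetical, and the formulas are the textbook ones for one predictor (leverage h_i = 1/n + (x_i - mean(x))^2 / Sxx). It returns both the internal version (variance estimated from all points) and the external version (variance estimated with the point left out); the latter is, to my understanding, what `outlier_test()` reports as `student_resid`.

```python
import math

# Example data: Benchmark is the predictor, Score is the response
benchmark = [27, 28, 18, 18, 29, 30, 25, 25, 24, 29]
score = [80, 95, 80, 78, 84, 96, 86, 75, 97, 89]


def studentized_residuals(x, y):
    """Return (internal, external) studentized residuals for y = a + b*x."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

    # Least-squares fit and raw residuals
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Leverage of each observation in simple linear regression
    leverage = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    # Residual variance estimated from all observations
    mse = sum(e ** 2 for e in residuals) / (n - p)

    # Residual divided by its estimated standard deviation
    internal = [e / math.sqrt(mse * (1 - h))
                for e, h in zip(residuals, leverage)]
    # External version, obtained from the internal one in closed form
    external = [r * math.sqrt((n - p - 1) / (n - p - r ** 2))
                for r in internal]
    return internal, external


internal, external = studentized_residuals(benchmark, score)
outliers = [i for i, t in enumerate(external) if abs(t) > 3]

for i, (r, t) in enumerate(zip(internal, external)):
    print(f"obs {i}: internal={r:.3f}  external={t:.3f}")
print("Flagged as outliers:", outliers)
```

The list `outliers` holds the zero-based positions of observations whose external studentized residual exceeds 3 in absolute value.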
The following Python libraries should be installed on your system:
- pandas
- numpy
- statsmodels
- matplotlib
You can install these packages by running the following command in a terminal:
pip3 install pandas numpy statsmodels matplotlib
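After installation, a quick way to confirm that the packages are importable is to look them up with the standard library's `importlib`. This is a small optional sketch; the `required` list simply mirrors the packages named in the install command.

```python
import importlib.util

# Packages named in the install command above
required = ["pandas", "numpy", "statsmodels", "matplotlib"]

# find_spec returns None when a top-level package cannot be located
missing = [name for name in required
           if importlib.util.find_spec(name) is None]

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```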