Performing a Two-Way ANOVA in Python
Let us consider an example in which scientists need to know whether plant growth is affected by fertilizer frequency and watering frequency. They planted 30 plants and allowed them to grow for six months under different fertilizing and watering conditions. After six months, they recorded the height of each plant in centimeters. Performing a two-way ANOVA in Python is a step-by-step process; the steps are discussed below.
Step 1: Import libraries.
The first step is to import the NumPy and pandas libraries.
```python
# Importing libraries
import numpy as np
import pandas as pd
```
Step 2: Enter the data.
Let us create a pandas DataFrame that consists of the following three variables:
- Fertilizer: how frequently each plant was fertilized (daily or weekly).
- Watering: how frequently each plant was watered (daily or weekly).
- height: the height of each plant (in centimeters) after six months.
Example:
```python
# Importing libraries
import numpy as np
import pandas as pd

# Create a dataframe
dataframe = pd.DataFrame({
    'Fertilizer': np.repeat(['daily', 'weekly'], 15),
    'Watering': np.repeat(['daily', 'weekly'], 15),
    'height': [14, 16, 15, 15, 16, 13, 12, 11, 14, 15,
               16, 16, 17, 18, 14, 13, 14, 14, 14, 15,
               16, 16, 17, 18, 14, 13, 14, 14, 14, 15],
})
```
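Before fitting a model, it can help to check how many plants fall into each combination of the two factors. The sketch below rebuilds the same DataFrame and tabulates it with `pd.crosstab`; the names `counts` and `group_means` are introduced here for illustration only. Because both factor columns are built with the same `np.repeat` call, only the daily/daily and weekly/weekly combinations occur in this dataset, so the two factors cannot be separated from each other using this data alone. A fully crossed layout (for example, one factor built with `np.tile`) would presumably be needed for that.

```python
import numpy as np
import pandas as pd

# Same data as in the tutorial's Step 2
dataframe = pd.DataFrame({
    'Fertilizer': np.repeat(['daily', 'weekly'], 15),
    'Watering': np.repeat(['daily', 'weekly'], 15),
    'height': [14, 16, 15, 15, 16, 13, 12, 11, 14, 15,
               16, 16, 17, 18, 14, 13, 14, 14, 14, 15,
               16, 16, 17, 18, 14, 13, 14, 14, 14, 15],
})

# Number of plants in each Fertilizer x Watering combination
counts = pd.crosstab(dataframe['Fertilizer'], dataframe['Watering'])
print(counts)

# Mean height for each combination that actually occurs
group_means = dataframe.groupby(['Fertilizer', 'Watering'])['height'].mean()
print(group_means)
```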
Step 3: Conduct the two-way ANOVA.
To perform the two-way ANOVA, the statsmodels library provides the anova_lm() function. The syntax of the function is given below.
Syntax:
sm.stats.anova_lm(model, typ=2)
Parameters:
- model: a fitted linear model, such as the result of calling ols(...).fit()
- typ: the type of ANOVA test to perform, given as "I", "II", or "III", or as 1, 2, or 3. Note that the keyword is spelled typ, not type.
```python
# Importing libraries
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Performing two-way ANOVA on the dataframe created in Step 2
model = ols('height ~ C(Fertilizer) + C(Watering) + C(Fertilizer):C(Watering)',
            data=dataframe).fit()
sm.stats.anova_lm(model, typ=2)
```
Step 4: Combining all the steps.
Example:
```python
# Importing libraries
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Create a dataframe
dataframe = pd.DataFrame({
    'Fertilizer': np.repeat(['daily', 'weekly'], 15),
    'Watering': np.repeat(['daily', 'weekly'], 15),
    'height': [14, 16, 15, 15, 16, 13, 12, 11, 14, 15,
               16, 16, 17, 18, 14, 13, 14, 14, 14, 15,
               16, 16, 17, 18, 14, 13, 14, 14, 14, 15],
})

# Performing two-way ANOVA
model = ols('height ~ C(Fertilizer) + C(Watering) + C(Fertilizer):C(Watering)',
            data=dataframe).fit()
result = sm.stats.anova_lm(model, typ=2)

# Print the result
print(result)
```
Output:
Interpreting the result:
The p-values for each of the factors in the output are as follows:
- The Fertilizer p-value is equal to 0.913305
- The Watering p-value is equal to 0.990865
- The Fertilizer * Watering (interaction) p-value is equal to 0.904053
The p-values for fertilizer (0.913305) and watering (0.990865) are both greater than 0.05, which implies that neither factor has a statistically significant effect on plant height. The p-value for the interaction effect (0.904053) is also greater than 0.05, which indicates that there is no significant interaction effect between fertilizer frequency and watering frequency.
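The decision rule can be sketched in a few lines of plain Python using the p-values listed above. The 0.05 threshold comes from the text; the names `ALPHA`, `p_values`, and `decisions` are introduced here for illustration.

```python
# Significance level used in the text
ALPHA = 0.05

# p-values as reported in the tutorial's output
p_values = {
    "Fertilizer": 0.913305,
    "Watering": 0.990865,
    "Fertilizer:Watering": 0.904053,
}

# A term is called significant when its p-value is below ALPHA
decisions = {term: p < ALPHA for term, p in p_values.items()}

for term, significant in decisions.items():
    verdict = "significant" if significant else "not significant"
    print(f"{term}: p = {p_values[term]:.6f} -> {verdict} at alpha = {ALPHA}")
```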
How to Perform a Two-Way ANOVA in Python
Two-Way ANOVA: ANOVA stands for Analysis of Variance. A two-way ANOVA is used to check whether there is a statistically significant difference between the mean values of three or more groups that have been split according to two factors. In simple words, ANOVA is a statistical test used to interpret the difference between the mean values of at least three groups. The main objective of a two-way ANOVA is to find out how two factors affect a response variable, and whether the two factors interact in their effect on the response variable.
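To make the idea concrete, here is a minimal sketch of the arithmetic behind a two-way ANOVA for a balanced design, written with the standard library only. The data are made-up toy values (two levels per factor, three plants per cell) and are not taken from the experiment described in this article; all variable names are assumptions for the example. The total variation is split into a part for each factor, a part for their interaction, and a leftover error part, and each F statistic is a mean square divided by the error mean square.

```python
from itertools import product
from statistics import mean

# Hypothetical balanced toy data (made up for illustration only):
# keys are (fertilizer level, watering level), values are plant heights.
cells = {
    ("daily", "daily"): [14.0, 16.0, 15.0],
    ("daily", "weekly"): [13.0, 12.0, 14.0],
    ("weekly", "daily"): [16.0, 17.0, 18.0],
    ("weekly", "weekly"): [13.0, 14.0, 15.0],
}
levels_a = sorted({a for a, _ in cells})  # fertilizer levels
levels_b = sorted({b for _, b in cells})  # watering levels
n = 3                                     # replicates per cell

all_values = [y for ys in cells.values() for y in ys]
grand = mean(all_values)
mean_a = {a: mean(y for (ai, _), ys in cells.items() if ai == a for y in ys)
          for a in levels_a}
mean_b = {b: mean(y for (_, bi), ys in cells.items() if bi == b for y in ys)
          for b in levels_b}
mean_cell = {key: mean(ys) for key, ys in cells.items()}

# Sums of squares for a balanced two-way layout
ss_a = n * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a)
ss_b = n * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b)
ss_ab = n * sum((mean_cell[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                for a, b in product(levels_a, levels_b))
ss_error = sum((y - mean_cell[key]) ** 2 for key, ys in cells.items() for y in ys)
ss_total = sum((y - grand) ** 2 for y in all_values)

# Degrees of freedom
df_a = len(levels_a) - 1
df_b = len(levels_b) - 1
df_ab = df_a * df_b
df_error = len(all_values) - len(levels_a) * len(levels_b)

# F statistics: each mean square divided by the error mean square
ms_error = ss_error / df_error
f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_ab = (ss_ab / df_ab) / ms_error

print(f"SS: A={ss_a:.3f} B={ss_b:.3f} AB={ss_ab:.3f} error={ss_error:.3f}")
print(f"F:  A={f_a:.3f} B={f_b:.3f} AB={f_ab:.3f}")
```

Turning each F statistic into a p-value requires the F distribution, which is what statsmodels handles in the steps of this article.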
Command to install the NumPy, pandas, and statsmodels libraries:
pip3 install numpy pandas statsmodels
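After installing, a quick way to confirm that the libraries are visible to the current Python interpreter is to look them up without importing them. This is a small sketch using the standard library's `importlib.util.find_spec`; the names `required` and `status` are introduced here for illustration.

```python
from importlib.util import find_spec

# Libraries imported in this tutorial
required = ["numpy", "pandas", "statsmodels"]

# find_spec returns None when a top-level package cannot be found
status = {name: find_spec(name) is not None for name in required}

for name, installed in status.items():
    print(f"{name}: {'installed' if installed else 'missing'}")
```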