Polynomial Regression
Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modeled as an nth-degree polynomial in x. Unlike simple or multiple linear regression, which assume a straight-line relationship between each predictor and the response, polynomial regression can capture curved (non-linear) relationships.
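$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_n x^n + \varepsilon$$
Where:
- $y$ is the dependent variable.
- $x$ is the independent variable.
- $x^2, x^3, \dots, x^n$ are the polynomial terms, representing the square, cube, and higher-order powers of $x$.
- $\beta_0, \beta_1, \dots, \beta_n$ are the coefficients: $\beta_0$ is the intercept, and each remaining $\beta_k$ is the weight of the term $x^k$ in the prediction of $y$.
- $\varepsilon$ is the error term, representing the variation in $y$ that the polynomial does not explain. Its sample counterpart, the residual, is the difference between the observed and predicted values of $y$.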
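To make the equation concrete, here is a minimal sketch in R, assuming a quadratic model (n = 2) and five made-up data points. The names X, beta_hat, and fit are illustrative choices, not part of any library. The sketch builds the columns 1, x, and x^2 explicitly, estimates the coefficients by least squares, and checks that lm() arrives at the same values.

```r
# Toy data, assumed purely for illustration
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)

# Design matrix for a quadratic model: one column per term (1, x, x^2)
X <- cbind(intercept = 1, x = x, x_sq = x^2)

# Least-squares estimates of beta0, beta1, beta2 via the normal equations
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

# The same model fitted with lm(); I() keeps ^ as arithmetic inside a formula
fit <- lm(y ~ x + I(x^2))

print(drop(beta_hat))
print(coef(fit))
```

The two printed vectors should agree up to rounding, which illustrates that a polynomial model is an ordinary linear model whose predictors happen to be powers of x.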
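Polynomial regression extends linear regression by adding polynomial terms of the independent variable as predictors. Because the model is still linear in its coefficients, it can be fitted with lm().
- Use the poly() function in R to generate polynomial terms (orthogonal by default; raw = TRUE gives plain powers of x).
- Fit the model using lm() with the polynomial terms included in the formula.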
# Sample data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)
# Fit polynomial regression model (e.g., quadratic)
model <- lm(y ~ poly(x, degree = 2))
# Summary of the model
summary(model)
Output:
Call:
lm(formula = y ~ poly(x, degree = 2))

Residuals:
      1       2       3       4       5
-0.3143  0.4571  0.5143 -1.1429  0.4857

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)            4.2000     0.4598   9.134   0.0118 *
poly(x, degree = 2)1   2.5298     1.0282   2.460   0.1330
poly(x, degree = 2)2  -0.5345     1.0282  -0.520   0.6550
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.028 on 2 degrees of freedom
Multiple R-squared:  0.7597,    Adjusted R-squared:  0.5195
F-statistic: 3.162 on 2 and 2 DF,  p-value: 0.2403
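This output is from a quadratic polynomial regression model fitted to five observations of one independent variable (x) and one dependent variable (y).
- Residuals: Differences between the observed and fitted values of y, one per observation.
- Coefficients: Estimates for the intercept and the two polynomial terms. Because poly() builds orthogonal polynomials by default, the rows poly(x, degree = 2)1 and poly(x, degree = 2)2 are coefficients on the orthogonal linear and quadratic basis terms, not on x and x^2 directly.
  - Estimate: Estimated coefficient values.
  - Std. Error: Standard errors of the estimates.
  - t value: Each estimate divided by its standard error.
  - Pr(>|t|): P-values for the test that each coefficient is zero. Here only the intercept is significant at the 0.05 level; the linear (p = 0.1330) and quadratic (p = 0.6550) terms are not.
- Model Fit:
  - Residual standard error: Estimated standard deviation of the residuals (1.028 on 2 degrees of freedom).
  - Multiple R-squared: About 75.97% of the variance in y is explained by the model.
  - Adjusted R-squared: R-squared penalized for the number of predictors (0.5195 here).
  - F-statistic: Tests overall model significance. With p = 0.2403 the model is not significant at the 0.05 level, which is unsurprising with only five data points.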
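The coefficients in the summary above do not map directly onto β0, β1, and β2 in the polynomial equation, because poly() uses an orthogonal basis unless told otherwise. The following sketch, which reuses the sample data and assumes the names fit_orth, fit_raw, and new_points for illustration, refits the model with raw = TRUE and shows that both versions describe the same curve.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 6)

# Orthogonal polynomial basis (the default behaviour of poly())
fit_orth <- lm(y ~ poly(x, degree = 2))

# Raw powers x and x^2, so the coefficients correspond to beta0, beta1, beta2
fit_raw <- lm(y ~ poly(x, degree = 2, raw = TRUE))

print(coef(fit_orth))
print(coef(fit_raw))

# Both parameterizations give the same fitted values
print(all.equal(unname(fitted(fit_orth)), unname(fitted(fit_raw))))

# Predictions at new x values inside the observed range (hypothetical points)
new_points <- data.frame(x = c(1.5, 2.5, 3.5))
print(predict(fit_raw, newdata = new_points))
```

The orthogonal basis is usually the safer default because its columns are uncorrelated, which keeps the estimates numerically stable at higher degrees; the raw basis is easier to read when the goal is to write out the fitted equation.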
How to proceed from Simple to Multiple and Polynomial Regression in R
Regression analysis helps us understand how one or more independent variables relate to a dependent variable. Simple linear regression explores the relationship between two variables. Multiple linear regression extends this to several predictors at once. Finally, polynomial regression adds flexibility by accommodating non-linear relationships. All three can be fitted in R with lm().
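As a rough illustration of that progression, the sketch below fits all three kinds of model with lm(). It assumes the built-in mtcars dataset and the variables mpg, wt, and hp, none of which appear in the text above; they are chosen only because they ship with R.

```r
# mtcars ships with base R; using mpg, wt, and hp is an assumption for illustration
data(mtcars)

# Simple linear regression: one predictor
simple_fit <- lm(mpg ~ wt, data = mtcars)

# Multiple linear regression: several predictors
multiple_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Polynomial regression: quadratic in a single predictor
poly_fit <- lm(mpg ~ poly(wt, degree = 2), data = mtcars)

# Adjusted R-squared for each model, collected into one named vector
adj_r2 <- sapply(
  list(simple = simple_fit, multiple = multiple_fit, polynomial = poly_fit),
  function(m) summary(m)$adj.r.squared
)
print(adj_r2)

# F-test for nested models: does the quadratic term improve on the straight line?
print(anova(simple_fit, poly_fit))
```

Comparing adjusted R-squared and running anova() on nested models is one common way to decide whether the extra terms are worth keeping; it is not the only option, and cross-validation may be preferable when prediction is the goal.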