Partial Least Squares (PLS)
- PLS is primarily a regression method that seeks to establish the relationship between a set of independent variables (X) and a dependent variable (Y).
- It accomplishes this by identifying latent variables (components) in X that not only explain the variance within X but also exhibit covariance with Y.
- The primary objective of PLS is to maximize the covariance between the latent variables extracted from X and Y.
- The PLS algorithm can be summarized in the following steps:
- Calculate the weight vector, w, as the direction of maximum covariance between X and Y.
- Calculate the scores, t, as the linear combination of X using w.
- Calculate the loading vector, p, as the weights for the original variables in X.
- Update X and Y by removing the explained variance.
- Repeat steps 1-4 until the desired number of latent variables has been extracted.
- The PLS model can then be used to predict Y for new observations by simply calculating the linear combination of the latent variables for those observations.
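The steps above can be sketched in numpy as a minimal NIPALS-style loop for a univariate Y (an illustrative sketch assuming centered inputs, not a production implementation; the function name `pls_nipals` is a placeholder):

```python
import numpy as np

def pls_nipals(X, y, n_components):
    """Minimal NIPALS-style PLS1 sketch.

    X : (n, m) predictor matrix, assumed column-centered.
    y : (n,) response vector, assumed centered.
    Returns scores T, weights W, and loadings P.
    """
    X = X.astype(float).copy()
    y = y.astype(float).copy()
    n, m = X.shape
    W = np.zeros((m, n_components))
    P = np.zeros((m, n_components))
    T = np.zeros((n, n_components))
    for k in range(n_components):
        # Step 1: weight vector = direction of maximum covariance with y
        w = X.T @ y
        w /= np.linalg.norm(w)
        # Step 2: scores = linear combination of X using w
        t = X @ w
        # Step 3: loadings = regression of the columns of X on t
        p = X.T @ t / (t @ t)
        # Step 4: deflate X and y by the variance explained by t
        X -= np.outer(t, p)
        y -= t * (t @ y) / (t @ t)
        W[:, k], P[:, k], T[:, k] = w, p, t
    return T, W, P
```

With X-deflation as above, the extracted score vectors come out mutually orthogonal, which is what makes the subsequent regression of Y on T straightforward.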
- Mathematical Explanation of PLS
- Let X be a matrix of predictor variables with dimensions n x m, and let Y be a vector of response variables with dimensions n x 1.
- The PLS algorithm aims to find a set of latent variables, T, that maximize the covariance between X and Y.
- The latent variables are linear combinations of the original predictor variables, and they can be expressed as follows: T = XW, where W is an m x k matrix of weights and k is the number of latent variables.
- The weights are chosen to maximize the covariance between T and Y. For a single weight vector this can be expressed mathematically as follows: w = argmax cov(Xw, Y), subject to w^T w = 1.
- The solution to this optimization problem (for the first latent variable) is given by the following equation: w = X^T Y / ||X^T Y||.
- Once the weights have been calculated, the latent variables can be computed using the following equation: T = XW.
- The PLS model can then be used to predict Y for new observations by simply calculating the linear combination of the latent variables for those observations. This can be expressed mathematically as follows: Y_hat = Tq, where q = (T^T T)^(-1) T^T Y is the vector of regression coefficients of Y on T.
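For a single latent variable and a univariate y, the equations above reduce to a few lines of numpy (a sketch on synthetic data, assuming both X and y are centered):

```python
import numpy as np

# Synthetic centered data: y depends mainly on the first two columns of X
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.0]) + 0.1 * rng.normal(size=50)
X -= X.mean(axis=0)
y -= y.mean()

w = X.T @ y                    # direction of maximum covariance with y
w /= np.linalg.norm(w)         # solution: w = X^T y / ||X^T y||
t = X @ w                      # latent variable (scores): t = Xw
q = (t @ y) / (t @ t)          # regression coefficient of y on t
y_hat = t * q                  # prediction from one latent variable
```

For more latent variables, X and y would be deflated and the same computation repeated, as in the algorithm summary above.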
Partial Least Squares Singular Value Decomposition (PLSSVD)
Partial Least Squares Singular Value Decomposition (PLSSVD) is a statistical technique employed in multivariate analysis and machine learning. This method merges the strengths of Partial Least Squares (PLS) and Singular Value Decomposition (SVD), offering a powerful tool to extract crucial information from high-dimensional data while effectively mitigating issues like multicollinearity and noise. Concretely, PLSSVD computes the singular value decomposition of the cross-covariance matrix X^T Y and uses the resulting singular vectors to project X and Y onto their score spaces.
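The core of PLSSVD can be sketched directly in numpy (an illustrative sketch on random data; real use would apply it to related X and Y blocks):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 6))
Y = rng.normal(size=(30, 3))
Xc = X - X.mean(axis=0)        # center both blocks
Yc = Y - Y.mean(axis=0)

# SVD of the cross-covariance matrix X^T Y
U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)

n_components = 2
x_scores = Xc @ U[:, :n_components]      # latent variables for X
y_scores = Yc @ Vt.T[:, :n_components]   # latent variables for Y
```

By construction, the leading pair of score vectors has the largest covariance achievable with unit-norm weight vectors, and the singular values s measure that covariance.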