Interpreting the Results of Factor Analysis
Once the factor analysis is complete, we can interpret the results by examining the factor loadings. For standardized variables and uncorrelated (orthogonal) factors, each loading is the correlation between an observed variable and an extracted factor. As a rule of thumb, loadings with an absolute value above about 0.4 or 0.5 are considered salient.
R
# View the factor loadings
fa$loadings
Output:
Loadings:
             MR1    MR2    MR3    MR4
Sepal.Length  0.997
Sepal.Width  -0.108  0.757
Petal.Length  0.861 -0.413  0.288
Petal.Width   0.801 -0.317  0.492

                 MR1   MR2   MR3   MR4
SS loadings    2.389 0.844 0.332 0.000
Proportion Var 0.597 0.211 0.083 0.000
Cumulative Var 0.597 0.808 0.891 0.891
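Printing fa$loadings shows the loading of each variable on each factor; loadings with an absolute value below 0.1 are left blank. We can interpret these loadings to identify the underlying factors that explain the correlations among the observed variables. In this example, the first factor (MR1) loads strongly on sepal length (0.997), petal length (0.861), and petal width (0.801). The second factor (MR2) is dominated by sepal width (0.757), with a moderate negative loading for petal length (-0.413). MR3 and MR4 contribute little: the first two factors already account for 0.808 of the variance, and adding the remaining two raises the cumulative proportion only to 0.891.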
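The summary rows at the bottom of the loadings table can be reproduced by hand, which makes clear what they measure. The sketch below retypes the printed loadings into a matrix (the name printed_loadings is ours, and blank cells are assumed to be 0 because the printout hides loadings below 0.1 in absolute value), then computes the sum of squared loadings per factor and flags loadings at or above an illustrative cutoff of 0.4.

```r
# Loadings copied from the printed output above; blank cells are assumed
# to be 0 (the printout suppresses absolute values below 0.1).
printed_loadings <- matrix(
  c( 0.997,  0.000, 0.000, 0,
    -0.108,  0.757, 0.000, 0,
     0.861, -0.413, 0.288, 0,
     0.801, -0.317, 0.492, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    c("MR1", "MR2", "MR3", "MR4")
  )
)

# "SS loadings": sum of squared loadings in each column.
ss_loadings <- colSums(printed_loadings^2)

# "Proportion Var": SS loadings divided by the number of variables.
proportion_var <- ss_loadings / nrow(printed_loadings)

# Flag salient loadings using an illustrative cutoff of 0.4.
salient <- abs(printed_loadings) >= 0.4

print(round(ss_loadings, 3))
print(round(proportion_var, 3))
print(salient)
```

The MR1 and MR2 sums agree with the printed table to three decimals. The MR3 sum comes out slightly below the printed 0.332 because the suppressed small loadings are treated as 0 here.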
Factor Analysis in R programming
Factor Analysis (FA) is a statistical method used to analyze the underlying structure of a set of variables. It is a data reduction method that seeks to explain the correlations among many variables in terms of a smaller number of unobservable (latent) variables, known as factors. In the R programming language, the psych package provides a variety of functions for performing factor analysis.
Factor analysis involves several steps:
- Data preparation: The data are usually standardized (i.e., scaled) to make sure that the variables are on a common scale and have equal weight in the analysis.
- Factor extraction: The factors are identified based on their ability to explain the variance in the data. There are several methods for extracting factors, including principal components analysis (PCA), maximum likelihood estimation (MLE), and minimum residual (MR).
- Factor rotation: The factors are usually rotated to make their interpretation easier. The most common method is varimax rotation, an orthogonal rotation that maximizes the variance of the squared loadings within each factor, so that each loading is pushed toward either zero or a large value.
- Factor interpretation: The final step involves interpreting the factors and their loadings (i.e., the correlation between each variable and each factor). The loadings represent the degree to which each variable is associated with each factor.
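To make the extraction and rotation steps concrete, here is a minimal base-R sketch. It assumes principal components as the extraction method (one of the options listed above) and keeps two components purely for illustration; the variable names are ours. It then applies varimax() from the stats package and checks that an orthogonal rotation redistributes the loadings without changing each variable's communality.

```r
# Extraction (assumed here: principal components on standardized data).
pca <- prcomp(iris[, 1:4], scale. = TRUE)

# Scale the first two eigenvectors by their standard deviations to get loadings.
unrotated <- pca$rotation[, 1:2] %*% diag(pca$sdev[1:2])

# Rotation: varimax from the stats package (an orthogonal rotation).
rotated <- unclass(varimax(unrotated)$loadings)

# Communality = sum of squared loadings per variable; rotation preserves it.
communality_before <- rowSums(unrotated^2)
communality_after <- rowSums(rotated^2)

print(round(unrotated, 3))
print(round(rotated, 3))
print(round(cbind(communality_before, communality_after), 3))
```

Only the split of each variable's variance across the factors changes under rotation, which is why rotation can aid interpretation without altering how well the model fits.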
Loading the Data
First, we need to load the data that we want to analyze. For this example, we will use the iris dataset that comes with R. This dataset contains measurements of the sepal length, sepal width, petal length, and petal width of three different species of iris flowers.
R
# Load the dataset
data(iris)
# View the first few rows of the dataset
head(iris)
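Output:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa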
Data Preparation
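Before conducting factor analysis, we prepare the data by scaling the variables to have a mean of zero and a standard deviation of one. This matters when the analysis is based on the covariance matrix, where variables with larger variances would otherwise dominate; for an analysis based on the correlation matrix, standardizing simply makes the equal weighting explicit. Only the four numeric columns are used, so the Species column is dropped.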
R
# Scale the data
iris_scaled <- scale(iris[, 1:4])
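A quick check that the scaling did what was intended can be sketched as follows. The names col_means, col_sds, and same_correlations are ours; the last check illustrates that standardizing leaves the correlation matrix unchanged.

```r
# Standardize the four numeric columns (the Species factor is dropped).
iris_scaled <- scale(iris[, 1:4])

# Column means should be 0 and standard deviations 1, up to rounding error.
col_means <- colMeans(iris_scaled)
col_sds <- apply(iris_scaled, 2, sd)

# Correlations are scale-free, so they are identical before and after scaling.
same_correlations <- isTRUE(all.equal(
  cor(iris[, 1:4]), cor(iris_scaled), check.attributes = FALSE
))

print(round(col_means, 10))
print(col_sds)
print(same_correlations)
```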
Determining the Number of Factors
The next step is to determine the number of factors to extract from the data. This can be done using a variety of methods, such as the Kaiser criterion, a scree plot, or parallel analysis. The Kaiser criterion retains factors whose eigenvalues are greater than one. Here the model is fit with nfactors = 4, the number of variables, which lets us inspect the variance explained by every candidate factor before deciding how many to keep.
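The Kaiser criterion itself can be sketched in base R. This is a minimal illustration, assuming the eigenvalues are taken from the correlation matrix of the four measurements; the variable names are ours.

```r
# Correlation matrix of the four numeric measurements.
cor_matrix <- cor(iris[, 1:4])

# Eigenvalues of the correlation matrix, returned in decreasing order.
eigenvalues <- eigen(cor_matrix)$values

# Kaiser criterion: count the eigenvalues greater than one.
n_kaiser <- sum(eigenvalues > 1)

print(round(eigenvalues, 3))
print(n_kaiser)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above one marks a factor that accounts for more variance than a single standardized variable.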
R
# Perform factor analysis
library(psych)
fa <- fa(r = iris_scaled, nfactors = 4, rotate = "varimax")
summary(fa)
Output:
Factor analysis with Call: fa(r = iris_scaled, nfactors = 4, rotate = "varimax")

Test of the hypothesis that 4 factors are sufficient.
The degrees of freedom for the model is -4 and the objective function was 0
The number of observations was 150 with Chi Square = 0 with prob < NA

The root mean square of the residuals (RMSA) is 0
The df corrected root mean square of the residuals is NA

Tucker Lewis Index of factoring reliability = 1.009
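The output of the summary() function reports the model call and the overall fit statistics: the test of the hypothesis that the requested number of factors is sufficient, the model degrees of freedom, the chi-square statistic, the root mean square of the residuals, and the Tucker Lewis Index. The loadings and the variance explained by each factor are not part of this summary; they are found in fa$loadings.
This summary shows that the model was fit with 4 factors, as requested by nfactors = 4. With four variables and four factors the model has -4 degrees of freedom, meaning it has more free parameters than the correlation matrix can determine. It therefore reproduces the data perfectly (the objective function and the residuals are 0), and the chi-square probability and the df-corrected residual are reported as NA. These fit statistics cannot tell us whether four factors are needed; a model with fewer factors is required for a meaningful test.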
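The degrees of freedom reported in the summary can be checked with the usual counting formula for an exploratory factor model with p variables and m factors, df = ((p - m)^2 - (p + m)) / 2. The sketch below is a worked calculation only; the helper name fa_model_df is ours and is not a psych function.

```r
# Degrees of freedom of an exploratory factor model with
# p observed variables and m factors.
fa_model_df <- function(p, m) {
  ((p - m)^2 - (p + m)) / 2
}

# Degrees of freedom for 1 to 4 factors with the four iris measurements.
df_by_factors <- sapply(1:4, function(m) fa_model_df(p = 4, m = m))
names(df_by_factors) <- paste0(1:4, " factor(s)")

print(df_by_factors)
```

With four variables, only the one-factor model has positive degrees of freedom, and four factors gives -4, which matches the value in the summary output.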