Validating the Results of Factor Analysis

Finally, validate the results of the factor analysis by checking the assumptions of the technique, such as approximate normality of the variables and linear relationships among them. It is also worth examining the factor structure for different subsets of the data: if the loadings are similar across subsets, the solution can be considered consistent and stable.
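
As a rough illustration of such assumption checks, the sketch below uses only base R on the four iris measurements. The choice of the Shapiro-Wilk test and of Pearson correlations is an assumption made for illustration; the text above does not prescribe specific tests.

```r
# Sketch: rough checks of normality and linearity on the iris measurements
vars <- iris[, 1:4]

# Shapiro-Wilk test per variable; small p-values suggest non-normality
normality_p <- sapply(vars, function(v) shapiro.test(v)$p.value)
print(round(normality_p, 4))

# Pearson correlations summarize the strength of linear relationships;
# pairs(vars) would draw the scatterplots for a visual linearity check
cor_matrix <- cor(vars)
print(round(cor_matrix, 2))
```

Formal tests like these are only a starting point; scatterplots and histograms often say more about whether the assumptions are reasonable.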


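# Examine the factor structure for different subsets of the data
library(psych)  # provides fa()

# Subset 1: flowers with below-average sepal length.
# Note: 4 factors on 4 variables saturates the model, which is why the
# output below reports negative degrees of freedom.
subset1 <- subset(iris[, 1:4],
                  iris$Sepal.Length < mean(iris$Sepal.Length))
fa1 <- fa(subset1, nfactors = 4)
print(fa1)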


Factor Analysis using method =  minres
Call: fa(r = subset1, nfactors = 4)
Standardized loadings (pattern matrix) based upon correlation matrix
               MR1  MR2   MR3 MR4   h2    u2 com
Sepal.Length  0.66 0.61 -0.12   0 0.82 0.178 2.1
Sepal.Width  -0.68 0.61  0.11   0 0.85 0.150 2.0
Petal.Length  1.00 0.00  0.00   0 1.00 0.005 1.0
Petal.Width   0.97 0.01  0.16   0 0.97 0.031 1.1

                       MR1  MR2  MR3  MR4
SS loadings           2.85 0.74 0.05 0.00
Proportion Var        0.71 0.18 0.01 0.00
Cumulative Var        0.71 0.90 0.91 0.91
Proportion Explained  0.78 0.20 0.01 0.00
Cumulative Proportion 0.78 0.99 1.00 1.00

Mean item complexity =  1.5
Test of the hypothesis that 4 factors are sufficient.

The degrees of freedom for the null model are  6  and the objective function was
4.57 with Chi Square of  351.02
The degrees of freedom for the model are -4  and the objective function was  0 

The root mean square of the residuals (RMSR) is  0 
The df corrected root mean square of the residuals is  NA 

The harmonic number of observations is  80 with the empirical chi square  0  with prob <  NA 
The total number of observations was  80  with Likelihood Chi Square =  0  with prob <  NA 

Tucker Lewis Index of factoring reliability =  1.018
Fit based upon off diagonal values = 1
Measures of factor score adequacy             
                                                   MR1  MR2   MR3 MR4
Correlation of (regression) scores with factors   1.00 0.91  0.69   0
Multiple R square of scores with factors          1.00 0.82  0.47   0
Minimum correlation of possible factor scores     0.99 0.64 -0.05  -1


# Subset 2: flowers with average or above-average sepal length
subset2 <- subset(iris[, 1:4],
                  iris$Sepal.Length >= mean(iris$Sepal.Length))
fa2 <- fa(subset2, nfactors = 4)
print(fa2)


Factor Analysis using method =  minres
Call: fa(r = subset2, nfactors = 4)
Standardized loadings (pattern matrix) based upon correlation matrix
              MR1   MR2   MR3 MR4   h2    u2 com
Sepal.Length 0.76 -0.37  0.26   0 0.78 0.222 1.7
Sepal.Width  0.50  0.36  0.34   0 0.49 0.507 2.6
Petal.Length 0.95 -0.23 -0.22   0 1.00 0.005 1.2
Petal.Width  0.82  0.39 -0.20   0 0.86 0.144 1.6

                       MR1  MR2  MR3  MR4
SS loadings           2.39 0.46 0.27 0.00
Proportion Var        0.60 0.12 0.07 0.00
Cumulative Var        0.60 0.71 0.78 0.78
Proportion Explained  0.76 0.15 0.09 0.00
Cumulative Proportion 0.76 0.91 1.00 1.00

Mean item complexity =  1.8
Test of the hypothesis that 4 factors are sufficient.

The degrees of freedom for the null model are  6  and the objective function was
1.97 with Chi Square of  131.96
The degrees of freedom for the model are -4  and the objective function was  0 

The root mean square of the residuals (RMSR) is  0 
The df corrected root mean square of the residuals is  NA 

The harmonic number of observations is  70 with the empirical chi square  0  with prob <  NA 
The total number of observations was  70  with Likelihood Chi Square =  0  with prob <  NA 

Tucker Lewis Index of factoring reliability =  1.05
Fit based upon off diagonal values = 1
Measures of factor score adequacy             
                                                   MR1  MR2  MR3 MR4
Correlation of (regression) scores with factors   0.98 0.86 0.75   0
Multiple R square of scores with factors          0.96 0.75 0.57   0
Minimum correlation of possible factor scores     0.92 0.49 0.14  -1
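
One way to put a number on how similar the two solutions are is Tucker's congruence coefficient, the cosine of the angle between two loading vectors. The sketch below is an illustration rather than part of the original analysis: it copies the MR1 loadings from the two outputs above, and tucker_congruence is a helper name introduced here. The psych package also offers factor.congruence() for fitted objects, which may be more convenient in practice.

```r
# MR1 loadings copied from the two fa() outputs above
# (order: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)
mr1_subset1 <- c(0.66, -0.68, 1.00, 0.97)
mr1_subset2 <- c(0.76,  0.50, 0.95, 0.82)

# Tucker's congruence coefficient: cosine between two loading vectors
# (tucker_congruence is a hypothetical helper defined for this sketch)
tucker_congruence <- function(a, b) {
  sum(a * b) / sqrt(sum(a^2) * sum(b^2))
}

phi <- tucker_congruence(mr1_subset1, mr1_subset2)
print(round(phi, 2))
```

Values close to 1 indicate that the two factors are essentially the same. Here the coefficient comes out at about 0.73, pulled down mainly by Sepal.Width, whose loading changes sign between the subsets (-0.68 versus 0.50), so the first factor is not fully stable across the two halves of the data.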


# display variance explained by each factor


                            MR1       MR2        MR3          MR4
SS loadings           2.8853608 0.5816336 0.09819492 4.000000e-30
Proportion Var        0.7213402 0.1454084 0.02454873 1.000000e-30
Cumulative Var        0.7213402 0.8667486 0.89129733 8.912973e-01
Proportion Explained  0.8093149 0.1631424 0.02754269 1.121960e-30
Cumulative Proportion 0.8093149 0.9724573 1.00000000 1.000000e+00

Factor Analysis Using the factanal() Function

The factanal() function, part of R's built-in stats package, performs maximum-likelihood factor analysis on a data set. Its most commonly used arguments are described below.


factanal(x, factors, rotation, scores, covmat)


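  • x – The data set to be analyzed (a numeric matrix or data frame, or a formula).
  • factors – The number of factors to extract.
  • rotation – The name of the rotation function to apply. "varimax" (the default) and "promax" are available in base R; other rotations such as oblimin require an additional package such as GPArotation.
  • scores – The type of factor scores to compute for each observation: "none" (the default), "regression", or "Bartlett".
  • covmat – A covariance matrix (or a list as returned by cov.wt()) to analyze in place of the raw data x.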
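
To make the arguments more concrete, here is a minimal sketch on the built-in mtcars data. It shows a promax rotation with regression scores, and a second fit that starts from a covariance matrix instead of the raw data. The object names fit_scores and fit_cov are chosen for illustration, and n.obs is an additional factanal() argument, not listed above, that supplies the sample size when only a covariance matrix is given.

```r
# Oblique promax rotation with regression-based factor scores
fit_scores <- factanal(mtcars, factors = 3,
                       rotation = "promax", scores = "regression")
print(head(fit_scores$scores))   # one row of factor scores per car

# Same data analyzed from a covariance matrix instead of raw observations;
# n.obs supplies the sample size needed for the chi-square test
fit_cov <- factanal(covmat = cov(mtcars), factors = 3,
                    n.obs = nrow(mtcars))
print(round(fit_cov$uniquenesses, 3))
```

Factor scores can only be computed when the raw data are supplied through x, so the covmat fit returns loadings and uniquenesses but no scores.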

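The output of the factanal() function provides several pieces of information, including:

  • Uniquenesses: The proportion of variance in each variable that is not accounted for by the factors.
  • Loadings: The weights relating each variable to each factor. For an orthogonal rotation such as varimax, they equal the correlations between the variable and the factor.
  • Communalities: The proportion of variance in each variable that is accounted for by the factors. They are not printed directly, but equal 1 minus the uniqueness.
  • SS loadings: The sum of squared loadings for each factor, which measures the variance explained by that factor. factanal() reports these rather than eigenvalues.
  • Factor Correlations: The correlations between the factors. These are relevant for oblique rotations such as promax; orthogonal rotations such as varimax keep the factors uncorrelated.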
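
These quantities can also be pulled out of the fitted object directly. The following sketch assumes a three-factor varimax fit on mtcars; the variable names (fit, load_mat, and so on) are illustrative, and the communalities are derived as 1 minus the uniquenesses because factanal() does not store them separately.

```r
fit <- factanal(mtcars, factors = 3, rotation = "varimax")

uniq <- fit$uniquenesses             # unexplained variance per variable
communalities <- 1 - uniq            # explained variance per variable
load_mat <- unclass(loadings(fit))   # loadings as a plain numeric matrix
ss_loadings <- colSums(load_mat^2)   # variance explained by each factor

print(round(communalities, 3))
print(round(ss_loadings, 3))
```

For an orthogonal rotation, the row sums of the squared loadings should closely match the communalities, which is a quick sanity check on the fit.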

Here is an example code snippet that demonstrates how to use factanal() function in R:


# factanal() belongs to the stats package, which R loads by default,
# and mtcars is a built-in dataset, so nothing has to be installed.
# Perform factor analysis on the mtcars dataset
factor_analysis <- factanal(mtcars,
                            factors = 3,
                            rotation = "varimax")
# Print the results
print(factor_analysis)


Call:
factanal(x = mtcars, factors = 3, rotation = "varimax")

Uniquenesses:
  mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb 
0.135 0.055 0.090 0.127 0.290 0.060 0.051 0.223 0.208 0.125 0.158 

Loadings:
     Factor1 Factor2 Factor3
mpg   0.643  -0.478  -0.473 
cyl  -0.618   0.703   0.261 
disp -0.719   0.537   0.323 
hp   -0.291   0.725   0.513 
drat  0.804  -0.241         
wt   -0.778   0.248   0.524 
qsec -0.177  -0.946  -0.151 
vs    0.295  -0.805  -0.204 
am    0.880                 
gear  0.908           0.224 
carb  0.114   0.559   0.719 

               Factor1 Factor2 Factor3
SS loadings      4.380   3.520   1.578
Proportion Var   0.398   0.320   0.143
Cumulative Var   0.398   0.718   0.862

Test of the hypothesis that 3 factors are sufficient.
The chi square statistic is 30.53 on 25 degrees of freedom.
The p-value is 0.205

In this example, we use the built-in mtcars data set, which contains information about different car models. Because factanal() belongs to the stats package, no additional package has to be loaded. We call factanal() on mtcars, specifying that we want to extract three factors and use the varimax rotation method, and then print the results. The chi-square test at the end of the output gives a p-value of 0.205, so the hypothesis that three factors are sufficient is not rejected at the 5% level.
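
The test of sufficiency can also be read programmatically, which is handy when comparing models with different numbers of factors. This is a small sketch under the same assumptions as the example above; the names chi_sq, df, and p_value are illustrative.

```r
fit <- factanal(mtcars, factors = 3, rotation = "varimax")

chi_sq  <- unname(fit$STATISTIC)  # likelihood-ratio chi-square statistic
df      <- fit$dof                # degrees of freedom of the factor model
p_value <- unname(fit$PVAL)       # p-value of the sufficiency test

cat("chi-square =", round(chi_sq, 2), "on", df, "df, p =",
    round(p_value, 3), "\n")
```

A large p-value means the data give no evidence that more factors are needed; it does not prove that the chosen number of factors is the right one.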


In conclusion, factor analysis is a useful statistical technique for identifying underlying factors or latent variables that explain the correlations among a set of observed variables. In R, the psych package provides a range of functions for conducting factor analysis, and the built-in factanal() function covers maximum-likelihood factor analysis; both can be used to extract meaningful insights from complex datasets.

