Creating a correlation matrix

We will use the built-in USArrests dataset as a worked example. We load the data with the data() function and compute the correlation coefficients with the cor() function. The round() function rounds the coefficients to a chosen number of decimal places (one, in this example). Finally, the cor_pmat() function from the ggcorrplot package computes the matrix of correlation p-values.

Syntax:

correlation_matrix <- round(cor(data), 1)

Parameters:

  • correlation_matrix : Variable that stores the rounded correlation matrix to be visualized.
  • data : The dataset (a numeric data frame or matrix) whose correlations are computed.

Syntax:

corrp.mat <- cor_pmat(data)

Parameters:

  • corrp.mat : Variable that stores the matrix of correlation p-values.
  • data : The dataset for which the correlation p-values are computed.
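
To make the idea of a p-value matrix concrete, the following is a minimal sketch that builds one with base R only, by running cor.test() on every pair of columns. The helper name pairwise_cor_pvalues() is hypothetical and is not part of ggcorrplot. The sketch assumes Pearson tests (the cor.test() default) and sets the diagonal to 0 as a convention; it illustrates the concept and is not a description of how cor_pmat() is implemented.

```r
# Sketch: pairwise correlation p-values using base R only.
# pairwise_cor_pvalues() is a hypothetical helper, not a ggcorrplot function.
pairwise_cor_pvalues <- function(data) {
  data <- as.matrix(data)
  n_vars <- ncol(data)

  # Start with zeros; the diagonal stays 0 by convention in this sketch
  p_values <- matrix(0, nrow = n_vars, ncol = n_vars,
                     dimnames = list(colnames(data), colnames(data)))

  # Test each pair of columns once and fill both triangles
  for (i in seq_len(n_vars - 1)) {
    for (j in (i + 1):n_vars) {
      test <- cor.test(data[, i], data[, j])  # Pearson test by default
      p_values[i, j] <- test$p.value
      p_values[j, i] <- test$p.value
    }
  }
  p_values
}

data(USArrests)
p_values <- pairwise_cor_pvalues(USArrests)
print(round(p_values, 4))
```

A small p-value for a pair of variables suggests that their correlation is unlikely to be zero, which is the information that significance markers on a correlation plot are based on.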

Example: Creating a correlation matrix

R

# Installing (needed only once) and loading the ggcorrplot package
install.packages("ggcorrplot")
library(ggcorrplot)

# Loading the built-in USArrests dataset
data(USArrests)

# Computing the correlation matrix, rounded to one decimal place
correlation_matrix <- round(cor(USArrests), 1)

# Displaying the correlation matrix
head(correlation_matrix[, 1:4])

# Computing the matrix of correlation p-values
corrp.mat <- cor_pmat(USArrests)

# Displaying the p-value matrix
head(corrp.mat[, 1:4])


Output :

Visualization of a correlation matrix using ggplot2 in R

In this article, we will discuss how to visualize a correlation matrix using the ggplot2 package in the R programming language.

To do this, we will install the ggcorrplot package, which makes it easy to visualize a correlation matrix. The package also provides a function for computing a matrix of correlation p-values. The cor_pmat() function computes the matrix of correlation p-values, and the ggcorrplot() function displays the correlation matrix using ggplot2.

Syntax:

cor_pmat(x, ...)

Where x is a data frame or a matrix.

Syntax:

ggcorrplot(corr, method = c("square", "circle"), type = c("full", "lower", "upper"), title = "", ggtheme = ggplot2::theme_minimal, show.legend = TRUE, legend.title = "Corr", show.diag = FALSE, colors = c("blue", "white", "red"), outline.color = "gray", hc.order = FALSE, hc.method = "complete", lab = FALSE, lab_col = "black", p.mat = NULL, ...)
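
As an illustration of how these arguments fit together, here is a minimal sketch of a ggcorrplot() call on the USArrests data. It assumes the ggcorrplot package is already installed. The variable name corr_plot and the particular argument values are choices made for this example only; other combinations from the syntax above work the same way.

```r
# Sketch: one possible ggcorrplot() call; argument values are illustrative.
# Assumes the ggcorrplot package is already installed.
library(ggcorrplot)

data(USArrests)

# Correlation matrix (rounded) and matrix of correlation p-values
correlation_matrix <- round(cor(USArrests), 1)
corrp.mat <- cor_pmat(USArrests)

corr_plot <- ggcorrplot(
  correlation_matrix,
  method = "circle",        # draw circles instead of squares
  type = "lower",           # show only the lower triangle
  hc.order = TRUE,          # reorder variables by hierarchical clustering
  lab = TRUE,               # print the correlation coefficients
  p.mat = corrp.mat,        # p-values used to flag non-significant pairs
  outline.color = "white"
)

# ggcorrplot() returns a ggplot object, so it can be printed or extended
print(corr_plot)
```

Because the result is a ggplot object, further ggplot2 layers or themes can be added to it with the + operator.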

Getting Started

We will first install the ggcorrplot and ggplot2 packages with install.packages() and load them with library(). We need a dataset from which to construct our correlation matrix and then visualize it. We will create the correlation matrix with the cor() function, which computes the correlation coefficients. After computing the correlation matrix, we will compute the matrix of correlation p-values using the cor_pmat() function. Next, we will visualize the correlation matrix with the ggcorrplot() function, which uses ggplot2....

Visualizing correlation matrix

...

Reordering the correlation matrix

Now that we have the correlation matrix and the matrix of correlation p-values, we can visualize the correlation matrix. The first visualization uses the ggcorrplot() function to plot the correlation matrix with the square and circle methods....

Introducing correlation coefficient

...

Adding significance level

...

Leaving blank on no significance level

We will now visualize our correlation matrix after reordering it with hierarchical clustering. We will do this by calling the ggcorrplot() function with the correlation matrix, hc.order, and outline.color as arguments....
