Scaling Variables Parallel Coordinates chart in R
Parallel coordinates are a way to analyse and visualise high-dimensional data. To display a set of points in an n-dimensional space, a backdrop of n parallel axes is drawn, usually vertical and evenly spaced. Each point is represented by a polyline with one vertex on each axis; the position of the vertex on the ith axis corresponds to the ith coordinate of the point.
This representation resembles a time series plot, except that the axes do not correspond to points in time, so the data have no natural order. As a result, several different axis orderings may be of interest.
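The mapping from points to polylines can be sketched with base R graphics alone. This is a minimal illustration, not part of GGally: the three 4-dimensional points and the axis labels x1 to x4 are made up for the example.

```r
# Three hypothetical points in a 4-dimensional space (one row per point)
points <- rbind(
  a = c(1, 5, 2, 4),
  b = c(3, 1, 4, 2),
  c = c(2, 3, 3, 1)
)
n_axes <- ncol(points)

# matplot() draws one line per column, so transpose: one column per point
matplot(x = seq_len(n_axes), y = t(points),
        type = "b", pch = 19, lty = 1,
        xaxt = "n", xlab = "axis (dimension)", ylab = "coordinate value")

# Label the axes and draw the n parallel vertical lines
axis(1, at = seq_len(n_axes), labels = paste0("x", seq_len(n_axes)))
abline(v = seq_len(n_axes), col = "grey70")
```

Each row of the matrix becomes one polyline, and its ith vertex sits on the ith vertical line at the height of the ith coordinate.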
Packages used:
- GGally: It extends ggplot2 by adding several functions that reduce the complexity of combining geoms with transformed data. It can be installed with the following command:
install.packages("GGally")
- hrbrthemes: It is a compilation of extra ‘ggplot2’ themes for axis and plot.
install.packages("hrbrthemes")
To draw a parallel coordinates chart, we will use the ggparcoord() function from GGally.
Syntax: ggparcoord(data, columns = 1:ncol(data), groupColumn = NULL, scale = "std", scaleSummary = "mean", centerObsID = 1, missing = "exclude", order = columns, showPoints = FALSE, splineFactor = FALSE, alphaLines = 1, boxplot = FALSE, shadeBox = NULL, mapping = NULL, title = "")
Parameters:
- data: Dataset
- columns: Vector of variables (either names or indices) to be axes in the plot
- groupColumn: Single variable to group (color) by
- scale: Method used to scale the variables: "std" (the default), "robust", "uniminmax", "globalminmax", "center" or "centerObs"
- scaleSummary: if scale = "center", the summary statistic used to centre each variable univariately
- centerObsID: if scale = "centerObs", the row number of the case on which the plot should be centred univariately
- missing: Method used to handle missing values ("exclude" by default; see the Details section of ?ggparcoord)
- order: Method used to order the axes (by default, the order given in columns; see the Details section of ?ggparcoord)
- showPoints: logical value indicating whether points should be plotted
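To make the scale options more concrete, the sketch below reproduces three of them by hand in base R. It follows the descriptions in the GGally documentation rather than GGally's internal code, and the variable names (x, std, uniminmax, global_range) are chosen for this example only.

```r
x <- iris[, 1:3]  # the three numeric columns plotted in the examples

# "std": subtract each column's mean and divide by its standard deviation
std <- as.data.frame(lapply(x, function(v) (v - mean(v)) / sd(v)))

# "uniminmax": rescale each column separately onto [0, 1]
uniminmax <- as.data.frame(
  lapply(x, function(v) (v - min(v)) / (max(v) - min(v)))
)

# "globalminmax": values are left unchanged; the plot range simply runs
# from the smallest to the largest value over all selected columns
global_range <- range(as.matrix(x))

# Inspect the effect of each method
sapply(std, sd)           # standard deviation of every column after "std"
sapply(uniminmax, range)  # per-column minimum and maximum after "uniminmax"
global_range
```

With "std" and "uniminmax" every axis ends up on a comparable scale, whereas "globalminmax" keeps the original units, so variables with small values are squeezed into a narrow band of the plot.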
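Example 1: Without Scaling
ggparcoord() standardises the variables by default (scale = "std"), so leaving out the scale argument does not produce an unscaled plot. To plot the raw values, set scale = "globalminmax": no scaling is applied, and every axis runs from the global minimum to the global maximum of the selected columns.
R
# Libraries
library(GGally)      # provides ggparcoord()
library(ggplot2)     # provides theme() and element_text()
library(viridis)     # provides the color palette; install.packages("viridis") if missing
library(hrbrthemes)  # provides themes for axis and plot

# default data in R
data <- iris

# glimpse of the data
head(data)

# plotting the Parallel Coordinates
ggparcoord(data,                    # data
           columns = 1:3,           # plotting first 3 columns
           groupColumn = 5,         # color the lines by Species
           order = "anyClass",
           scale = "globalminmax",  # no scaling
           showPoints = TRUE,
           alphaLines = .4) +       # transparency of the lines
  theme(plot.title = element_text(size = 10))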
Output:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
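Example 2: With MinMax Scaling
Here we use min-max scaling with scale = "uniminmax", which rescales each variable separately so that its minimum is 0 and its maximum is 1. By contrast, scale = "globalminmax" applies no scaling and only sets a common axis range.
R
# Libraries
library(GGally)      # provides ggparcoord()
library(ggplot2)     # provides theme() and element_text()
library(viridis)     # provides the color palette; install.packages("viridis") if missing
library(hrbrthemes)  # provides themes for axis and plot

# default data in R
data <- iris

# glimpse of the data
head(data)

# plotting the Parallel Coordinates
ggparcoord(data,                    # data
           columns = 1:3,           # plotting first 3 columns
           groupColumn = 5,         # color the lines by Species
           order = "anyClass",
           scale = "uniminmax",     # each variable rescaled to [0, 1]
           showPoints = TRUE,
           alphaLines = .4) +       # transparency of the lines
  theme(plot.title = element_text(size = 10))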
Output:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
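Example 3: Scaling with Standardisation
Here we standardise each variable with scale = "std": the mean is subtracted and the result is divided by the standard deviation. This is also what ggparcoord() does when scale is omitted.
R
# Libraries
library(GGally)      # provides ggparcoord()
library(ggplot2)     # provides theme() and element_text()
library(viridis)     # provides the color palette; install.packages("viridis") if missing
library(hrbrthemes)  # provides themes for axis and plot

# default data in R
data <- iris

# glimpse of the data
head(data)

# plotting the Parallel Coordinates
ggparcoord(data,                    # data
           columns = 1:3,           # plotting first 3 columns
           groupColumn = 5,         # color the lines by Species
           order = "anyClass",
           scale = "std",           # subtract mean, divide by standard deviation
           showPoints = TRUE,
           alphaLines = .4) +       # transparency of the lines
  theme(plot.title = element_text(size = 10))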
Output:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
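Because the examples above differ only in the value passed to scale, the charts can also be generated in a loop for side-by-side comparison. The following is a sketch under the assumption that GGally is installed; the names scales and plots are illustrative and not part of the package.

```r
library(GGally)  # provides ggparcoord()

# Scaling methods to compare (all are documented values of `scale`)
scales <- c("globalminmax", "uniminmax", "std")

# Build one chart per scaling method, using the method as the plot title
plots <- lapply(scales, function(s) {
  ggparcoord(iris,
             columns = 1:3,
             groupColumn = "Species",  # group by column name instead of index
             scale = s,
             showPoints = TRUE,
             alphaLines = 0.4,
             title = paste("scale =", s))
})
names(plots) <- scales

# Draw a single chart; repeat for the other names to compare them
print(plots[["std"]])
```

Keeping the charts in a named list makes it easy to print them one after another or to hand them to a layout helper of your choice.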