How to Normalize Data in R?
In this article, we will discuss how to normalize data in the R programming language.
What is Normalization?
Normalization is a common pre-processing step. It scales the range of the data up or down before the data is used in later stages, and it plays an important role in fields such as soft computing and cloud computing. Common normalization techniques include Min-Max normalization, Z-score normalization, and decimal scaling normalization.
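Decimal scaling is named above but not demonstrated in the methods that follow. As a minimal sketch in base R, assuming the usual definition (divide every value by the smallest power of 10 that brings all absolute values below 1), it could look like this. The helper name decimal_scale is hypothetical and is not part of base R.

```r
# Hypothetical helper: decimal scaling normalization.
# Divides x by 10^j, where j is the smallest integer such that
# max(abs(x)) / 10^j < 1.
decimal_scale <- function(x) {
  max_abs <- max(abs(x), na.rm = TRUE)
  if (max_abs == 0) return(x)          # all zeros: nothing to scale
  j <- floor(log10(max_abs)) + 1
  x / 10^j
}

gfg <- c(244, 753, 596, 645, 874, 141, 639, 465, 999, 654)

# The largest value is 999, so j = 3 and every value is divided by 1000
gfg_decimal <- decimal_scale(gfg)
gfg_decimal
```

The sign of each value is preserved, so data with negative values ends up in the open interval from -1 to 1.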
What is Data Normalization?
Data transformation operations, such as normalization and aggregation, are data preprocessing procedures that contribute to the success of the data extraction process.
Data normalization rescales numeric columns to a common scale, so that columns measured in different units or ranges can be compared. It is generally considered part of preparing clean data.
Method 1: Normalize data with log transformation in base R
In this approach, call the built-in log() function and pass the numeric data as its argument. Each value is replaced by its logarithm, which compresses large values. Note that a log transformation does not map the data to a fixed range, and for real numbers it is defined only for positive values.
The log() function computes logarithms; by default, it computes natural logarithms.
Syntax:
log(x)
Parameters:
- x: a numeric or complex vector.
Example: Normalize data
R
# Create data
gfg <- c(244, 753, 596, 645, 874, 141, 639, 465, 999, 654)

# Normalize data with a log transformation
gfg1 <- log(gfg)
gfg1
Output:
[1] 5.497168 6.624065 6.390241 6.469250 6.773080 4.948760 6.459904 6.142037 6.906755
[10] 6.483107
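The example above uses the default natural logarithm on strictly positive values. As a rough sketch of some related base R options, the following shows a different base, a zero-safe variant, and how to undo the transformation. The vector with_zero is made up for illustration.

```r
gfg <- c(244, 753, 596, 645, 874, 141, 639, 465, 999, 654)

# Natural log (the default) and base-10 log
log_natural <- log(gfg)
log_base10  <- log(gfg, base = 10)   # equivalent to log10(gfg)

# log(0) is -Inf, so data containing zeros is often transformed
# with log1p(), which computes log(1 + x)
with_zero <- c(0, 9, 99)             # illustrative data, not from the article
log_zero_safe <- log1p(with_zero)

# exp() reverses the natural-log transformation
gfg_restored <- exp(log_natural)
```

Which variant is appropriate depends on the data; log1p() is only one common way to handle zeros.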
Method 2: Normalize Data with Standard Scaling in R
In this method, call the built-in scale() function and pass the data to be scaled. By default, scale() subtracts the mean from each value and divides by the standard deviation, so the result has mean 0 and standard deviation 1. The values are not restricted to a fixed range such as -1 to 1, as the output below shows.
scale() is a generic function whose default method centers and/or scales the columns of a numeric matrix.
Syntax:
scale(x)
Parameters:
- x: a numeric matrix, or an object that can be coerced to one, such as a numeric vector or data frame.
Example: Normalize data
R
# Create data
gfg <- c(244, 753, 596, 645, 874, 141, 639, 465, 999, 654)

# Normalize data with standard scaling
gfg <- as.data.frame(scale(gfg))
gfg
Output:
V1
1 -1.36039519
2 0.57921588
3 -0.01905315
4 0.16766775
5 1.04030220
6 -1.75289016
7 0.14480397
8 -0.51824578
9 1.51663105
10 0.20196343
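To check what scale() actually did, one can inspect the result directly. The sketch below confirms that the scaled values have mean 0 and standard deviation 1, and reads the centering and scaling values that scale() stores as attributes on its result.

```r
gfg <- c(244, 753, 596, 645, 874, 141, 639, 465, 999, 654)

scaled <- scale(gfg)            # a 10 x 1 matrix
z <- as.vector(scaled)          # drop the matrix structure

mean(z)                         # approximately 0
sd(z)                           # 1

# scale() records the values it used for centering and scaling
center_used <- attr(scaled, "scaled:center")   # the mean of gfg
scale_used  <- attr(scaled, "scaled:scale")    # the standard deviation of gfg

# The result is not bounded to a fixed interval
range(z)
```

The stored attributes are useful when the same centering and scaling need to be applied to new data later.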
Method 3: Normalize Data using Min-Max Scaling
In this method, first install and load the caret package. Then call the preProcess() function with method = "range" to estimate the transformation, and call the predict() function on the result to apply it. This rescales the given data to the range 0 to 1.
The preProcess() function estimates a transformation from the training data; the transformation can then be applied to any data set with the same variables.
Syntax:
preProcess(x,method)
Parameters:
- x: a matrix or data frame.
- method: a character vector specifying the type of processing.
Example: Normalize data
R
library(caret)

# Create data
gfg <- c(244, 753, 596, 645, 874, 141, 639, 465, 999, 654)

# Normalize data with min-max scaling
ss <- preProcess(as.data.frame(gfg), method = "range")
gfg <- predict(ss, as.data.frame(gfg))
gfg
Output:
gfg
1 0.1200466
2 0.7132867
3 0.5303030
4 0.5874126
5 0.8543124
6 0.0000000
7 0.5804196
8 0.3776224
9 1.0000000
10 0.5979021
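The same 0 to 1 rescaling can also be written by hand in base R, which avoids the caret dependency for simple cases. This is a minimal sketch using the standard min-max formula (x - min) / (max - min); the helper name min_max is hypothetical.

```r
# Hypothetical helper: min-max scaling to the range [0, 1]
min_max <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

gfg <- c(244, 753, 596, 645, 874, 141, 639, 465, 999, 654)

gfg_minmax <- min_max(gfg)
gfg_minmax

# The smallest value (141) maps to 0 and the largest (999) maps to 1
range(gfg_minmax)
```

Note that this sketch divides by zero if all values are equal, and unlike preProcess() it does not store the minimum and maximum for reuse on new data.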
Method 4: Normalize Data using Z-Score Standardization
In statistics, standardizing a variable means computing its z-scores. Standardization puts variables measured on different scales onto a common scale so that they can be compared. To standardize a vector, subtract its mean from each value and divide the result by the vector’s standard deviation. This gives the same values as scale() in Method 2.
R
# Input vector
gfg <- c(244, 753, 596, 645, 874, 141, 639, 465, 999, 654)

# Z-score standardization
gfg_standardized <- (gfg - mean(gfg)) / sd(gfg)

# View the standardized vector
print(gfg_standardized)
Output:
[1] -1.36039519 0.57921588 -0.01905315 0.16766775 1.04030220 -1.75289016
[7] 0.14480397 -0.51824578 1.51663105 0.20196343
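Standardization is most useful when several variables with different units need to be compared. As a sketch, the same formula can be applied to every column of a data frame. The data frame df and the helper z_score below are made up for illustration.

```r
# Hypothetical helper: z-score standardization of a numeric vector
z_score <- function(x) {
  (x - mean(x)) / sd(x)
}

# Illustrative data with two columns in different units
df <- data.frame(
  height_cm = c(150, 160, 170, 180, 190),
  weight_kg = c(50, 60, 65, 80, 95)
)

# Apply the helper to each column and rebuild the data frame
df_std <- as.data.frame(lapply(df, z_score))
df_std

# After standardization, each column has mean 0 and standard deviation 1
colMeans(df_std)
sapply(df_std, sd)
```

Because both columns are now on the same scale, a value of 1 means one standard deviation above that column's mean, regardless of the original unit.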