Normalization (Or Min-Max scaling) of data in Excel
It is the process of scaling data in such a way that all data points lie in a range of 0 to 1. Thus, this technique, makes it possible to bring all data points to a common scale. The mathematical formula for normalization is given as:
X^{'} = \frac{X - X_{min}}{X_{max}-X_{min}}
where,
- X is the data point,
- Xmax and Xmin are the maximum and minimum values in the group of records respectively.
The process of normalization is generally used when the distribution of data does not follow the Gaussian distribution.
Example:
Let’s have a look at one example to see how can we perform normalization on a sample dataset. Suppose, we have a record of the height of 10 students inside a class as shown below:
Height (in cm) |
---|
152 |
155 |
168 |
175 |
153 |
162 |
173 |
166 |
158 |
156 |
Step 1: Calculate the minimum value in the distribution. It can be calculated using the MIN() function. The minimum value comes out to be 152 which is stored in the B14 cell.
Step 2: Calculate the maximum value in the distribution. It can be calculated using the MAX() function. The maximum value comes out to be 175 which is stored in the B15 cell.
Step 3: Find the difference between the maximum and minimum values. Their difference comes out to be 175 – 152 = 23 which is stored in the B16 cell.
Step 4: For the first data stored in the A2 cell, we will calculate the normalized value as shown in the below video.
Step 5: We can manually calculate all values one by one for each data record or we can directly get values for all the other cells using the auto-fill feature of Excel. For this, go to the right corner of the B2 cell until a (+) symbol appears, and then drag the cursor to the bottom to auto-populate values inside all the cells.
Note: While calculating the first normalized value in the B2 cell, it should be made sure that the reference address for the B14 and B16 cells should be locked using Fn + F4 button otherwise an error will be thrown.
If we have a close look at the results, we can notice all the values lies in the range 0 to 1.
How to Normalize Data in Excel?
The term “normalization” is a popular buzzword among professionals in fields like Machine Learning, Data Science, and Statistics. It refers to the process of scaling down values to fit within a specific range. The term is often misunderstood and is sometimes used interchangeably with “standardization,” which is another statistical concept.
Here, we are going to demystify both of these terms and later we will read how we can implement these techniques on a sample dataset in Excel.
Contact Us