Data masking can be done using the following techniques:
Substitution: Substitution is considered one of the most efficient and reliable masking techniques. Any sensitive value that needs to be protected is replaced with a fake yet realistic-looking value. Only a person with authorized access to the system can see the real values behind the masked ones.
Pros: Makes the data look as realistic as possible.
Cons: Not applicable when dealing with large amounts of unrelated data.
Before Substitution:

| Participant Name | Problem Type | Score |
|---|---|---|
| Alena | Hard | 45.33 |
| Rory | Hard | 33.21 |
| Miguel | Easy | 20 |
| Samara | Medium | 37.2 |

After Substitution:

| Participant Name | Problem Type | Score |
|---|---|---|
| Alena | Hard | 30.22 |
| Rory | Hard | 40.9 |
| Miguel | Easy | 50 |
| Samara | Medium | 46.24 |

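As a rough illustration of substitution, the sketch below replaces each score with a random but plausible value. The function name `substitute_scores`, the 20 to 50 range and the `seed` parameter are assumptions made for this example; a real masking tool would normally draw substitutes from a curated lookup set.

```python
import random

# Rows copied from the "Before Substitution" table.
records = [
    {"name": "Alena", "problem_type": "Hard", "score": 45.33},
    {"name": "Rory", "problem_type": "Hard", "score": 33.21},
    {"name": "Miguel", "problem_type": "Easy", "score": 20.0},
    {"name": "Samara", "problem_type": "Medium", "score": 37.2},
]


def substitute_scores(rows, low=20.0, high=50.0, seed=None):
    """Return copies of rows whose score is a fake, realistic-looking value.

    The range [low, high] is an assumed "plausible" range for this data.
    """
    rng = random.Random(seed)
    masked = []
    for row in rows:
        fake_score = round(rng.uniform(low, high), 2)
        masked.append({**row, "score": fake_score})
    return masked


masked_records = substitute_scores(records, seed=42)
for row in masked_records:
    print(row["name"], row["problem_type"], row["score"])
```

The original rows are left untouched, so only someone with access to `records` can see the real scores.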
Averaging: This method applies to numeric data. Instead of showing individual values, you replace every cell in the column with the average of all the values in that column. For example, if you have student records and you don't want students to see each other's marks, you can replace every mark with the average of all the students' marks. In the table below, each score is replaced by 41.84, the average of 30.22, 40.9, 50 and 46.24.

| Participant Name | Problem Type | Score |
|---|---|---|
| Alena | Hard | 41.84 |
| Rory | Hard | 41.84 |
| Miguel | Easy | 41.84 |
| Samara | Medium | 41.84 |

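A minimal sketch of averaging, assuming the rows are held as Python dictionaries (the helper name `average_column` is hypothetical):

```python
from statistics import mean

# Scores 30.22, 40.9, 50 and 46.24, as used elsewhere in this article.
records = [
    {"name": "Alena", "problem_type": "Hard", "score": 30.22},
    {"name": "Rory", "problem_type": "Hard", "score": 40.9},
    {"name": "Miguel", "problem_type": "Easy", "score": 50.0},
    {"name": "Samara", "problem_type": "Medium", "score": 46.24},
]


def average_column(rows, column):
    """Return copies of rows with every value in `column` set to the column mean."""
    column_mean = round(mean(row[column] for row in rows), 2)
    return [{**row, column: column_mean} for row in rows]


averaged_records = average_column(records, "score")
for row in averaged_records:
    print(row["name"], row["problem_type"], row["score"])  # every score is 41.84
```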
Shuffling: Shuffling is similar to averaging in that it works on a whole column, but there is a difference that sets them apart. Instead of replacing all the values in the column, you simply move the existing values around between rows. Nobody can then tell which value belongs to which record, because every value sits in a different location.
Pros: Deals with large amounts of data efficiently while keeping the data as realistic as possible.
Cons: Can be undone easily if the data set is relatively small.
Before Shuffling:

| Participant Name | Problem Type | Score |
|---|---|---|
| Alena | Hard | 30.22 |
| Rory | Hard | 40.9 |
| Miguel | Easy | 50 |
| Samara | Medium | 46.24 |

After Shuffling:

| Participant Name | Problem Type | Score |
|---|---|---|
| Alena | Hard | 50 |
| Rory | Hard | 46.24 |
| Miguel | Easy | 30.22 |
| Samara | Medium | 40.9 |

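Shuffling could be sketched as follows. The helper `shuffle_column` and its `seed` parameter are assumptions for illustration; note that a random shuffle may leave some values in their original rows, which is one reason small data sets are easy to reverse.

```python
import random

records = [
    {"name": "Alena", "problem_type": "Hard", "score": 30.22},
    {"name": "Rory", "problem_type": "Hard", "score": 40.9},
    {"name": "Miguel", "problem_type": "Easy", "score": 50.0},
    {"name": "Samara", "problem_type": "Medium", "score": 46.24},
]


def shuffle_column(rows, column, seed=None):
    """Return copies of rows with the values of `column` randomly permuted."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)  # in-place permutation of the copied values
    return [{**row, column: value} for row, value in zip(rows, values)]


shuffled_records = shuffle_column(records, "score", seed=7)
for row in shuffled_records:
    print(row["name"], row["problem_type"], row["score"])
```

The column still contains exactly the same set of values, so aggregate statistics such as the average are preserved.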
Encryption: Encryption is a very common concept in cyber security and cryptography. It converts the sensitive data into an unreadable form, which ensures that no one can tell what the data is or even what type of data is being represented. Only personnel who have access to the encryption key are able to see the data.
Pros: Masks the data effectively.
Cons: Anyone who obtains the encryption key can easily access the data. In addition, someone skilled in cryptography may be able to decrypt the data with enough effort.
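To make the role of the key concrete, here is a toy sketch that masks one value with an XOR one-time pad. This cipher is swapped in purely for illustration because it needs only the standard library; it is not what a production system would use, where a vetted algorithm such as AES from an established library would be the usual choice. The `encrypt` and `decrypt` helpers are hypothetical names.

```python
import secrets


def encrypt(plaintext: str, key: bytes) -> bytes:
    """XOR each byte of the UTF-8 plaintext with the matching key byte."""
    data = plaintext.encode("utf-8")
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(b ^ k for b, k in zip(data, key))


def decrypt(ciphertext: bytes, key: bytes) -> str:
    """Reverse the XOR with the same key to recover the plaintext."""
    return bytes(b ^ k for b, k in zip(ciphertext, key)).decode("utf-8")


score = "45.33"
# A fresh random key as long as the value; it must never be reused.
key = secrets.token_bytes(len(score.encode("utf-8")))

ciphertext = encrypt(score, key)
print(ciphertext.hex())          # unreadable without the key
print(decrypt(ciphertext, key))  # 45.33
```

Whoever holds `key` can recover the score, which is exactly the weakness listed under Cons.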
Nulling out or deletion: Nulling out is exactly what the name suggests: you delete the values in a column by replacing them with NULL values. This is a very effective way to avoid showing any sensitive information in a test environment.
Pros: Very useful in situations where the data is not essential.
Cons: Not applicable in test environments that need the nulled values, because the column no longer carries any usable data.

| Participant Name | Problem Type | Score |
|---|---|---|
| Alena | Hard | NULL |
| Rory | Hard | NULL |
| Miguel | Easy | NULL |
| Samara | Medium | NULL |

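In Python, nulling out might look like the sketch below, where `None` plays the role of NULL (the helper name `null_out` is an assumption):

```python
records = [
    {"name": "Alena", "problem_type": "Hard", "score": 45.33},
    {"name": "Rory", "problem_type": "Hard", "score": 33.21},
    {"name": "Miguel", "problem_type": "Easy", "score": 20.0},
    {"name": "Samara", "problem_type": "Medium", "score": 37.2},
]


def null_out(rows, column):
    """Return copies of rows with `column` set to None (the NULL equivalent)."""
    return [{**row, column: None} for row in rows]


nulled_records = null_out(records, "score")
for row in nulled_records:
    print(row["name"], row["problem_type"], row["score"])
```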
Redaction Method: In this method, you replace the sensitive information with the same code or generic value across the entire column.
Pros: It is difficult to work out what the original data was, which makes the data more secure.
Cons: This method should only be used when the values are not needed for development or QA purposes.

| Participant Name | Problem Type | Score |
|---|---|---|
| Alena | Hard | XXXXXXXXXX |
| Rory | Hard | XXXXXXXXXX |
| Miguel | Easy | XXXXXXXXXX |
| Samara | Medium | XXXXXXXXXX |

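Redaction can be sketched in the same style. The constant `REDACTION_TOKEN` and the helper `redact` are hypothetical names; the token matches the placeholder shown in the table.

```python
REDACTION_TOKEN = "XXXXXXXXXX"  # the generic value used for every row

records = [
    {"name": "Alena", "problem_type": "Hard", "score": 45.33},
    {"name": "Rory", "problem_type": "Hard", "score": 33.21},
    {"name": "Miguel", "problem_type": "Easy", "score": 20.0},
    {"name": "Samara", "problem_type": "Medium", "score": 37.2},
]


def redact(rows, column, token=REDACTION_TOKEN):
    """Return copies of rows with `column` replaced by one fixed token."""
    return [{**row, column: token} for row in rows]


redacted_records = redact(records, "score")
for row in redacted_records:
    print(row["name"], row["problem_type"], row["score"])
```

Because the token has a fixed length and content, it reveals neither the magnitude nor the format of the original value.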
Date Aging: If your data set contains dates that you don't want to reveal, you can shift them backward or forward from their actual values. For example, a date of 20-8-21 shifted 200 days back becomes 01-02-21. The same can be done with any kind of numeric data. Make sure that every value in a column is shifted by the same fixed amount or by the same algorithm.
Pros: The algorithm is easy to remember and masks the information effectively.
Cons: Only appropriate for dates and numeric data.
Original Data Set:

| Participant Name | Problem Type | Score |
|---|---|---|
| Alena | Hard | 30.22 |
| Rory | Hard | 40.9 |
| Miguel | Easy | 50 |
| Samara | Medium | 46.24 |

Masked data set after adding 45 to every element of the Score column:

| Participant Name | Problem Type | Score |
|---|---|---|
| Alena | Hard | 75.22 |
| Rory | Hard | 85.9 |
| Miguel | Easy | 95 |
| Samara | Medium | 91.24 |

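Both forms of aging, shifting a date and offsetting a numeric column, can be sketched with the standard library. The helper names `age_date` and `offset_column` are assumptions; the 45-point offset comes from the example above, and the date check uses 20 August 2021, which lands on 1 February 2021 when moved back 200 days.

```python
from datetime import date, timedelta

records = [
    {"name": "Alena", "problem_type": "Hard", "score": 30.22},
    {"name": "Rory", "problem_type": "Hard", "score": 40.9},
    {"name": "Miguel", "problem_type": "Easy", "score": 50.0},
    {"name": "Samara", "problem_type": "Medium", "score": 46.24},
]


def age_date(value, days_back):
    """Shift a date backward by a fixed number of days."""
    return value - timedelta(days=days_back)


def offset_column(rows, column, offset):
    """Return copies of rows with the same offset added to every value in `column`."""
    return [{**row, column: round(row[column] + offset, 2)} for row in rows]


print(age_date(date(2021, 8, 20), 200))  # 2021-02-01

aged_records = offset_column(records, "score", 45)
for row in aged_records:
    print(row["name"], row["problem_type"], row["score"])
```

Since every value moves by the same amount, differences between rows are preserved, and anyone who learns the offset can reverse the masking.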
What is Data Masking?
Data masking is a very important technique for keeping data safe from breaches, especially for big organizations that hold heaps of sensitive data that can be easily compromised. Details like credit card information, phone numbers and house addresses are highly vulnerable and must be protected. To understand data masking better, we first need to know what computer networks are.