Techniques

Data masking can be done using the following techniques:

Substitution: Substitution is considered one of the most efficient and reliable masking techniques. Every sensitive value that needs protection is replaced with a fake yet realistic-looking value. Only a person with authorized access to the system can see the original values behind the masked ones.
- Pros: Makes the data look as realistic as possible.
- Cons: Not applicable when dealing with large amounts of unrelated data.
Before Substitution:

Participant Name	Problem Type	Score
Alena	Hard	45.33
Rory	Hard	33.21
Miguel	Easy	20
Samara	Medium	37.2

After Substitution:

Participant Name	Problem Type	Score
Alena	Hard	30.22
Rory	Hard	40.9
Miguel	Easy	50
Samara	Medium	46.24
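
A minimal Python sketch of substitution, assuming the rows are held as a list of dictionaries. The function name `substitute_scores`, the field names, and the 20 to 50 score range are illustrative assumptions, not part of any particular masking tool:

```python
import random

rows = [
    {"name": "Alena", "problem_type": "Hard", "score": 45.33},
    {"name": "Rory", "problem_type": "Hard", "score": 33.21},
    {"name": "Miguel", "problem_type": "Easy", "score": 20},
    {"name": "Samara", "problem_type": "Medium", "score": 37.2},
]

def substitute_scores(rows, low=20.0, high=50.0, seed=None):
    """Return a copy of rows with each score replaced by a fake, plausible one."""
    rng = random.Random(seed)  # a seed makes the masking repeatable
    # Draw each fake score from an assumed realistic range [low, high]
    return [{**row, "score": round(rng.uniform(low, high), 2)} for row in rows]

masked = substitute_scores(rows, seed=7)
for row in masked:
    print(row["name"], row["problem_type"], row["score"])
```

The original list is left untouched, so a system that must let authorized users see the real values could keep it in a separately protected store.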

Averaging: This method can be used for numeric data. Instead of showing individual values, you replace every cell in the column with the average of all the values in that column. For example, if you hold student records and don’t want students to see one another’s marks, you can replace every mark in the column with the average of all the marks. In the table below, each score is replaced with 41.84, the average of the four scores in the After Substitution table above (30.22, 40.9, 50 and 46.24).

Participant Name	Problem Type	Score
Alena	Hard	41.84
Rory	Hard	41.84
Miguel	Easy	41.84
Samara	Medium	41.84
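
As a rough Python sketch of averaging, assuming the rows are dictionaries and using the hypothetical helper name `mask_by_average`:

```python
from statistics import mean

rows = [
    {"name": "Alena", "problem_type": "Hard", "score": 30.22},
    {"name": "Rory", "problem_type": "Hard", "score": 40.9},
    {"name": "Miguel", "problem_type": "Easy", "score": 50},
    {"name": "Samara", "problem_type": "Medium", "score": 46.24},
]

def mask_by_average(rows, column="score"):
    """Return a copy of rows with every value in `column` set to the column mean."""
    average = round(mean(row[column] for row in rows), 2)
    return [{**row, column: average} for row in rows]

masked = mask_by_average(rows)
print(masked[0]["score"])  # 41.84
```

Column-level statistics such as the mean survive this masking, but every per-row difference is lost.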

Shuffling: Shuffling is similar to averaging, but one difference sets them apart. Instead of replacing all the values in the column with a single number, you simply shuffle the existing values among the rows. Nobody can then tell which value belongs to which record, because the values sit in different rows.
- Pros: Deals with large amounts of data efficiently while keeping the data as realistic as possible.
- Cons: Can be undone easily if the data set is relatively small.
Before Shuffling:

Participant Name	Problem Type	Score
Alena	Hard	30.22
Rory	Hard	40.9
Miguel	Easy	50
Samara	Medium	46.24

After Shuffling:

Participant Name	Problem Type	Score
Alena	Hard	50
Rory	Hard	46.24
Miguel	Easy	30.22
Samara	Medium	40.9
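
Shuffling can be sketched in Python as follows. The helper name `shuffle_column` and the dictionary layout are assumptions made for illustration:

```python
import random

rows = [
    {"name": "Alena", "problem_type": "Hard", "score": 30.22},
    {"name": "Rory", "problem_type": "Hard", "score": 40.9},
    {"name": "Miguel", "problem_type": "Easy", "score": 50},
    {"name": "Samara", "problem_type": "Medium", "score": 46.24},
]

def shuffle_column(rows, column="score", seed=None):
    """Return a copy of rows with the values of `column` randomly reordered."""
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    # A random permutation may leave some values in their original row,
    # which is one reason small data sets are easy to reverse.
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]

masked = shuffle_column(rows, seed=1)
for row in masked:
    print(row["name"], row["problem_type"], row["score"])
```

The set of values in the column is unchanged, so totals and averages stay valid after masking.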

Encryption: Encryption is a very common concept in cybersecurity and cryptography. It converts the sensitive data into an unreadable form, so that nobody can tell what the data is or even what type of data it represents. Only personnel who have access to the encryption key can see the data.
- Pros: Masks the data effectively.
- Cons: Anyone with the encryption key can easily get access to the data. Someone skilled in cryptography may also be able to decrypt the data with enough effort.
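
The role of the key can be shown with a toy Python sketch. It XORs the data with a random key of at least the same length (a one-time pad) using only the standard library. This is an illustration chosen for brevity, not a recommendation: the function names are hypothetical, and a real system should rely on a vetted cryptography library rather than hand-written code.

```python
import secrets

def encrypt(plaintext: str, key: bytes) -> bytes:
    """XOR each byte of the text with the matching key byte (toy one-time pad)."""
    data = plaintext.encode("utf-8")
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(b ^ k for b, k in zip(data, key))

def decrypt(ciphertext: bytes, key: bytes) -> str:
    """Reverse the XOR with the same key to recover the text."""
    return bytes(c ^ k for c, k in zip(ciphertext, key)).decode("utf-8")

key = secrets.token_bytes(16)       # random key; must be kept secret and never reused
ciphertext = encrypt("45.33", key)  # unreadable without the key
print(ciphertext.hex())
print(decrypt(ciphertext, key))     # 45.33
```

Whoever holds `key` can recover the value, which is exactly the drawback listed above.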

Nulling out or deletion: Nulling out is exactly what the name suggests: you delete the values in a column by replacing them with NULL values. This is a very effective way to avoid showing any sensitive information in a test environment.
- Pros: Very useful in situations where the data is not essential.
- Cons: Not suitable for test environments that depend on the column’s values, because the nulled column no longer carries any usable data.

Participant Name	Problem Type	Score
Alena	Hard	NULL
Rory	Hard	NULL
Miguel	Easy	NULL
Samara	Medium	NULL

Redaction Method: In this method, you replace the sensitive information with the same code or generic value throughout the column.
- Pros: Makes it difficult to work out what the original data was, which makes the data more secure.
- Cons: Should only be used when the values are not needed for development or QA purposes.

Participant Name	Problem Type	Score
Alena	Hard	XXXXXXXXXX
Rory	Hard	XXXXXXXXXX
Miguel	Easy	XXXXXXXXXX
Samara	Medium	XXXXXXXXXX
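
Redaction overwrites a column with one constant, and nulling out, described earlier, is the same operation with NULL as the constant. A small Python sketch, assuming the rows are dictionaries (`mask_constant` is a hypothetical helper name, and `None` stands in for a database NULL):

```python
rows = [
    {"name": "Alena", "problem_type": "Hard", "score": 45.33},
    {"name": "Rory", "problem_type": "Hard", "score": 33.21},
    {"name": "Miguel", "problem_type": "Easy", "score": 20},
    {"name": "Samara", "problem_type": "Medium", "score": 37.2},
]

def mask_constant(rows, column, value):
    """Return a copy of rows with every value in `column` replaced by `value`."""
    return [{**row, column: value} for row in rows]

nulled = mask_constant(rows, "score", None)            # nulling out
redacted = mask_constant(rows, "score", "XXXXXXXXXX")  # redaction

print(nulled[0])
print(redacted[0])
```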

Date Aging: If your data set contains dates that you don’t want to reveal, you can shift them backward or forward by a fixed amount. For example, a date of 20-08-21 can be moved 200 days back, which gives 01-02-21. The same can be done with any kind of numeric data. Make sure that every value in a column is aged by the same fixed amount or by the same algorithm.
- Pros: The algorithm is easy to remember and masks the information effectively.
- Cons: Only appropriate for dates and numeric data.
Original Data Set:

Participant Name	Problem Type	Score
Alena	Hard	30.22
Rory	Hard	40.9
Miguel	Easy	50
Samara	Medium	46.24

Masked data set after adding 45 to every value in the Score column:

Participant Name	Problem Type	Score
Alena	Hard	75.22
Rory	Hard	85.9
Miguel	Easy	95
Samara	Medium	91.24
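
Both kinds of aging amount to adding a fixed offset. The Python sketch below uses the standard `datetime` module. The helper names `age_date` and `age_number` are assumptions for illustration, and the dates are read as DD-MM-YY:

```python
from datetime import date, timedelta

def age_date(original: date, offset_days: int) -> date:
    """Shift a date by a fixed number of days (a negative offset moves it back)."""
    return original + timedelta(days=offset_days)

def age_number(value: float, offset: float) -> float:
    """Shift a numeric value by a fixed offset, rounded to two decimals."""
    return round(value + offset, 2)

# Moving 20-08-21 back by 200 days
aged = age_date(date(2021, 8, 20), -200)
print(aged.strftime("%d-%m-%y"))  # 01-02-21

# Adding 45 to every score in the column
scores = [30.22, 40.9, 50.0, 46.24]
print([age_number(s, 45) for s in scores])  # [75.22, 85.9, 95.0, 91.24]
```

Because the same offset is applied to every value, anyone who learns the offset can reverse the masking, so it should be kept as secret as a key.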

What is Data Masking?

Data masking is a very important technique for keeping data safe from breaches, especially for big organizations that hold large volumes of sensitive data that can easily be compromised. Details like credit card information, phone numbers, and house addresses are highly sensitive and must be protected. To understand data masking better, we first need to know what computer networks are.
