How to use factor() to Get Label Encoding in R
The factor() function in base R converts the given data into a categorical variable. Each distinct value becomes a level, and each level is stored internally as an integer code. To work with those integer codes directly, apply as.numeric() (or as.integer()) to the factor.
Syntax : factor(x)
Arguments : x – The vector to be encoded
In the following code, the distinct values in the companies vector are first sorted lexicographically. Each sorted value becomes a level, and the levels are mapped to integers beginning with 1. The word “Geekster” sorts first, so it is assigned level 1, and all its occurrences are replaced with 1 in the final output. The relative order of “w3wiki” and “Wipro” depends on the collation rules of the current locale.
R
# creating a data vector
companies <- c("Geekster", "TCS", "Geekster", "Geekster", "w3wiki", "Wipro",
               "Geekster", "w3wiki", "Geekster", "Wipro", "TCS")

# printing the original vector
print("Original Data")
print(companies)

# converting the data to factors
factors <- factor(companies)

# converting data to label encoded values
print("Label Encoded Data")

# printing the numeric equivalents of these vector values
print(as.numeric(factors))
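Since factor() derives its default level order by sorting, the integer assigned to a label can differ between locales. The following is a minimal sketch of one way to make the mapping reproducible, by passing the levels argument of factor() explicitly. The particular order in level_order and the names level_order, encoded, mapping and codes are assumptions introduced here for illustration; they are not part of the original example.

```r
# Same data as in the example above
companies <- c("Geekster", "TCS", "Geekster", "Geekster", "w3wiki", "Wipro",
               "Geekster", "w3wiki", "Geekster", "Wipro", "TCS")

# Assumption: an explicit level order chosen only for illustration
level_order <- c("Geekster", "TCS", "w3wiki", "Wipro")

# With explicit levels, the codes no longer depend on locale sorting
encoded <- factor(companies, levels = level_order)

# Lookup table from label to integer code
mapping <- setNames(seq_along(levels(encoded)), levels(encoded))
print(mapping)

# Integer codes for each element of the original vector
codes <- as.integer(encoded)
print(codes)
# codes: 1 2 1 1 3 4 1 3 1 4 2
```

Any value that does not appear in the supplied levels would be encoded as NA, so the level list should cover every label expected in the data.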
Label Encoding in R programming
Data that is to be manipulated and analysed should be easy to interpret and clearly denoted. Many models cannot work directly with strings or other objects when training and prediction have to be performed. Label encoding assigns a numerical value to each distinct value of a string variable so that the data can be transformed and fed into such models. Label encoders therefore convert categorical variables into integer values, and decoders perform the reverse operation.
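As a rough illustration of the encode and decode round trip in base R, the sketch below keeps the levels of the factor as a lookup table and indexes it with the integer codes to recover the labels. The sample data and the variable names (original, lookup, decoded) are assumptions made for this example.

```r
# Hypothetical sample data for the round trip
original <- c("TCS", "Wipro", "TCS", "Geekster")

# Encode: the factor levels are the sorted distinct labels
encoded <- factor(original)
codes <- as.integer(encoded)
print(codes)
# codes: 2 3 2 1

# Decode: the levels act as the lookup table, indexed by code
lookup <- levels(encoded)
decoded <- lookup[codes]
print(decoded)

# The decoded labels match the input exactly
stopifnot(identical(decoded, original))
```

Decoding only works if the level vector used for encoding is kept alongside the codes, since the integers carry no meaning on their own.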