Stepwise Implementation
Step 1: Importing Libraries
Python3
# Import pandas under the conventional alias pd
import pandas as pd
Step 2: Importing Data
Python3
# Load the data using the .read_csv() function
df = pd.read_csv('data.csv')

# Display the DataFrame
df
Output:
Step 3: Converting Categorical Data Columns to Numerical
We will convert the column ‘Purchased’ from categorical to numerical data type.
Python3
# Import LabelEncoder from the preprocessing module of scikit-learn
from sklearn.preprocessing import LabelEncoder

# Create an instance of LabelEncoder
le = LabelEncoder()

# Use .fit_transform() to fit the encoder and return the encoded labels
label = le.fit_transform(df['Purchased'])

# Display the encoded labels
label
Output:
array([0, 1, 0, 0, 1, 1, 0, 1, 0, 1])
Time Complexity: Roughly O(n log n) in the worst case, because fitting the encoder involves finding the sorted unique values in the input data. Here, n is the number of elements in the df[‘Purchased’] array.
Auxiliary Space: O(k) where k is the number of unique labels in the df[‘Purchased’] array.
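To see which integer was assigned to which category, the fitted encoder can be inspected. The following is a minimal sketch, assuming a small made-up Series in place of df[‘Purchased’] (the real values come from data.csv); classes_ and inverse_transform are attributes of scikit-learn's LabelEncoder.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for df['Purchased']; the real values live in data.csv.
purchased = pd.Series(["No", "Yes", "No", "No", "Yes"], name="Purchased")

le = LabelEncoder()
label = le.fit_transform(purchased)

# classes_ holds the sorted unique labels; a label's position is its code.
mapping = {cls: int(code) for code, cls in enumerate(le.classes_)}
print(mapping)

# inverse_transform maps the integer codes back to the original strings.
restored = le.inverse_transform(label)
print(list(restored))
```

Because the classes are sorted before codes are assigned, the numbering reflects alphabetical order and not any meaningful ranking of the categories.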
Step 4: Appending The Label Array to our DataFrame
Python3
# Remove the original 'Purchased' column, since the encoded labels replace it
df.drop("Purchased", axis=1, inplace=True)

# Append the encoded array to the DataFrame under the column name 'Purchased'
df["Purchased"] = label

# Display the DataFrame
df
Output:
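Dropping the column and appending it again can also be collapsed into a single assignment. The sketch below is one possible variation and not the approach used in the steps above; the DataFrame contents are made up for illustration and stand in for the data loaded from data.csv.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for the DataFrame loaded from data.csv.
df = pd.DataFrame({
    "Purchased": ["No", "Yes", "No", "Yes"],
    "Age": [44, 27, 30, 38],
})

# Assigning to the existing column overwrites it, so no drop is needed
# and the column keeps its original position in the DataFrame.
df["Purchased"] = LabelEncoder().fit_transform(df["Purchased"])
print(df)
```

With the drop-then-append approach the encoded column always ends up last, which only matters if later code relies on column order.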
How to convert categorical string data into numeric in Python?
Datasets often contain both numerical and categorical features. Categorical features are typically stored as strings and are easy for humans to read. However, most machine learning algorithms cannot work with categorical data directly, so it must be converted into numerical data before further processing.
There are many ways to convert categorical data into numerical data. This article discusses two of the most commonly used methods:
- Dummy Variable Encoding
- Label Encoding
Both methods use the same dataset; the link to the dataset is here
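As a rough illustration of the first method, dummy variable encoding can be sketched with pandas' get_dummies function. The column name ‘Purchased’ follows the article, while the DataFrame values below are made up and are not taken from the linked dataset.

```python
import pandas as pd

# Hypothetical stand-in data; only the column name 'Purchased' comes from the article.
df = pd.DataFrame({
    "Purchased": ["No", "Yes", "No", "Yes"],
    "Age": [44, 27, 30, 38],
})

# Create one indicator column per category in 'Purchased'.
# dtype=int yields 0/1 values instead of booleans.
dummies = pd.get_dummies(df, columns=["Purchased"], dtype=int)
print(dummies)
```

Each category becomes its own 0/1 column (here Purchased_No and Purchased_Yes), whereas label encoding keeps a single column of integer codes.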