Feature Engineering
Sometimes a single column packs several features together, or useful features have to be derived from the existing ones. We will also add a few extra features to the dataset so that we can draw more insight from the data we have. When the derived features are meaningful, they can noticeably improve the model’s accuracy.
Python3
parts = df["datetime"].str.split(" ", n=2, expand=True)
df["date"] = parts[0]
df["time"] = parts[1].str[:2].astype('int')
df.head()
In the above step, we have separated the date and time. Now let’s extract the day, month, and year from the date column.
Python3
parts = df["date"].str.split("-", n=3, expand=True)
df["day"] = parts[0].astype('int')
df["month"] = parts[1].astype('int')
df["year"] = parts[2].astype('int')
df.head()
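The same columns can also be derived without string splitting. The following is a minimal sketch, assuming the `datetime` column holds strings in the `YYYY-MM-DD HH:MM:SS` layout (an assumption about the data, not something the dataset guarantees); the two sample rows are made up for illustration.

```python
import pandas as pd

# Illustrative sample rows; the real dataset's format may differ.
sample = pd.DataFrame(
    {"datetime": ["2011-01-01 00:00:00", "2011-07-16 13:00:00"]}
)

# Parse once, then read each component through the .dt accessor.
ts = pd.to_datetime(sample["datetime"], format="%Y-%m-%d %H:%M:%S")
sample["time"] = ts.dt.hour
sample["day"] = ts.dt.day
sample["month"] = ts.dt.month
sample["year"] = ts.dt.year
print(sample)
```

Parsing into real timestamps avoids depending on the position of each piece after a split, at the cost of having to state the format explicitly.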
Whether a day falls on a weekend or a weekday is likely to affect the ride request count.
Python3
from datetime import datetime


def weekend_or_weekday(year, month, day):
    # weekday() returns 0 (Monday) through 6 (Sunday)
    d = datetime(year, month, day)
    if d.weekday() > 4:
        return 0  # Saturday or Sunday
    else:
        return 1  # Monday to Friday


df['weekday'] = df.apply(lambda x: weekend_or_weekday(x['year'], x['month'], x['day']), axis=1)
df.head()
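A row-wise `apply` can be slow on large frames. As a rough sketch of a vectorized alternative, the same 1 = weekday, 0 = weekend flag can be computed with the `.dt.dayofweek` accessor. The two dates below are illustrative and assume the dates have already been parsed into timestamps.

```python
import pandas as pd

# Illustrative dates: 2011-01-01 was a Saturday, 2011-01-03 a Monday.
dates = pd.to_datetime(pd.Series(["2011-01-01", "2011-01-03"]))

# dayofweek is 0 for Monday through 6 for Sunday,
# so values below 5 are weekdays.
weekday_flag = (dates.dt.dayofweek < 5).astype(int)
print(weekday_flag.tolist())
```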
Bike ride demand is also likely to depend on whether the request is made in the am or the pm.
Python3
def am_or_pm(x):
    # x is the hour of day (0-23); 1 means pm, 0 means am
    if x > 11:
        return 1
    else:
        return 0


df['am_or_pm'] = df['time'].apply(am_or_pm)
df.head()
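Because the rule is a single comparison, it can also be written as one vectorized expression. This is only a sketch on made-up hour values, assuming the `time` column already holds integer hours from 0 to 23.

```python
import pandas as pd

# Illustrative hour values standing in for the 'time' column.
hours = pd.Series([0, 11, 12, 23], name="time")

# Same rule as am_or_pm: hours 12-23 map to 1 (pm), hours 0-11 map to 0 (am).
am_pm_flag = (hours > 11).astype(int)
print(am_pm_flag.tolist())
```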
It would also be useful to have a column that indicates whether a given day was a holiday.
Python3
import holidays

# Build the holiday calendar once instead of once per row
india_holidays = holidays.country_holidays('IN')


def is_holiday(x):
    # x is a date string; get() returns the holiday name, or None if there is none
    if india_holidays.get(x):
        return 1
    else:
        return 0


df['holidays'] = df['date'].apply(is_holiday)
df.head()
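If the `holidays` package is not available, a holiday flag can be approximated from an explicit collection of dates with `Series.isin`. The sketch below is an assumption-laden illustration: `fixed_holidays` is a hypothetical and deliberately incomplete list (Republic Day and Independence Day for 2011), not the calendar that the `holidays` package would return.

```python
import pandas as pd

# Hypothetical, incomplete holiday list for illustration only.
fixed_holidays = pd.to_datetime(["2011-01-26", "2011-08-15"])

# Illustrative dates, already parsed into timestamps.
dates = pd.to_datetime(pd.Series(["2011-01-26", "2011-01-27", "2011-08-15"]))

# 1 where the date appears in the holiday list, 0 otherwise.
holiday_flag = dates.isin(fixed_holidays).astype(int)
print(holiday_flag.tolist())
```

A hand-maintained list like this misses movable festivals, so it is best treated as a fallback rather than a replacement.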
Now let’s remove the columns which are not useful for us.
Python3
df.drop(['datetime', 'date'], axis=1, inplace=True)
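For comparison, here is a small sketch of the non-in-place form of the same operation, which returns a new frame and leaves the original untouched. The one-row `sample` frame is made up for illustration.

```python
import pandas as pd

# Illustrative one-row frame with the two columns to be dropped.
sample = pd.DataFrame(
    {"datetime": ["2011-01-01 00:00:00"], "date": ["2011-01-01"], "time": [0]}
)

# drop returns a new frame here; `sample` itself is left unchanged.
trimmed = sample.drop(columns=["datetime", "date"])
print(trimmed.columns.tolist())
```

Keeping the original frame around can be handy while experimenting, since the dropped columns remain available for deriving further features.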
There may be other relevant features that could be added to this dataset, but let’s try to build a model with these and extract some insights as well.
Ola Bike Ride Request Forecast using ML
We have come a long way from telling the rickshaw-wala where to go to telling him where to come. Yes, we are talking about online cab and bike service providers such as OLA and Uber. If you have used these apps a few times, you have probably paid less on some days and more on others for the same journey. Have you ever wondered why? One reason is high demand at certain hours. It is not the only factor, but it is one of them.