Feature Engineering

Sometimes a single column packs several pieces of information together, or useful features have to be derived from the existing ones. Here we add a few extra features to the dataset so that we can draw more insight from the data. When the derived features are meaningful, they can also noticeably improve the model’s accuracy.

Python3




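# Split each "date time" string into a date part and a time part
parts = df["datetime"].str.split(" ", n=2, expand=True)
df["date"] = parts[0]
# Keep only the hour (first two characters of the time) as an integer
df["time"] = parts[1].str[:2].astype('int')
df.head()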


Output:

Addition of date and time feature
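
As an alternative to string splitting, the same two columns can be sketched with pandas' datetime parsing. The sample rows and the "DD-MM-YYYY HH:MM:SS" layout below are assumptions made for illustration; adjust the format string to match the actual data.

```python
import pandas as pd

# Hypothetical sample rows; the timestamp layout is an assumption.
sample = pd.DataFrame(
    {"datetime": ["01-01-2011 00:00:00", "15-08-2011 17:30:00"]}
)

# Parse once with an explicit format so malformed rows raise an error
# instead of being silently mis-split.
parsed = pd.to_datetime(sample["datetime"], format="%d-%m-%Y %H:%M:%S")

sample["date"] = parsed.dt.strftime("%d-%m-%Y")  # date as a string, like the split
sample["time"] = parsed.dt.hour                  # hour of day, 0-23
```

Parsing with an explicit format has the side benefit of validating every row, whereas slicing the first two characters would accept any text.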

In the above step, we have separated the date and time. Now let’s extract the day, month, and year from the date column.

Python3




# Split the date on "-"; this assumes the parts are ordered day-month-year
parts = df["date"].str.split("-", n=3, expand=True)
df["day"] = parts[0].astype('int')
df["month"] = parts[1].astype('int')
df["year"] = parts[2].astype('int')
df.head()


Output:

Addition of day, month and year feature
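
The day, month, and year can also be read from a parsed datetime column rather than from split strings. This is a minimal sketch on hypothetical sample dates, assuming a day-month-year ordering; the `.dt` accessor makes the result independent of the position of each part in the string.

```python
import pandas as pd

# Hypothetical sample dates; the "DD-MM-YYYY" ordering is an assumption.
sample = pd.DataFrame({"date": ["01-01-2011", "15-08-2011"]})

# Convert the strings to real datetimes, then read each component.
parsed = pd.to_datetime(sample["date"], format="%d-%m-%Y")
sample["day"] = parsed.dt.day
sample["month"] = parsed.dt.month
sample["year"] = parsed.dt.year
```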

Whether a day falls on a weekend or a weekday is likely to affect the ride request count.

Python3




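from datetime import datetime
  
  
def weekend_or_weekday(year, month, day):
    # Return 1 for a weekday (Mon-Fri) and 0 for a weekend (Sat-Sun)
    d = datetime(year, month, day)
    # weekday() counts Monday as 0, so 5 and 6 are Saturday and Sunday
    if d.weekday() > 4:
        return 0
    else:
        return 1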
  
  
df['weekday'] = df.apply(lambda x:
                         weekend_or_weekday(x['year'],
                                            x['month'],
                                            x['day']),
                         axis=1)
df.head()


Output:

Addition of a weekday feature
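
Row-wise `apply` can be slow on large frames. A vectorized variant of the same flag might look like the sketch below, which assumes integer `year`, `month`, and `day` columns as created above and uses hypothetical sample rows.

```python
import pandas as pd

# Hypothetical sample rows: 1 Jan 2011 was a Saturday, 3 Jan 2011 a Monday.
sample = pd.DataFrame({"year": [2011, 2011], "month": [1, 1], "day": [1, 3]})

# pd.to_datetime can assemble dates from year/month/day columns.
dates = pd.to_datetime(sample[["year", "month", "day"]])

# dayofweek counts Monday as 0, so values below 5 are weekdays.
sample["weekday"] = (dates.dt.dayofweek < 5).astype(int)
```

The flag keeps the same meaning as in the function above: 1 for a weekday, 0 for a weekend.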

Bike ride demand also depends on whether it is AM or PM.

Python3




def am_or_pm(x):
    # Hours 12-23 are PM (1); hours 0-11 are AM (0)
    if x > 11:
        return 1
    else:
        return 0
  
  
df['am_or_pm'] = df['time'].apply(am_or_pm)
df.head()


Output:

Addition of feature including am or pm information
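
Since the rule is a single threshold on the hour, it can also be written as one vectorized comparison. The sketch below uses hypothetical hour values and assumes `time` holds the hour of day as an integer from 0 to 23.

```python
import pandas as pd

# Hypothetical hours covering both sides of the noon boundary.
sample = pd.DataFrame({"time": [0, 11, 12, 23]})

# True/False from the comparison becomes 1 (PM) / 0 (AM).
sample["am_or_pm"] = (sample["time"] >= 12).astype(int)
```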

It is also useful to have a column that indicates whether a given day was a holiday.

Python3




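import holidays
  
  
# Build the India holiday calendar once instead of on every call
india_holidays = holidays.country_holidays('IN')
  
  
def is_holiday(x):
    # .get() returns the holiday name, or None if the date is not a holiday
    if india_holidays.get(x):
        return 1
    else:
        return 0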
  
  
df['holidays'] = df['date'].apply(is_holiday)
df.head()


Output:

Addition of feature including holiday information
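
Because the lookup above is given date strings, the result depends on how those strings are interpreted. One way to sanity-check the flag is to compare against a small hand-written set of known dates. The sketch below is illustrative only: `known_holidays` is a hypothetical two-entry list (Republic Day and Independence Day, 2011), not a substitute for the `holidays` package, and the "DD-MM-YYYY" format is an assumption.

```python
import pandas as pd

# Illustrative only: two fixed-date Indian national holidays in 2011.
known_holidays = pd.to_datetime(["2011-01-26", "2011-08-15"])

# Hypothetical sample dates in an assumed "DD-MM-YYYY" layout.
sample = pd.DataFrame({"date": ["26-01-2011", "27-01-2011", "15-08-2011"]})

# Parse with an explicit format so day and month cannot be swapped.
parsed = pd.to_datetime(sample["date"], format="%d-%m-%Y")

# 1 if the date is in the known set, otherwise 0.
sample["holidays"] = parsed.isin(known_holidays).astype(int)
```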

Now let’s remove the columns that are no longer needed, since their information is already captured by the new features.

Python3




df.drop(['datetime', 'date'],
        axis=1,
        inplace=True)


Other relevant features could be added to this dataset as well, but let’s try to build a model with these and extract some insights along the way.
