Fuel Efficiency Forecasting with CatBoost


The automobile sector is continuously looking for new ways to cut fuel use in pursuit of economy and sustainability. Rising fuel costs and a growing emphasis on environmental sustainability have made it more important to understand how cars consume fuel. One approach is to forecast and analyze fuel use with machine learning. This article introduces CatBoost, a gradient boosting library, and shows how it can be applied to modeling automobile fuel consumption. With an emphasis on simplicity, it covers the basic ideas, gives examples, and lists the steps required to put the solution into practice, starting with the fundamentals and building up from there.

Table of Contents

Fuel Consumption in Vehicles Using CatBoost
The Power of CatBoost
Steps to Predict Fuel Consumption Using CatBoost
Develop a CatBoost Model for Fuel Consumption in Vehicles
Conclusion

Fuel Consumption in Vehicles Using CatBoost

Within the automobile sector, fuel consumption prediction plays an important role in both vehicle design and the optimization of driver behaviour. Machine learning models, especially gradient boosting methods, have made these predictions easier to produce. This article uses CatBoost, a high-performance gradient boosting library, to forecast car fuel use. It discusses the fundamental ideas and then gives a step-by-step tutorial on building a predictive model, so that even a novice should come away knowing how to use CatBoost for this purpose.

The Power of CatBoost

CatBoost, short for “Category Boosting,” is an open-source gradient boosting library developed by Yandex. It is designed to handle categorical features well, which makes it effective when the data contains groups or categories such as car type or fuel type, and it is known for being fast, accurate, and user-friendly. CatBoost provides several benefits for fuel consumption prediction:

High Accuracy: Gradient boosting often performs well on tabular data, so CatBoost can estimate fuel usage accurately when it is given informative features.
Handles Mixed Data: CatBoost accepts categorical data (such as car model) alongside numerical data (such as engine size) without manual encoding.
Speedy and Efficient: It has a reputation for fast training and prediction, which makes it suitable for real-world applications.

Steps to Predict Fuel Consumption Using CatBoost

Prerequisite:

First, install the catboost package. The leading ! runs the command from a Jupyter notebook cell; omit it when installing from a regular terminal.

!pip install catboost

1. Data Collection and Preprocessing

Collect your dataset first. For fuel consumption prediction, it may include features such as vehicle type, engine size, fuel type, and weight, along with historical fuel consumption records.

2. Data Cleaning

Clean the data by handling missing values and removing duplicates. Categorical variables normally also need to be encoded, but CatBoost can handle categorical data directly, which makes this phase easier.
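
The cleaning step can be sketched with pandas as follows. The small raw table is hypothetical, and the '?' placeholder for missing values is an assumption chosen to mirror the dataset used later in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical raw records; '?' marks a missing horsepower value.
raw = pd.DataFrame({
    "fuel_type": ["petrol", "diesel", "diesel", "petrol"],
    "horsepower": ["130", "?", "95", "130"],
    "weight": [3504, 2800, 2500, 3504],
})

clean = (
    raw.replace("?", np.nan)   # turn the placeholder into a real missing value
       .dropna()               # drop rows that contain missing values
       .drop_duplicates()      # drop exact duplicate rows
       .reset_index(drop=True)
)

# The column was read as text because of the '?', so convert it to numbers.
clean["horsepower"] = clean["horsepower"].astype(float)
print(clean)
```

Dropping rows is the simplest option; imputing missing values is an alternative when too many rows would otherwise be lost.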

3. Splitting the Data

Split your data into training and test sets to evaluate the model’s performance.

4. Training the Model

Train the CatBoost model on the training set, declaring which features are categorical and setting any other parameters.

5. Evaluating the Model

Use metrics such as Mean Absolute Error (MAE) or Root Mean Squared Error (RMSE) to assess the model’s performance.
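
To make the two metrics concrete, here is a small sketch that computes them by hand on made-up numbers. The helper names mae and rmse and the sample values are illustrative; the walkthrough below uses scikit-learn for the same calculation.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error: the average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Squared Error: like MAE, but large errors count for more."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Made-up fuel consumption values in l/100km.
actual = [10.0, 12.0, 8.0]
predicted = [11.0, 10.0, 8.0]

print(mae(actual, predicted))   # errors are 1, 2 and 0, so MAE = 1.0
print(rmse(actual, predicted))  # sqrt((1 + 4 + 0) / 3)
```

Both metrics are expressed in the same unit as the target. RMSE is never smaller than MAE, and a large gap between the two suggests a few predictions with large errors.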

Develop a CatBoost Model for Fuel Consumption in Vehicles

Let’s get our hands dirty and develop a CatBoost model to forecast fuel usage now! Below is an explanation of the procedure:

Step 1: Importing Libraries

We start by importing the necessary libraries.

Python

import numpy as np
import pandas as pd
from catboost import CatBoostRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error
import matplotlib.pyplot as plt
import seaborn as sns

Step 2: Loading the Dataset

We will use the Auto MPG dataset from the UCI Machine Learning Repository.

Python

# Load the dataset
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data'
column_names = ['mpg', 'cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'model_year', 'origin', 'car_name']
# sep=r'\s+' splits on any run of whitespace (delim_whitespace=True is deprecated in recent pandas)
data = pd.read_csv(url, names=column_names, sep=r'\s+')

# Display the first few rows of the dataset
print(data.head())

Output:

    mpg  cylinders  displacement horsepower  weight  acceleration  model_year  \
0  18.0          8         307.0      130.0  3504.0          12.0          70   
1  15.0          8         350.0      165.0  3693.0          11.5          70   
2  18.0          8         318.0      150.0  3436.0          11.0          70   
3  16.0          8         304.0      150.0  3433.0          12.0          70   
4  17.0          8         302.0      140.0  3449.0          10.5          70   

   origin                   car_name  
0       1  chevrolet chevelle malibu  
1       1          buick skylark 320  
2       1         plymouth satellite  
3       1              amc rebel sst  
4       1                ford torino

Step 3: Preprocessing the Data

We will choose pertinent features, transform categorical data, and handle missing values.

Python

# Replace '?' with NaN and drop missing values
data.replace('?', np.nan, inplace=True)
data.dropna(inplace=True)

# Convert relevant columns to numeric
data['horsepower'] = data['horsepower'].astype(float)

# Convert categorical columns to category type
data['origin'] = data['origin'].astype('category')

# Define the target variable: convert (US) mpg to litres per 100 km
# fuel consumption in l/100km = 235.215 / mpg
data['fuel_consumption'] = 235.215 / data['mpg']

# Define features and target variable
X = data[['cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'model_year', 'origin']]
y = data['fuel_consumption']
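
The constant 235.215 used above comes from the definitions of the US gallon and the mile. The sketch below derives it; the function name mpg_to_l_per_100km is a hypothetical helper, and it assumes US gallons rather than imperial ones.

```python
# Exact unit definitions.
LITRES_PER_US_GALLON = 3.785411784
KM_PER_MILE = 1.609344

def mpg_to_l_per_100km(mpg):
    """Convert miles per US gallon to litres per 100 km."""
    # One gallon covers mpg miles, i.e. mpg * KM_PER_MILE kilometres,
    # so 100 km needs 100 / (mpg * KM_PER_MILE) gallons.
    return 100 * LITRES_PER_US_GALLON / (KM_PER_MILE * mpg)

print(mpg_to_l_per_100km(1))   # the conversion constant, about 235.215
print(mpg_to_l_per_100km(18))  # first row of the dataset, about 13.07 l/100km
```

The relationship is inverse: a higher mpg value means a lower fuel consumption, so the target is not simply a rescaled copy of the mpg column.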

Step 4: Splitting the Data

Split the data into training and test sets.

Python

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Step 5: Training the CatBoost Model

Using the training data, we initialize and train the CatBoost regressor.

Python

# Initialize the CatBoost regressor
model = CatBoostRegressor(iterations=500, learning_rate=0.1, depth=6, cat_features=['origin'], verbose=100)

# Train the model
model.fit(X_train, y_train)

Output:

0:    learn: 3.6611097    total: 2.74ms    remaining: 1.37s
100:    learn: 0.8048411    total: 374ms    remaining: 1.48s
200:    learn: 0.5390630    total: 724ms    remaining: 1.08s
300:    learn: 0.4066602    total: 915ms    remaining: 605ms
400:    learn: 0.3128946    total: 1.13s    remaining: 278ms
499:    learn: 0.2514340    total: 1.53s    remaining: 0us
<catboost.core.CatBoostRegressor at 0x7a4cd147c910>

Step 6: Making Predictions and Evaluating the Model

We use the trained model to generate predictions on the test set and then assess its performance.

Python

# Make predictions on the test set
y_pred = model.predict(X_test)

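# Calculate MAE and RMSE
# RMSE is the square root of MSE (the squared=False argument is no longer available in recent scikit-learn)
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))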

print(f'MAE: {mae}')
print(f'RMSE: {rmse}')

Output:

MAE: 0.858216602638723
RMSE: 1.1143796648226478
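
To judge whether an MAE of about 0.86 l/100km is good, it can help to compare it with a naive baseline. The sketch below computes the MAE of always predicting the training-set mean. The helper name mean_baseline_mae and the sample numbers are illustrative assumptions, not results from this dataset.

```python
import numpy as np

def mean_baseline_mae(y_train, y_test):
    """MAE of a model that always predicts the mean of the training target."""
    baseline = np.mean(y_train)
    return float(np.mean(np.abs(np.asarray(y_test) - baseline)))

# Made-up values in l/100km; the training mean is 12.0.
example_train = [10.0, 12.0, 14.0]
example_test = [11.0, 13.0, 15.0]

print(mean_baseline_mae(example_train, example_test))  # (1 + 1 + 3) / 3
```

In the walkthrough, calling mean_baseline_mae(y_train, y_test) and comparing the result with the model's MAE would show how much the model improves on this simple reference.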

Step 7: Presenting the Findings

Let’s plot the predicted fuel consumption against the actual values.

Python

# Plot predicted vs actual values
plt.figure(figsize=(10, 6))
plt.scatter(y_test, y_pred, alpha=0.5)
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], color='red')
plt.xlabel('Actual Fuel Consumption')
plt.ylabel('Predicted Fuel Consumption')
plt.title('Actual vs Predicted Fuel Consumption')
plt.show()

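Output: a scatter plot of predicted against actual fuel consumption, with a red diagonal line marking perfect predictions.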

Step 8: Interactive User Data Entry for Forecasting

This interactive tool lets users enter details about a vehicle and receive a fuel consumption prediction in real time. It relies on the ipywidgets package and is meant to be run in a Jupyter notebook.

Python

import ipywidgets as widgets
from IPython.display import display

# Define widgets for user input
cylinders_widget = widgets.IntSlider(min=3, max=8, step=1, description='Cylinders:')
displacement_widget = widgets.FloatSlider(min=50, max=500, step=10, description='Displacement:')
horsepower_widget = widgets.FloatSlider(min=50, max=250, step=10, description='Horsepower:')
weight_widget = widgets.FloatSlider(min=1500, max=5500, step=100, description='Weight:')
acceleration_widget = widgets.FloatSlider(min=8, max=24, step=1, description='Acceleration:')
model_year_widget = widgets.IntSlider(min=70, max=82, step=1, description='Model Year:')
origin_widget = widgets.Dropdown(options=[1, 2, 3], description='Origin:')

# Define function to make predictions based on user input
def predict_fuel_consumption(cylinders, displacement, horsepower, weight, acceleration, model_year, origin):
    input_data = pd.DataFrame({
        'cylinders': [cylinders],
        'displacement': [displacement],
        'horsepower': [horsepower],
        'weight': [weight],
        'acceleration': [acceleration],
        'model_year': [model_year],
        'origin': [origin]
    })
    # Ensure the categorical feature 'origin' is encoded properly
    input_data['origin'] = input_data['origin'].astype('category')
    prediction = model.predict(input_data)[0]
    print(f'Predicted Fuel Consumption: {prediction:.2f} liters/100km')

# Display widgets
interactive_plot = widgets.interactive(predict_fuel_consumption, 
                                       cylinders=cylinders_widget, 
                                       displacement=displacement_widget, 
                                       horsepower=horsepower_widget, 
                                       weight=weight_widget, 
                                       acceleration=acceleration_widget, 
                                       model_year=model_year_widget, 
                                       origin=origin_widget)

display(interactive_plot)

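Output: a set of sliders and a dropdown for the inputs, with the predicted fuel consumption printed below them.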
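
Outside a notebook, the same single-row input can be assembled without widgets. The sketch below builds a one-row DataFrame with the columns in the order used for training. The names FEATURE_ORDER and build_input_row are hypothetical helpers, and the prediction call is left as a comment because it assumes the model object trained in Step 5.

```python
import pandas as pd

# Column order used for X in Step 3.
FEATURE_ORDER = ["cylinders", "displacement", "horsepower", "weight",
                 "acceleration", "model_year", "origin"]

def build_input_row(**values):
    """Return a one-row DataFrame with all features in training order."""
    missing = [name for name in FEATURE_ORDER if name not in values]
    if missing:
        raise ValueError(f"Missing features: {missing}")
    row = pd.DataFrame([{name: values[name] for name in FEATURE_ORDER}])
    # Match the dtype given to 'origin' during preprocessing.
    row["origin"] = row["origin"].astype("category")
    return row

example_row = build_input_row(
    cylinders=4, displacement=120.0, horsepower=90.0, weight=2500.0,
    acceleration=15.0, model_year=76, origin=1,
)
print(example_row)

# prediction = model.predict(example_row)[0]  # assumes `model` from Step 5
```

Checking for missing features up front gives a clearer error than passing an incomplete row to the model.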

Conclusion

Predicting fuel consumption with CatBoost requires understanding the benefits of the algorithm, preparing the data properly, and continuously refining the model. Even novices can use machine learning to help create more fuel-efficient cars by following the instructions in this article.

In this article, we looked at how to forecast fuel use with CatBoost using a public dataset. We went through each step, from data preparation to model training and evaluation, and displayed the results. We also included an interactive element that makes predictions in real time from user inputs. Even if you’re a novice, this walkthrough should help you get started with CatBoost fuel consumption prediction. Have fun with your modeling!
