Logistic Regression Implementation
In this implementation, we first measure the accuracy of a logistic regression model trained without feature scaling, and then compare it with the accuracy obtained after applying different feature scaling techniques.
First, we import the libraries required for the tutorial.
import pandas as pd
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import RobustScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.ensemble import VotingClassifier
dataset = load_breast_cancer()
x = dataset.data
y = dataset.target
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42)
model = LogisticRegression()
model.fit(x_train, y_train)
y_pred = model.predict(x_test)
# accuracy_score expects the true labels first, then the predictions
print("Accuracy with no feature scaling :", accuracy_score(y_test, y_pred))
Output:
Accuracy with no feature scaling : 0.9649122807017544
For more refer to: Implementation of logistic regression from scratch
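To see why scaling might matter for this dataset, it helps to look at how differently the raw features are spread. The following is a minimal sketch that only inspects the data; the variable names `spans`, `narrow` and `wide` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

dataset = load_breast_cancer()
x = dataset.data

# Range (max - min) of every feature column
spans = x.max(axis=0) - x.min(axis=0)

# Indices of the features with the narrowest and widest ranges
narrow = np.argmin(spans)
wide = np.argmax(spans)

print(dataset.feature_names[narrow], spans[narrow])
print(dataset.feature_names[wide], spans[wide])
print("widest / narrowest range:", spans[wide] / spans[narrow])
```

The ranges differ by several orders of magnitude, which is the situation the scalers in the following sections are meant to handle.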
Logistic Regression with Min-Max Scaler
Min-Max scaling makes sure that all features are on a similar scale, typically between 0 and 1. It does this by subtracting the minimum value of the feature and then dividing by the difference between the maximum and minimum values. This ensures that the minimum value becomes 0 and the maximum value becomes 1, while other values are proportionally scaled in between.
scaler = MinMaxScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)
model = LogisticRegression()
model.fit(x_train_scaled, y_train)
y_pred = model.predict(x_test_scaled)
print("Accuracy after applying Min-Max scaler :", accuracy_score(y_test, y_pred))
Output:
Accuracy after applying Min-Max scaler : 0.9824561403508771
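The Min-Max formula described above can be checked by hand on a small array. This is a sketch on made-up toy values, not on the breast cancer data; it compares the manual computation with what `MinMaxScaler` returns.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy data (made up for illustration): two features on very different scales
toy = np.array([[1.0, 200.0],
                [2.0, 400.0],
                [4.0, 1000.0]])

# Manual formula: (x - min) / (max - min), column by column
col_min = toy.min(axis=0)
col_max = toy.max(axis=0)
manual = (toy - col_min) / (col_max - col_min)

# Same computation through scikit-learn
scaled = MinMaxScaler().fit_transform(toy)

print(manual)
print("matches MinMaxScaler:", np.allclose(manual, scaled))
```

After scaling, each column has a minimum of 0 and a maximum of 1, regardless of its original units.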
Logistic Regression with Standard Scaler
Standardization, or z-score normalization, centers each feature around 0 and scales it to a standard deviation of 1 by subtracting the feature's mean from each value and then dividing by the feature's standard deviation. After the transformation every feature has mean 0 and standard deviation 1. Note that this only shifts and rescales the data; it does not change the shape of the distribution, so a skewed feature remains skewed.
# initializing scaler object
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)
model = LogisticRegression()
model.fit(x_train_scaled, y_train)
y_pred = model.predict(x_test_scaled)
print("Accuracy after applying Standard scaler :", accuracy_score(y_test, y_pred))
Output:
Accuracy after applying Standard scaler : 0.9736842105263158
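The z-score computation can be verified in the same way. The sketch below uses a made-up, deliberately skewed toy feature and a small hypothetical helper, `skewness`, to show that standardization fixes the mean and standard deviation but leaves the shape of the distribution alone.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy, deliberately skewed feature (values made up for illustration)
toy = np.array([[1.0], [2.0], [2.0], [3.0], [12.0]])

# Manual z-score: (x - mean) / std, using the population std (ddof=0),
# which is what StandardScaler uses
manual = (toy - toy.mean(axis=0)) / toy.std(axis=0)
scaled = StandardScaler().fit_transform(toy)

def skewness(a):
    # Hypothetical helper: third standardized moment (0 for symmetric data)
    z = (a - a.mean()) / a.std()
    return float((z ** 3).mean())

print("matches StandardScaler:", np.allclose(manual, scaled))
print("mean, std after scaling:", scaled.mean(), scaled.std())
print("skewness before / after:", skewness(toy), skewness(scaled))
```

The skewness is the same before and after, which is why standardization alone should not be expected to make a feature look Gaussian.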
Logistic Regression with Robust Scaler
Robust scaling is particularly useful when dealing with outliers or non-normal distributions. It scales based on the median and interquartile range (IQR) rather than the mean and standard deviation. This makes it more robust to outliers, as the median and IQR are less affected by extreme values. It subtracts the median of the feature and then divides by the IQR, ensuring that the data is scaled proportionally while being less influenced by outliers.
# initializing scaler object
scaler = RobustScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)
model = LogisticRegression()
model.fit(x_train_scaled, y_train)
y_pred = model.predict(x_test_scaled)
print("Accuracy after applying Robust scaler :", accuracy_score(y_test, y_pred))
Output:
Accuracy after applying Robust scaler : 0.9736842105263158
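To illustrate the effect of an outlier, the sketch below applies the median/IQR formula by hand to a made-up toy feature containing one extreme value, and compares the result with `RobustScaler` (default quantile range of 25 to 75) and `StandardScaler`.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

# Toy feature with one extreme value (made up for illustration)
toy = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

# Manual formula: (x - median) / (Q3 - Q1)
median = np.median(toy, axis=0)
q1, q3 = np.percentile(toy, [25, 75], axis=0)
manual = (toy - median) / (q3 - q1)

robust = RobustScaler().fit_transform(toy)
standard = StandardScaler().fit_transform(toy)

print("matches RobustScaler:", np.allclose(manual, robust))
print("robust  :", robust.ravel())
print("standard:", standard.ravel())
```

With the standard scaler, the single outlier inflates the standard deviation and squeezes the four ordinary values close together; with the robust scaler they stay well separated.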
Logistic Regression with Feature Scaling Ensemble
While traditional feature scaling methods work well in many cases, they might not suffice when dealing with complex datasets containing features with vastly different scales or non-linear relationships. Feature scaling ensemble techniques offer a more sophisticated approach to address these challenges.
Ensemble methods combine multiple models with the aim of achieving better predictive performance than any of the individual models alone. Here, the approach aims to harness the strengths of different scaling methods while mitigating their respective weaknesses.
Classifiers 1, 2 and 3 are logistic regression models paired with Min-Max Scaler, Standard Scaler and Robust Scaler respectively. In this implementation, we aim to enhance the predictive performance of logistic regression by combining these three classifiers with a Voting Classifier. A logistic regression model may perform suboptimally when features have widely varying scales, and no single scaler is best for every feature. The feature scaling ensemble addresses this by training one model per scaling technique, each on all features, and aggregating their predictions.
The use of pipelines facilitates a streamlined workflow by encapsulating feature scaling and logistic regression modeling within a single entity. Pipelines ensure consistency in preprocessing steps across training and testing datasets, simplifying code maintenance and reducing the risk of data leakage.
# Define pipelines for different scaling techniques
minmax_pipeline = Pipeline([
    ('scaler', MinMaxScaler()),
    ('clf', LogisticRegression())
])
standard_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression())
])
robust_pipeline = Pipeline([
    ('scaler', RobustScaler()),
    ('clf', LogisticRegression())
])
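As a quick illustration of what a pipeline does when used on its own, the sketch below fits one scaler-plus-classifier pipeline and compares it with the equivalent manual steps. It reloads the data so that it runs standalone; the names `pipe`, `manual_model`, `pipeline_accuracy` and `manual_accuracy` are introduced here for illustration.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

x, y = load_breast_cancer(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42)

pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression())
])

# fit() learns the scaling statistics from the training split only
pipe.fit(x_train, y_train)

# score() reuses those statistics to transform x_test before predicting
pipeline_accuracy = pipe.score(x_test, y_test)

# Equivalent manual steps, for comparison
scaler = StandardScaler().fit(x_train)
manual_model = LogisticRegression().fit(scaler.transform(x_train), y_train)
manual_accuracy = manual_model.score(scaler.transform(x_test), y_test)

print(pipeline_accuracy, manual_accuracy)
```

Both routes give the same result; the pipeline simply removes the opportunity to fit the scaler on the test data by mistake.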
Utilizing Voting Classifier for Aggregation
To consolidate the predictions from the individual logistic regression models, each trained on differently scaled features, we employ a voting classifier. The voting classifier aggregates the predictions using either a hard or soft voting strategy. Hard voting selects the class predicted by the majority of the models, while soft voting averages the class probabilities from each model and selects the class with the highest average, providing a more nuanced decision.
- Hard Voting: Hard voting is effective when the models in the ensemble have relatively high individual accuracies and diverse decision boundaries.
- Soft Voting: Soft voting considers the confidence levels (probability scores) of individual model predictions, resulting in a more nuanced decision. This strategy is beneficial when the models in the ensemble provide probability estimates, enabling a finer-grained aggregation of predictions.
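The difference between the two strategies is easiest to see on a single sample where they disagree. The sketch below uses made-up probabilities for three hypothetical models; it mimics the aggregation rules with plain NumPy rather than calling `VotingClassifier`.

```python
import numpy as np

# Hypothetical probabilities of class 1 from three models for one sample
# (values made up for illustration)
proba_class1 = np.array([0.45, 0.45, 0.90])

# Hard voting: each model casts one vote for its most likely class
votes = (proba_class1 >= 0.5).astype(int)
hard_prediction = int(np.bincount(votes, minlength=2).argmax())

# Soft voting: average the probabilities, then pick the more likely class
mean_class1 = proba_class1.mean()
soft_prediction = int(mean_class1 >= 0.5)

print("votes:", votes)
print("hard voting prediction:", hard_prediction)
print("soft voting prediction:", soft_prediction)
```

Two of the three models lean slightly towards class 0, so hard voting picks class 0, while the one confident model pulls the averaged probability above 0.5 and soft voting picks class 1.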
# Initialize Voting Classifier with Logistic Regression models
estimators = [
    ('model_minmax', minmax_pipeline),
    ('model_standard', standard_pipeline),
    ('model_robust', robust_pipeline),
]
voting_classifier_hard = VotingClassifier(estimators=estimators, voting='hard')
voting_classifier_soft = VotingClassifier(estimators=estimators, voting='soft')
Training and predictions of the Voting Classifier
voting_classifier_hard.fit(x_train, y_train)
voting_classifier_soft.fit(x_train, y_train)
y_pred_voting_hard = voting_classifier_hard.predict(x_test)
y_pred_voting_soft = voting_classifier_soft.predict(x_test)
accuracy_voting_hard = accuracy_score(y_test, y_pred_voting_hard)
accuracy_voting_soft = accuracy_score(y_test, y_pred_voting_soft)
print("Accuracy of hard Voting Classifier:", accuracy_voting_hard)
print("Accuracy of soft Voting Classifier:", accuracy_voting_soft)
Output:
Accuracy of hard Voting Classifier: 0.9736842105263158
Accuracy of soft Voting Classifier: 0.9824561403508771
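The test split above holds 114 samples, so the reported accuracies differ from each other by only one or two predictions. One way to get a steadier comparison is cross-validation. The following is a sketch under that assumption, not part of the original workflow; the helper `scaled_logreg` and the dictionary names are hypothetical, and no particular ranking of the methods is implied.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler

x, y = load_breast_cancer(return_X_y=True)

def scaled_logreg(scaler):
    # Hypothetical helper: a scaler followed by logistic regression
    return Pipeline([('scaler', scaler), ('clf', LogisticRegression())])

estimators = [
    ('model_minmax', scaled_logreg(MinMaxScaler())),
    ('model_standard', scaled_logreg(StandardScaler())),
    ('model_robust', scaled_logreg(RobustScaler())),
]

# Individual pipelines plus the soft-voting ensemble built from them
candidates = dict(estimators)
candidates['voting_soft'] = VotingClassifier(estimators=estimators, voting='soft')

cv_results = {}
for name, estimator in candidates.items():
    # 5-fold cross-validation; scaling is re-fit inside every fold
    scores = cross_val_score(estimator, x, y, cv=5, scoring='accuracy')
    cv_results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.4f} +/- {scores.std():.4f}")
```

Comparing the fold-to-fold standard deviation with the gaps between methods gives a sense of whether a difference seen on a single split is meaningful.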
Logistic Regression and the Feature Scaling Ensemble
Logistic Regression is a widely used classification algorithm in machine learning. Its performance can often be improved further, especially when dealing with features of different scales, by employing feature scaling ensemble techniques.
In this guide, we take an in-depth look at logistic regression, its significance, and how feature scaling ensemble methods can improve its performance.