Model Development and Evaluation

Now is the time to train some state-of-the-art machine learning models(Logistic Regression, Support Vector Machine, XGBClassifier), and then based on their performance on the training and validation data we will choose which ML model is serving the purpose at hand better.

For the evaluation metric, we will use the ROC-AUC curve but why this is because instead of predicting the hard probability that is 0 or 1 we would like it to predict soft probabilities that are continuous values between 0 to 1. And with soft probabilities, the ROC-AUC curve is generally used to measure the accuracy of the predictions.

Python3




models = [LogisticRegression(), SVC(kernel='poly', probability=True), XGBClassifier()]
 
for i in range(3):
  models[i].fit(X_train, Y_train)
 
  print(f'{models[i]} : ')
  print('Training Accuracy : ', metrics.roc_auc_score(Y_train, models[i].predict_proba(X_train)[:,1]))
  print('Validation Accuracy : ', metrics.roc_auc_score(Y_valid, models[i].predict_proba(X_valid)[:,1]))
  print()


Output:

Performance of different state-of-the-art models.

Among the three models, we have trained XGBClassifier has the highest performance but it is pruned to overfitting as the difference between the training and the validation accuracy is too high. But in the case of the Logistic Regression, this is not the case.

Now let’s plot a confusion matrix for the validation data.

Python3




metrics.plot_confusion_matrix(models[0], X_valid, Y_valid)
plt.show()


Output:

Confusion matrix for the validation data

Bitcoin Price Prediction using Machine Learning in Python

Machine learning proves immensely helpful in many industries in automating tasks that earlier required human labor one such application of ML is predicting whether a particular trade will be profitable or not.

In this article, we will learn how to predict a signal that indicates whether buying a particular stock will be helpful or not by using ML.

Let’s start by importing some libraries which will be used for various purposes which will be explained later in this article.

Similar Reads

Importing Libraries

Python libraries make it very easy for us to handle the data and perform typical and complex tasks with a single line of code....

Importing Dataset

...

Exploratory Data Analysis

The dataset we will use here to perform the analysis and build a predictive model is Bitcoin Price data. We will use OHLC(‘Open’, ‘High’, ‘Low’, ‘Close’) data from 17th July 2014 to 29th December 2022 which is for 8 years for the Bitcoin price....

Feature Engineering

...

Model Development and Evaluation

...

Conclusion:

...

Contact Us