Visualization

Now let us visualize the data with pie charts and histograms to better understand it.

Let us first look at how many passengers survived and how many died.

Python3




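# One figure with two panels: pie chart (left) and count plot (right)
f, ax = plt.subplots(1, 2, figsize=(12, 4))
# Share of each class; the second slice is pulled out slightly via explode
train['Survived'].value_counts().plot.pie(
    explode=[0, 0.1], autopct='%1.1f%%', ax=ax[0], shadow=False)
ax[0].set_title('Survivors (1) and the dead (0)')
ax[0].set_ylabel('')
# Pass the column as x=; newer seaborn releases no longer accept it positionally
sns.countplot(x='Survived', data=train, ax=ax[1])
ax[1].set_ylabel('Quantity')
ax[1].set_title('Survivors (1) and the dead (0)')
plt.show()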
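
The numbers behind the two plots can also be checked directly. The following is a minimal sketch, assuming a tiny made-up DataFrame named `sample` as a stand-in for `train` (its values are illustrative and not taken from the Kaggle data):

```python
import pandas as pd

# Hypothetical stand-in for the `train` DataFrame; values are made up.
sample = pd.DataFrame({'Survived': [0, 0, 0, 1, 1]})

# Absolute counts per class (the heights of the bars in the count plot).
counts = sample['Survived'].value_counts()

# Class shares in percent (the labels the pie chart prints via autopct).
shares = sample['Survived'].value_counts(normalize=True) * 100

print(counts)
print(shares.round(1))
```

Running the same two `value_counts()` calls on `train['Survived']` should give the figures shown in the charts.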


 

Sex feature

Python3




f, ax = plt.subplots(1, 2, figsize=(12, 4))
# Mean of the 0/1 Survived column per sex, i.e. the survival rate
train[['Sex', 'Survived']].groupby(['Sex']).mean().plot.bar(ax=ax[0])
ax[0].set_title('Survivors by sex')
# Pass the column as x=; newer seaborn releases no longer accept it positionally
sns.countplot(x='Sex', hue='Survived', data=train, ax=ax[1])
ax[1].set_ylabel('Quantity')
ax[1].set_title('Survived (1) and deceased (0): men and women')
plt.show()
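
The bar chart works because the mean of a 0/1 column equals the fraction of ones. A short sketch of that idea, again assuming a made-up DataFrame `sample` in place of `train` (the rows are illustrative only):

```python
import pandas as pd

# Hypothetical stand-in for `train`; values are made up for illustration.
sample = pd.DataFrame({
    'Sex': ['male', 'male', 'male', 'female', 'female'],
    'Survived': [0, 0, 1, 1, 1],
})

# Mean of a 0/1 column is the fraction of ones, i.e. the survival rate.
rate_by_sex = sample.groupby('Sex')['Survived'].mean()

# Cross-tabulated counts: the numbers behind the hue-split count plot.
counts = pd.crosstab(sample['Sex'], sample['Survived'])

print(rate_by_sex)
print(counts)
```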


 

Titanic Survival Prediction Using Machine Learning

In this article, we will learn to predict whether a Titanic passenger survived, using the information available about each passenger, such as sex and age. Since this is a classification task, we will use a random forest classifier.

There will be three main steps in this experiment:

  • Feature Engineering
  • Imputation
  • Training and Prediction
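
The three steps above can be sketched end to end as follows. This is only an illustration under stated assumptions, not the article's actual pipeline: the rows in `toy_train` and `new_passengers` are made up, and the helper `prepare`, the `IsFemale` column, and the median-based age fill are hypothetical choices.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Made-up rows with Titanic-style columns; not real passenger data.
toy_train = pd.DataFrame({
    'Sex': ['male', 'female', 'female', 'male', 'male', 'female'],
    'Age': [22.0, 38.0, None, 35.0, None, 27.0],
    'Survived': [0, 1, 1, 0, 0, 1],
})


def prepare(df, age_fill):
    """Hypothetical helper: build the model inputs from raw columns."""
    out = pd.DataFrame(index=df.index)
    # Feature engineering: encode sex as a 0/1 column.
    out['IsFemale'] = (df['Sex'] == 'female').astype(int)
    # Imputation: fill missing ages with a value computed on the training set.
    out['Age'] = df['Age'].fillna(age_fill)
    return out


# The fill value comes from the training data only, then is reused later.
age_median = toy_train['Age'].median()
X = prepare(toy_train, age_median)
y = toy_train['Survived']

# Training: fit a random forest on the prepared features.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X, y)

# Prediction: apply the same preparation to unseen passengers.
new_passengers = pd.DataFrame({'Sex': ['female', 'male'], 'Age': [30.0, None]})
predictions = model.predict(prepare(new_passengers, age_median))
print(predictions)
```

Computing the fill value on the training set and reusing it for new data keeps the preparation step identical for both, which is the main design point of the sketch.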


Dataset

The dataset for this experiment is freely available on Kaggle at https://www.kaggle.com/competitions/titanic/data?select=train.csv. The download contains three CSV files: gender_submission.csv, train.csv, and test.csv...
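
Once the files are downloaded, they can be read with pandas. The sketch below assumes the CSV files sit together in one local folder; the function name `load_titanic` and the `data_dir` parameter are hypothetical and only serve the illustration.

```python
from pathlib import Path

import pandas as pd


def load_titanic(data_dir):
    """Read train.csv and test.csv from data_dir (an assumed local folder)."""
    data_dir = Path(data_dir)
    train = pd.read_csv(data_dir / 'train.csv')
    test = pd.read_csv(data_dir / 'test.csv')
    return train, test


# Example usage, assuming the files were saved to the current directory:
# train, test = load_titanic('.')
# print(train.shape, test.shape)
```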

Importing Libraries and Initial setup

Python3

import warnings
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

plt.style.use('fivethirtyeight')
%matplotlib inline
warnings.filterwarnings('ignore')
...

