Installation of Pandas Profiling

Pandas Profiling can be installed with pip using the following command:

pip install pandas-profiling

The pandas_profiling library in Python provides a class named ProfileReport, which generates a basic report on the input DataFrame.

The report consists of the following:

  • An overview of the DataFrame,
  • A description of each attribute (column) of the DataFrame,
  • Correlations between attributes (Pearson correlation and Spearman correlation), and
  • A sample of the DataFrame.
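To make the difference between the two correlation measures concrete, here is a minimal pure-Python sketch. It is not the library's code: the function names `pearson`, `ranks`, and `spearman` and the sample data are hypothetical and used only for illustration. Pearson measures linear association, while Spearman is Pearson applied to the ranks of the values, so it captures any monotonic relationship.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Assign ranks 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: a monotonic but non-linear relationship.
hours = [1, 2, 3, 4, 5, 6]
score = [h ** 3 for h in hours]

print("Pearson: ", round(pearson(hours, score), 3))
print("Spearman:", round(spearman(hours, score), 3))
```

Because `score` always increases with `hours`, the Spearman value reaches 1, while the Pearson value stays below 1 since the relationship is not a straight line.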

Syntax:

pandas_profiling.ProfileReport(df, **kwargs)

| Argument | Type | Description |
|---|---|---|
| df | DataFrame | Data to be analyzed. |
| bins | int | Number of bins in the histogram. The default is 10. |
| check_correlation | boolean | Whether or not to check correlation. It’s `True` by default. |
| correlation_threshold | float | Threshold to determine if a variable pair is correlated. The default is 0.9. |
| correlation_overrides | list | Variable names not to be rejected because they are correlated. The list is empty (`None`) by default. |
| check_recoded | boolean | Whether or not to check recoded correlation (a memory-heavy feature). Since it’s an expensive computation, it is best activated only for small datasets. `check_correlation` must be `True` for this check to run. It’s `False` by default. |
| pool_size | int | Number of workers in the thread pool. The default is equal to the number of CPUs. |
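The interplay between correlation_threshold and correlation_overrides can be sketched as follows. This is a simplified illustration of the idea described in the table, not the library's implementation: the function `rejected_variables`, the variable names, and the correlation values are all hypothetical, and treating strong negative correlations as "correlated" (via the absolute value) is an assumption of this sketch.

```python
def rejected_variables(correlations, threshold=0.9, overrides=None):
    """Return variables that would be flagged as highly correlated.

    correlations maps (kept_variable, candidate_variable) to a coefficient.
    A candidate is flagged when the coefficient exceeds the threshold,
    unless its name appears in overrides.
    """
    overrides = set(overrides or [])
    rejected = []
    for (_kept, candidate), value in correlations.items():
        if candidate in overrides:
            continue  # explicitly protected from rejection
        if abs(value) > threshold and candidate not in rejected:
            rejected.append(candidate)
    return rejected


# Hypothetical pairwise correlations for illustration only.
pairs = {
    ("height_cm", "height_in"): 0.999,
    ("height_cm", "weight"): 0.62,
    ("price", "price_with_tax"): 0.97,
}

print(rejected_variables(pairs))
print(rejected_variables(pairs, overrides=["price_with_tax"]))
print(rejected_variables(pairs, threshold=0.5))
```

Lowering the threshold flags more variables, while listing a name in the overrides keeps it in the report regardless of how strongly it correlates with another variable.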

Now, let’s work through an example: we will create our own DataFrame and see how pandas profiling helps us understand the dataset better. Before that, let us import pandas_profiling.
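A minimal sketch of these two steps might look like the following. The sample data and the helper name `build_report` are hypothetical. The imports are placed inside the function so the snippet still loads when the third-party packages are missing, and the fallback to `ydata_profiling` reflects the fact that newer releases of the package are published under that name; which one is available depends on your installation.

```python
# Hypothetical sample data; the column names and values are illustrative only.
sample_data = {
    "name": ["Asha", "Ben", "Chen", "Dina"],
    "age": [23, 35, 31, None],          # one missing value on purpose
    "city": ["Pune", "Leeds", "Wuhan", "Cairo"],
    "salary": [42000.0, 58000.0, 51000.0, 47000.0],
}


def build_report(data):
    """Create a DataFrame from a dict of columns and profile it."""
    # Third-party imports live here so that importing this module does not
    # fail when pandas or the profiling package is not installed.
    import pandas as pd

    try:
        from pandas_profiling import ProfileReport
    except ImportError:
        # Assumption: a newer installation that ships the renamed package.
        from ydata_profiling import ProfileReport

    df = pd.DataFrame(data)
    return ProfileReport(df)


# Usage (requires pandas and the profiling package to be installed):
# report = build_report(sample_data)
```

Only `ProfileReport(df)` is used here, since the optional keyword arguments listed above may not be accepted by every version of the library.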

Pandas Profiling in Python

Pandas is a vast library that offers many functions for understanding data, but exploring a dataset with numerous features one function at a time quickly becomes tedious. Pandas profiling addresses this by generating comprehensive reports for such datasets. These reports can be customized according to specific requirements. In this article, we will dive into this library’s functionalities and explore its various features, such as:

  • Installation of Pandas Profiling
  • Importing Pandas Profiling
  • Generating Profile Report
  • Exploring Profile Report Generated
    • Overview
    • Variables
    • Correlations
    • Missing Values
    • Sample
  • Saving the Profile Report
