Installation of Pandas Profiling
Pandas Profiling can be installed with the following command:
pip install pandas-profiling
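To confirm that the installation worked, a quick check like the one below can help. It is only a sketch: it relies on the standard-library function `importlib.util.find_spec`, and the helper name `is_installed` is our own, not part of Pandas Profiling.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be located (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


# The pip package is "pandas-profiling", but the importable module is "pandas_profiling".
print("pandas_profiling installed:", is_installed("pandas_profiling"))
```

If this prints `False`, the package was most likely installed into a different Python environment than the one running the script.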
The pandas_profiling library in Python provides a class named ProfileReport, which generates a basic report on the input DataFrame.
The report consists of the following:
- DataFrame overview,
- Details of each attribute (column) of the DataFrame,
- Correlations between attributes (Pearson Correlation and Spearman Correlation), and
- A sample of DataFrame.
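The two correlation measures named in the list can also be computed directly with pandas, which helps in interpreting that part of the report. The snippet below is a small illustration on made-up data (the columns `x` and `y` are assumptions); it does not show how the report itself computes them.

```python
import pandas as pd

# Small illustrative dataset: y grows monotonically, but not linearly, with x.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [1, 4, 9, 16, 25]})

pearson = df.corr(method="pearson")    # strength of linear association
spearman = df.corr(method="spearman")  # strength of monotonic (rank-based) association

print("Pearson: ", pearson.loc["x", "y"])
print("Spearman:", spearman.loc["x", "y"])
```

Here the Spearman correlation is 1 because the ranks of `x` and `y` agree exactly, while the Pearson correlation is slightly below 1 because the relationship is not a straight line.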
Syntax:
pandas_profiling.ProfileReport(df, **kwargs)
| Arguments | Type | Description |
|---|---|---|
| df | DataFrame | Data to be analyzed. |
| bins | int | Number of bins in the histogram. The default is 10. |
| check_correlation | boolean | Whether or not to check correlation. It is `True` by default. |
| correlation_threshold | float | Threshold used to decide whether a pair of variables is correlated. The default is 0.9. |
| correlation_overrides | list | Names of variables that should not be rejected for being correlated. The list is empty (`None`) by default. |
| check_recoded | boolean | Whether or not to check recoded correlation (a memory-heavy feature). Because the computation is expensive, it is best activated only for small datasets. `check_correlation` must be `True` for this check to run. It is `False` by default. |
| pool_size | int | Number of workers in the thread pool. The default is equal to the number of CPUs. |
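To make the `correlation_threshold` argument more concrete, the sketch below flags column pairs whose absolute Pearson correlation exceeds 0.9 using plain pandas. It illustrates the idea only and is not the library's internal logic; the helper name `highly_correlated_pairs` and the sample data are assumptions.

```python
import pandas as pd


def highly_correlated_pairs(df: pd.DataFrame, threshold: float = 0.9):
    """Return (col_a, col_b, corr) for numeric column pairs with |corr| above threshold."""
    corr = df.select_dtypes(include="number").corr(method="pearson")
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:  # visit each unordered pair once
            value = corr.loc[a, b]
            if abs(value) > threshold:
                pairs.append((a, b, float(value)))
    return pairs


df = pd.DataFrame({
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],  # nearly a rescaled copy of height_cm
    "age": [31, 22, 45, 28, 39],
})
print(highly_correlated_pairs(df))
```

Only the `height_cm` and `height_in` pair is flagged, since one is essentially a unit conversion of the other; `age` is only weakly correlated with either.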
Now let's work through an example: we will create our own DataFrame and see how Pandas Profiling helps us understand the dataset better. Before that, let us import pandas_profiling.
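A minimal sketch of such an example follows. The column names and values are made up for illustration, and the output file name `report.html` is likewise an assumption. The import of `pandas_profiling` is deferred into the function `make_report` (a hypothetical helper) so that the DataFrame can be built and inspected even where the package is not available.

```python
import pandas as pd


def build_example_frame() -> pd.DataFrame:
    """Create a small DataFrame with text, numeric, and missing values."""
    return pd.DataFrame({
        "name": ["Asha", "Ben", "Chen", "Dina", "Eli"],
        "age": [23, 35, None, 41, 29],
        "salary": [42000.0, 58000.0, 61000.0, None, 50000.0],
    })


def make_report(df: pd.DataFrame):
    """Build a profile report for df (requires pandas-profiling to be installed)."""
    from pandas_profiling import ProfileReport  # imported lazily on purpose
    return ProfileReport(df)


df = build_example_frame()
print(df.isna().sum())  # missing values per column

# With pandas-profiling installed, the report can then be generated and saved:
# report = make_report(df)
# report.to_file("report.html")
```

The two deliberately missing values give the report something to show in its missing-values section.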
Pandas Profiling in Python
Pandas is a vast library that offers many functions for understanding data, but exploring a dataset with numerous features one function at a time quickly becomes tedious. Pandas Profiling addresses this by generating comprehensive reports for such datasets. These reports can be customized according to specific requirements. In this article, we will dive into the library's functionality and cover the following topics:
- Installation of Pandas Profiling
- Importing Pandas Profiling
- Generating Profile Report
- Exploring Profile Report Generated
  - Overview
  - Variables
  - Correlations
  - Missing Values
  - Sample
- Saving the Profile Report