Flattening hierarchical index in pandas dataframe using groupby


When we call groupby on a single column and apply several aggregation functions, the columns of the result form a hierarchical index (a MultiIndex) with two levels: the original column name and the aggregation applied to it. To flatten it, both levels have to be combined into a single column name.
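As a minimal sketch of where the two levels come from, the snippet below aggregates a small dataframe and inspects the resulting columns. The sample values are hypothetical and only the column names follow the example used later in this article.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the article's example.
data = pd.DataFrame({
    "cars": ["bmw", "bmw", "benz", "benz"],
    "sale_q1 in Cr": [20, 22, 46, 25],
    "sale_q2 in Cr": [25, 27, 38, 30],
})

# One grouping column, several aggregations per value column.
grouped = data.groupby("cars").agg(
    {"sale_q1 in Cr": ["sum", "max"], "sale_q2 in Cr": ["sum", "min"]}
)

# The columns are now a two-level MultiIndex of
# (original column, aggregation name) tuples.
print(type(grouped.columns).__name__)
print(grouped.columns.nlevels)
print(list(grouped.columns))
```

Each tuple pairs a source column with one aggregation, which is why a flattening step has to deal with both levels.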

Syntax: pandas.DataFrame.groupby(by=None, axis=0, level=None)

Parameters:

by – mapping, function, label, or list of labels that determines the groups.

axis – 0 splits along rows and 1 splits along columns. This parameter is deprecated since pandas 2.1.

level – if the axis is a MultiIndex, groups by the specified level or levels (int or level name).
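To make the difference between by and level concrete, here is a small sketch that groups the same hypothetical data first on a column and then on an index level. The dataframe and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame(
    {"cars": ["bmw", "benz", "bmw", "benz"], "sales": [10, 20, 30, 40]}
)

# by: group on the values of a column.
by_column = df.groupby(by="cars")["sales"].sum()

# level: group on an index level instead of a column.
indexed = df.set_index("cars")
by_level = indexed.groupby(level=0)["sales"].sum()

print(by_column.to_dict())
print(by_level.to_dict())
```

Both calls should produce the same totals per car, since the grouping keys are identical and only their location (column versus index) differs.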

Syntax: pandas.DataFrame.agg(func=None, axis=0)

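Parameters:

func – the function or functions to use for aggregation: a callable, a function name as a string ("min", "max", "sum", etc.), a list of these, or a dict mapping column labels to them.

axis – 0 applies the function to each column and 1 applies it to each row.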
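The forms that func can take are easier to compare side by side. The sketch below uses a tiny hypothetical dataframe (the names q1 and q2 are placeholders) and calls agg directly on the dataframe rather than on a groupby object.

```python
import pandas as pd

# Hypothetical data; "q1" and "q2" are placeholder column names.
df = pd.DataFrame({"q1": [1, 2, 3], "q2": [10, 20, 30]})

# A single function name: one value per column.
col_totals = df.agg("sum")

# A list of function names: one row per function.
several = df.agg(["sum", "max"])

# A dict: a different function for each column.
per_column = df.agg({"q1": "sum", "q2": "min"})

# axis=1: apply the function across each row instead.
row_totals = df.agg("sum", axis=1)

print(col_totals, several, per_column, row_totals, sep="\n\n")
```

On a groupby object, the list and dict forms are the ones that produce hierarchical columns, because each column can then carry more than one aggregation.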

Example

Import the pandas package. Create a sample dataframe showing the car sales in two quarters, q1 and q2. Now use the pandas groupby function to group by car and compute the sum and max of sales in quarter 1 and the sum and min of sales in quarter 2. The grouped dataframe has multi-indexed columns stored as a list of tuples. Use a for loop to iterate through the list of tuples and join each tuple into a single string. Append the joined strings to the flat_cols list. Now assign the flat_cols list to the columns of the grouped dataframe.

Python3

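import pandas as pd

# sample car sales for two quarters
# (illustrative values, not from the original article)
data = pd.DataFrame({
    "cars": ["bmw", "bmw", "benz", "benz"],
    "sale_q1 in Cr": [20, 22, 46, 25],
    "sale_q2 in Cr": [25, 27, 38, 30]})

# group by cars and compute
# the sum and max of sales in quarter 1
# and the sum and min of sales in quarter 2
grouped_data = data.groupby(by="cars").agg(
    {"sale_q1 in Cr": ["sum", "max"],
     "sale_q2 in Cr": ["sum", "min"]})
 
# create an empty list to save the
# names of the flattened columns
flat_cols = []
 
# iterate through the column tuples and
# join each one into a single string
for i in grouped_data.columns:
    flat_cols.append(i[0] + '_' + i[1])
 
# now assign the list of flattened
# columns to the grouped columns
grouped_data.columns = flat_cols
 
# print the grouped data
print(grouped_data)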
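The loop above can also be written as a small reusable helper. The sketch below is one possible variant, not the article's method: flatten_columns is a hypothetical name, the sample values are made up, and the helper converts each level to a string so that non-string level values do not break the join.

```python
import pandas as pd

def flatten_columns(df, sep="_"):
    """Return a copy of df whose MultiIndex columns are joined into strings.

    Hypothetical helper for illustration; not part of pandas.
    """
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex):
        out.columns = [
            # Skip empty level values so names do not end in a separator.
            sep.join(str(level) for level in col if str(level) != "")
            for col in out.columns
        ]
    return out

# Hypothetical sample data; only the column names follow the article's example.
data = pd.DataFrame({
    "cars": ["bmw", "bmw", "benz", "benz"],
    "sale_q1 in Cr": [20, 22, 46, 25],
    "sale_q2 in Cr": [25, 27, 38, 30],
})
grouped = data.groupby("cars").agg(
    {"sale_q1 in Cr": ["sum", "max"], "sale_q2 in Cr": ["sum", "min"]}
)

flat = flatten_columns(grouped)
print(list(flat.columns))
```

Working on a copy leaves the grouped dataframe untouched, which can be convenient when the hierarchical form is still needed elsewhere.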


How to flatten a hierarchical index in Pandas DataFrame columns?

In this article, we look at how to flatten a hierarchical index in the columns of a Pandas DataFrame. A hierarchical index usually arises from groupby() aggregations: the name of each aggregation function used appears as a level in the column index of the resulting dataframe.
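When the goal is simply to avoid the extra level, one alternative is pandas named aggregation, where each output column is given a name up front. This is a different technique from flattening after the fact; the sample values and the output names (q1_sum and so on) below are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the article's example.
data = pd.DataFrame({
    "cars": ["bmw", "bmw", "benz", "benz"],
    "sale_q1 in Cr": [20, 22, 46, 25],
    "sale_q2 in Cr": [25, 27, 38, 30],
})

# Named aggregation: output_name=(source column, aggregation).
# The output names here are illustrative choices.
grouped = data.groupby("cars").agg(
    q1_sum=("sale_q1 in Cr", "sum"),
    q1_max=("sale_q1 in Cr", "max"),
    q2_sum=("sale_q2 in Cr", "sum"),
    q2_min=("sale_q2 in Cr", "min"),
)

# The columns are a plain Index, so there is nothing left to flatten.
print(list(grouped.columns))
```

This keeps the aggregation and the naming in one place, at the cost of listing every output column explicitly.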
