Flattening hierarchical columns using join() and rstrip()
In this example, we use the string methods join() and rstrip() to flatten the columns. When a grouped dataframe is aggregated with several functions per column, the resulting columns form a hierarchical index (MultiIndex), and columns.values exposes them as an array of tuples, one tuple per column.
Syntax: str.join(iterable)
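Explanation: Returns a single string made by concatenating the strings in iterable, with str placed between them as the separator. Raises a TypeError if iterable contains any non-string value.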
Syntax: str.rstrip([chars])
Explanation: Returns a copy of the string with trailing (rightmost) characters removed. chars is the set of characters to strip; if it is omitted or None, trailing whitespace is removed.
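To make the two methods concrete, here is a minimal sketch applied to the kind of tuples a two-level column header produces. The tuples are written by hand for illustration; they are not read from a dataframe.

```python
# A column label from a two-level header is a tuple of strings.
label = ("sale_q1 in Cr", "sum")
flat = "_".join(label)          # the levels are glued together with "_"

# A label whose second level is empty leaves a trailing underscore,
# which rstrip("_") removes.
key_label = ("cars", "")
joined = "_".join(key_label)
cleaned = joined.rstrip("_")

# join() only accepts strings; any other element raises a TypeError.
try:
    "_".join(("sale_q1 in Cr", 1))
    raised = False
except TypeError:
    raised = True

print(flat)
print(joined)
print(cleaned)
print(raised)
```

Note that rstrip("_") strips every trailing underscore, so a column name that legitimately ends in an underscore would also be shortened.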
Code:
Here, we iterate through these tuples, join the column name and the aggregation function name of each tuple with an underscore, and collect the flattened names in a list. This list is then assigned to the columns attribute of the grouped dataframe.
Python3
# Group by "cars": sum and max of quarter-1 sales,
# sum and min of quarter-2 sales.
# `data` is assumed to be the car sales dataframe used in this article.
grouped_data = data.groupby(by="cars").agg(
    {"sale_q1 in Cr": ["sum", "max"], "sale_q2 in Cr": ["sum", "min"]}
)

# Use join() and rstrip() to flatten the hierarchical columns.
# rstrip("_") only has an effect when a tuple's last level is empty.
grouped_data.columns = [
    "_".join(col).rstrip("_") for col in grouped_data.columns.values
]
print(grouped_data)
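Output: the printed dataframe is indexed by cars and has four single-level columns: sale_q1 in Cr_sum, sale_q1 in Cr_max, sale_q2 in Cr_sum, and sale_q2 in Cr_min.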
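The rstrip("_") step matters mainly when a column tuple has an empty second level, for instance after reset_index() turns the group key back into a column. The sketch below shows this on a small made-up dataframe; the values and the name sample are hypothetical and are not the article's data.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
sample = pd.DataFrame({
    "cars": ["bmw", "bmw", "audi", "audi"],
    "sale_q1 in Cr": [20, 22, 15, 18],
    "sale_q2 in Cr": [25, 24, 17, 19],
})

# Aggregate, then move the group key from the index into the columns.
# Under a two-level header the key becomes the tuple ("cars", "").
grouped = (
    sample.groupby(by="cars")
    .agg({"sale_q1 in Cr": ["sum", "max"], "sale_q2 in Cr": ["sum", "min"]})
    .reset_index()
)

# Without rstrip("_") the first name would be "cars_" instead of "cars".
flat_columns = ["_".join(col).rstrip("_") for col in grouped.columns.values]
grouped.columns = flat_columns
print(grouped)
```

After this step, columns can be selected with plain strings such as grouped["sale_q1 in Cr_sum"] instead of tuples.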
How to flatten a hierarchical index in Pandas DataFrame columns?
In this article, we will see how to flatten a hierarchical index in Pandas DataFrame columns. A hierarchical index usually arises from groupby() aggregation: when several aggregation functions are applied, the name of each function appears as a second level in the column index of the resulting dataframe.
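As a quick way to see where the hierarchy comes from, the following sketch aggregates one column with two functions and inspects the resulting column index. The dataframe and the name sample are hypothetical, invented for the illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
sample = pd.DataFrame({
    "cars": ["bmw", "bmw", "audi"],
    "sale_q1 in Cr": [20, 22, 15],
})

# Two aggregation functions on one column give a two-level header.
grouped = sample.groupby("cars").agg({"sale_q1 in Cr": ["sum", "max"]})

is_hierarchical = isinstance(grouped.columns, pd.MultiIndex)
n_levels = grouped.columns.nlevels
labels = grouped.columns.tolist()  # one (column, function) tuple per column

print(is_hierarchical)
print(n_levels)
print(labels)
```

Each tuple pairs the original column name with the aggregation function, which is exactly the information that flattening merges into a single string.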