How to Stack Multiple Pandas DataFrames?
In this article, we will see how to stack multiple Pandas DataFrames. Stacking means appending the rows of the second DataFrame to the first, then the rows of the third, and so on. If there are four DataFrames, the result is a single DataFrame whose rows appear in the order dataframe1, dataframe2, dataframe3, dataframe4.
Pandas Concat DataFrame
Concatenating two Pandas DataFrames refers to the process of combining them into a single DataFrame. This is a powerful technique for combining data from different sources or different periods into one easy-to-analyze dataset.
There are two main ways to concatenate DataFrames in Pandas:
1. Using pd.concat()
This is the most flexible and widely used method. You can specify the axis along which to concatenate the frames (0 for rows, 1 for columns), and you can control how indexes are handled, for example by keeping the original index labels or discarding them with ignore_index=True.
Syntax: pandas.concat([first_dataframe, second_dataframe, third_dataframe, …, last_dataframe], ignore_index=True, axis=0)
Parameters:
- dataframes are the input dataframes to be stacked
- ignore_index is used to ignore the index values of the input dataframes
- axis=0 specifies vertical stacking
- axis=1 specifies horizontal stacking
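Note: If ignore_index is not set to True, the result keeps the index labels of the input DataFrames. When stacking vertically, the rows still appear in the correct order, but the labels repeat (each input starts again at 0), which can give unexpected results when rows are later selected by label.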
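To make the effect of ignore_index concrete, here is a minimal sketch. The frames left and right and their values are made up for illustration and are not part of the examples in this article.

```python
import pandas as pd

# Two tiny illustrative frames (hypothetical data)
left = pd.DataFrame({'name': ['a', 'b'], 'subjects': ['java', 'php']})
right = pd.DataFrame({'name': ['c', 'd'], 'subjects': ['dbms', 'IOT']})

# Default behaviour: each frame keeps its own index labels, so 0 and 1 repeat
kept = pd.concat([left, right], axis=0)
print(kept.index.tolist())        # [0, 1, 0, 1]

# ignore_index=True discards the old labels and numbers the rows 0..n-1
renumbered = pd.concat([left, right], axis=0, ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]

# The rows themselves are in the same order in both results
print(kept['name'].tolist() == renumbered['name'].tolist())  # True
```

With the repeated labels, a lookup such as kept.loc[0] returns two rows instead of one, which is usually the reason to pass ignore_index=True when stacking.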
Concatenate Two Pandas DataFrames Vertically using concat()
In this example, two DataFrames (data1 and data2) are created using the pd.DataFrame() constructor, each containing columns ‘name’ and ‘subjects’ with corresponding data. The pd.concat() function is used to concatenate the two DataFrames vertically (axis=0). The ignore_index=True parameter is set to reset the index of the resulting DataFrame.
Python3
import pandas as pd

# create first dataframe
data1 = pd.DataFrame({'name': ['sravan', 'bobby', 'ojaswi', 'rohith', 'gnanesh'],
                      'subjects': ['java', 'python', 'php', 'java', '.NET']})

# create second dataframe
data2 = pd.DataFrame({'name': ['gopi', 'harsha', 'ravi', 'uma', 'deepika'],
                      'subjects': ['c/c++', 'html/css', 'dbms', 'java', 'IOT']})

# stack the two DataFrames
print(pd.concat([data1, data2], ignore_index=True, axis=0))
Output:
name subjects
0 sravan java
1 bobby python
2 ojaswi php
3 rohith java
4 gnanesh .NET
5 gopi c/c++
6 harsha html/css
7 ravi dbms
8 uma java
9 deepika IOT
Concatenate Multiple DataFrames vertically in Pandas using pandas.concat()
In this example, we concatenate multiple DataFrames vertically. Four DataFrames (data1, data2, data3, and data4) are created using the pd.DataFrame() constructor. Each DataFrame contains ‘name’ and ‘subjects’ columns with corresponding data. The pd.concat() function is used to concatenate the four DataFrames vertically (axis=0). The ignore_index=True parameter is set to reset the index of the resulting DataFrame.
Python3
import pandas as pd

# create first dataframe
data1 = pd.DataFrame({'name': ['sravan', 'bobby', 'ojaswi', 'rohith', 'gnanesh'],
                      'subjects': ['java', 'python', 'php', 'java', '.NET']})

# create second dataframe
data2 = pd.DataFrame({'name': ['gopi', 'harsha', 'ravi', 'uma', 'deepika'],
                      'subjects': ['c/c++', 'html/css', 'dbms', 'java', 'IOT']})

# create third dataframe
data3 = pd.DataFrame({'name': ['ragini', 'latha'],
                      'subjects': ['java', 'python']})

# create fourth dataframe
data4 = pd.DataFrame({'name': ['gowri', 'jyothika'],
                      'subjects': ['java', 'IOT']})

# stack the four DataFrames
print(pd.concat([data1, data2, data3, data4], ignore_index=True, axis=0))
Output:
name subjects
0 sravan java
1 bobby python
2 ojaswi php
3 rohith java
4 gnanesh .NET
5 gopi c/c++
6 harsha html/css
7 ravi dbms
8 uma java
9 deepika IOT
10 ragini java
11 latha python
12 gowri java
13 jyothika IOT
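The same call works for any number of DataFrames, because pd.concat() takes the whole list at once. The following is a small sketch under the assumption that the frames have already been collected in a list; the name batches and the shortened data are illustrative only.

```python
import pandas as pd

# Hypothetical batches; in practice these might come from separate files or queries
batches = [
    pd.DataFrame({'name': ['sravan', 'bobby'], 'subjects': ['java', 'python']}),
    pd.DataFrame({'name': ['gopi'], 'subjects': ['c/c++']}),
    pd.DataFrame({'name': ['ragini', 'latha'], 'subjects': ['java', 'python']}),
]

# One call stacks every frame in the list; rows appear in list order
stacked = pd.concat(batches, ignore_index=True)

print(len(stacked))              # 5
print(stacked['name'].tolist())  # ['sravan', 'bobby', 'gopi', 'ragini', 'latha']
```

Collecting the frames first and concatenating once is generally preferable to concatenating inside a loop, since each call to pd.concat() builds a new DataFrame.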
Concatenating DataFrames horizontally in Pandas using concat()
In this example, four DataFrames (data1, data2, data3, and data4) are created using the pd.DataFrame() constructor. Each DataFrame contains ‘name’ and ‘subjects’ columns with corresponding data. The pd.concat() function is used to concatenate the four DataFrames horizontally (axis=1). This means the columns are placed side by side. Rows are matched on their index labels, so the shorter DataFrames are padded with NaN. With axis=1, the ignore_index=True parameter does not reset the row index; it replaces the column labels with the integers 0 to 7, as the output shows.
Python3
import pandas as pd

# create first dataframe
data1 = pd.DataFrame({'name': ['sravan', 'bobby', 'ojaswi', 'rohith', 'gnanesh'],
                      'subjects': ['java', 'python', 'php', 'java', '.NET']})

# create second dataframe
data2 = pd.DataFrame({'name': ['gopi', 'harsha', 'ravi', 'uma', 'deepika'],
                      'subjects': ['c/c++', 'html/css', 'dbms', 'java', 'IOT']})

# create third dataframe
data3 = pd.DataFrame({'name': ['ragini', 'latha'],
                      'subjects': ['java', 'python']})

# create fourth dataframe
data4 = pd.DataFrame({'name': ['gowri', 'jyothika'],
                      'subjects': ['java', 'IOT']})

# stack the four DataFrames horizontally
print(pd.concat([data1, data2, data3, data4], axis=1, ignore_index=True))
Output:
0 1 2 3 4 5 6 7
0 sravan java gopi c/c++ ragini java gowri java
1 bobby python harsha html/css latha python jyothika IOT
2 ojaswi php ravi dbms NaN NaN NaN NaN
3 rohith java uma java NaN NaN NaN NaN
4 gnanesh .NET deepika IOT NaN NaN NaN NaN
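For comparison, here is a brief sketch of horizontal concatenation without ignore_index, using two shortened, illustrative frames (left and right are hypothetical names). It shows that the original column names are kept and that rows are matched on their index labels.

```python
import pandas as pd

# Hypothetical frames of different lengths
left = pd.DataFrame({'name': ['sravan', 'bobby', 'ojaswi'],
                     'subjects': ['java', 'python', 'php']})
right = pd.DataFrame({'name': ['ragini', 'latha'],
                      'subjects': ['java', 'python']})

# Without ignore_index the original column names are kept, so they repeat
side_by_side = pd.concat([left, right], axis=1)
print(side_by_side.columns.tolist())  # ['name', 'subjects', 'name', 'subjects']

# Rows are matched on index label; label 2 exists only in `left`,
# so the two columns that came from `right` are NaN in that row
print(side_by_side.iloc[2].isna().tolist())  # [False, False, True, True]
```

Repeated column names make selection by name ambiguous, which is one reason the example above renumbers the columns with ignore_index=True.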
2. Using df.append(other_df)
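This method appends the rows of the second DataFrame to the bottom of the first DataFrame. It’s a simpler approach for basic concatenation but offers less flexibility than pd.concat(). Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so it works only on older versions; pd.concat() is the replacement.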
Syntax: first_dataframe.append([second_dataframe, …, last_dataframe], ignore_index=True)
Parameters:
- first_dataframe: the original DataFrame to which the other DataFrames are appended.
- .append(): the method used to append or concatenate DataFrames.
- [second_dataframe, ..., last_dataframe]: a list containing one or more DataFrames that we want to append to first_dataframe.
- ignore_index=True: when set to True, this parameter resets the index of the resulting DataFrame.
Stack Multiple Dataframes using append() method
In this example, four DataFrames (data1, data2, data3, and data4) are created using the pd.DataFrame() constructor. Each DataFrame contains ‘name’ and ‘subjects’ columns with corresponding data. The append() method is used on the first DataFrame (data1) to append the remaining three DataFrames vertically. The ignore_index=True parameter is set to reset the index of the resulting DataFrame.
Python3
import pandas as pd

# create first dataframe
data1 = pd.DataFrame({'name': ['sravan', 'bobby', 'ojaswi', 'rohith', 'gnanesh'],
                      'subjects': ['java', 'python', 'php', 'java', '.NET']})

# create second dataframe
data2 = pd.DataFrame({'name': ['gopi', 'harsha', 'ravi', 'uma', 'deepika'],
                      'subjects': ['c/c++', 'html/css', 'dbms', 'java', 'IOT']})

# create third dataframe
data3 = pd.DataFrame({'name': ['ragini', 'latha'],
                      'subjects': ['java', 'python']})

# create fourth dataframe
data4 = pd.DataFrame({'name': ['gowri', 'jyothika'],
                      'subjects': ['java', 'IOT']})

# stack the four DataFrames using append()
# (requires pandas < 2.0; DataFrame.append() was removed in pandas 2.0)
print(data1.append([data2, data3, data4], ignore_index=True))
Output:
name subjects
0 sravan java
1 bobby python
2 ojaswi php
3 rohith java
4 gnanesh .NET
5 gopi c/c++
6 harsha html/css
7 ravi dbms
8 uma java
9 deepika IOT
10 ragini java
11 latha python
12 gowri java
13 jyothika IOT
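On pandas 2.0 and later, where DataFrame.append() is no longer available, the same stacking can be written with pd.concat(). The sketch below uses shortened stand-ins for data1 to data4 (illustrative data, not the full frames from the example) and simply puts the first frame at the front of the list.

```python
import pandas as pd

# Shortened stand-ins for data1..data4 (illustrative data)
data1 = pd.DataFrame({'name': ['sravan', 'bobby'], 'subjects': ['java', 'python']})
data2 = pd.DataFrame({'name': ['gopi'], 'subjects': ['c/c++']})
data3 = pd.DataFrame({'name': ['ragini'], 'subjects': ['java']})
data4 = pd.DataFrame({'name': ['gowri'], 'subjects': ['IOT']})

# Counterpart of data1.append([data2, data3, data4], ignore_index=True)
stacked = pd.concat([data1, data2, data3, data4], ignore_index=True)

print(stacked['name'].tolist())  # ['sravan', 'bobby', 'gopi', 'ragini', 'gowri']
print(stacked.index.tolist())    # [0, 1, 2, 3, 4]
```

Like append(), pd.concat() returns a new DataFrame and leaves data1 unchanged.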
Conclusion
Data manipulation is crucial for effective data analysis, and stacking multiple Pandas DataFrames is a fundamental operation in this process. Whether you’re dealing with diverse datasets or consolidating information for streamlined analysis, knowing how to stack DataFrames is a valuable skill.