Python | Delete rows/columns from DataFrame using Pandas.drop()
Python is a great language for data analysis, primarily because of its ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier. In this article, we will see how to delete rows and columns from a DataFrame using Pandas.
Pandas DataFrame drop() Method Syntax
Syntax: DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
Parameters:
- labels: A single label or a list of labels naming the rows or columns to drop.
- axis: int or string value. 0 or 'index' drops rows; 1 or 'columns' drops columns.
- index or columns: Single label or list. These are an alternative to passing labels together with axis; labels cannot be combined with index or columns.
- level: Specifies the level to drop labels from when the data frame has a multi-level index.
- inplace: Makes the changes in the original data frame if True.
- errors: With errors='raise' (the default), an error is raised if any label in the list doesn't exist. With errors='ignore', missing labels are skipped and the rest are dropped.
Return type: DataFrame with the specified labels removed, or None if inplace=True
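The parameters above can be tried on a small frame. The following is a minimal sketch, not taken from the original article: the frame contents, the labels p1 to p3, and the variable names are made up for illustration. It shows that labels with axis and the index/columns keywords select the same data, and how level is used with a multi-level index.

```python
import pandas as pd

# Small made-up frame; names and numbers are illustrative only
df = pd.DataFrame(
    {"Team": ["A", "B", "C"], "Age": [25, 27, 22], "Weight": [180, 205, 185]},
    index=["p1", "p2", "p3"],
)

# labels + axis and the index=/columns= keywords are two spellings of the same request
rows_by_axis = df.drop(["p1", "p3"], axis=0)
rows_by_keyword = df.drop(index=["p1", "p3"])
cols_by_axis = df.drop("Weight", axis=1)
cols_by_keyword = df.drop(columns="Weight")

print(rows_by_axis.equals(rows_by_keyword))  # True
print(cols_by_axis.equals(cols_by_keyword))  # True

# level picks which level of a MultiIndex the labels are matched against
multi = pd.DataFrame(
    {"Salary": [10, 20, 30, 40]},
    index=pd.MultiIndex.from_tuples(
        [("A", "PG"), ("A", "SF"), ("B", "PG"), ("B", "C")],
        names=["Team", "Position"],
    ),
)
no_pg = multi.drop("PG", level="Position")
print(no_pg.index.tolist())  # [('A', 'SF'), ('B', 'C')]
```

The keyword forms tend to read more clearly than axis=0 or axis=1, since the call itself says whether rows or columns are being removed.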
Python Drop Function in Pandas
Pandas provides data analysts with a way to delete and filter data frames using the DataFrame.drop() method. Rows or columns can be removed by index label or column name using this method.
Deleting Rows and Columns from Pandas DataFrame
Below are some examples of how we can delete rows and columns from a DataFrame using Pandas in Python.
Dropping Rows in Pandas by Index Label
In this code, a list of index labels is passed, and the rows corresponding to those labels are dropped using the .drop() method. The examples read a CSV file named nba.csv.
Python3
# importing pandas module
import pandas as pd

# making data frame from csv file
data = pd.read_csv("nba.csv", index_col="Name")

print(data.head(5))
Output: Data Frame before Dropping values
Team Number Position Age Height Weight College Salary
Name
Avery Bradley Boston Celtics 0.0 PG 25.0 6-2 180.0 Texas 7730337.0
Jae Crowder Boston Celtics 99.0 SF 25.0 6-6 235.0 Marquette 6796117.0
John Holland Boston Celtics 30.0 SG 27.0 6-5 205.0 Boston University NaN
R.J. Hunter Boston Celtics 28.0 SG 22.0 6-5 185.0 Georgia State 1148640.0
Jonas Jerebko Boston Celtics 8.0 PF 29.0 6-10 231.0 NaN 5000000.0
Applying the drop function.
Python3
# dropping passed values
data.drop(["Avery Bradley", "John Holland", "R.J. Hunter"], inplace=True)

# display
print(data)
Output: Data Frame after Dropping values
Compared with the earlier output, the new output no longer contains the rows for the passed labels. Those rows were dropped, and the change was made in the original data frame because inplace was True. The listing below shows the first rows of the result.
Team Number Position Age Height Weight College Salary
Name
Jae Crowder Boston Celtics 99.0 SF 25.0 6-6 235.0 Marquette 6796117.0
Jonas Jerebko Boston Celtics 8.0 PF 29.0 6-10 231.0 NaN 5000000.0
Amir Johnson Boston Celtics 90.0 PF 29.0 6-9 240.0 NaN 12000000.0
Jordan Mickey Boston Celtics 55.0 PF 21.0 6-8 235.0 LSU 1170960.0
Kelly Olynyk Boston Celtics 41.0 C 25.0 7-0 238.0 Gonzaga 2165160.0
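The same row-dropping behaviour can be checked without the CSV file. The sketch below uses a small made-up frame (the labels p1 to p3 and "nobody" are hypothetical) to show what happens without inplace=True, how a missing label is handled under each errors setting, and what an inplace call returns.

```python
import pandas as pd

# Small made-up frame standing in for the CSV data
df = pd.DataFrame(
    {"Team": ["A", "B", "C"], "Age": [25, 27, 22]},
    index=["p1", "p2", "p3"],
)

# Without inplace=True, drop() returns a new frame and leaves df untouched
trimmed = df.drop(["p1", "p2"])
print(len(df), len(trimmed))  # 3 1

# A label that is not in the index raises KeyError by default
try:
    df.drop(["p1", "nobody"])
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)  # True

# errors="ignore" skips the missing label and drops the rest
lenient = df.drop(["p1", "nobody"], errors="ignore")
print(lenient.index.tolist())  # ['p2', 'p3']

# With inplace=True the frame itself is modified and the call returns None
result = df.drop("p3", inplace=True)
print(result, df.index.tolist())  # None ['p1', 'p2']
```

Because an inplace call returns None, writing data = data.drop(..., inplace=True) would replace the frame with None; use either reassignment or inplace=True, not both.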
Dropping Columns in Pandas with Column Name
In this code, the passed columns are dropped by column name. The axis parameter is set to 1, since 1 refers to columns.
Python3
# importing pandas module
import pandas as pd

# making data frame from csv file
data = pd.read_csv("nba.csv", index_col="Name")

print(data.head())
Output: Data Frame before Dropping Columns
Team Number Position Age Height Weight College Salary
Name
Avery Bradley Boston Celtics 0.0 PG 25.0 6-2 180.0 Texas 7730337.0
Jae Crowder Boston Celtics 99.0 SF 25.0 6-6 235.0 Marquette 6796117.0
John Holland Boston Celtics 30.0 SG 27.0 6-5 205.0 Boston University NaN
R.J. Hunter Boston Celtics 28.0 SG 22.0 6-5 185.0 Georgia State 1148640.0
Jonas Jerebko Boston Celtics 8.0 PF 29.0 6-10 231.0 NaN 5000000.0
Applying the drop function.
Python3
# dropping passed columns
data.drop(["Team", "Weight"], axis=1, inplace=True)

# display
print(data.head())
Output: Data Frame after Dropping Columns
As shown in the output, the new data frame doesn't have the passed columns. They were dropped because axis was set to 1, and the change was made in the original data frame because inplace was True.
Number Position Age Height College Salary
Name
Avery Bradley 0.0 PG 25.0 6-2 Texas 7730337.0
Jae Crowder 99.0 SF 25.0 6-6 Marquette 6796117.0
John Holland 30.0 SG 27.0 6-5 Boston University NaN
R.J. Hunter 28.0 SG 22.0 6-5 Georgia State 1148640.0
Jonas Jerebko 8.0 PF 29.0 6-10 NaN 5000000.0
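As an alternative to axis=1, the columns keyword from the syntax section can be used. The sketch below is an illustration on a small made-up frame (the labels and values are hypothetical, not from nba.csv), and it also shows rows and columns being dropped in a single call.

```python
import pandas as pd

# Small made-up frame standing in for the CSV data
df = pd.DataFrame(
    {"Team": ["A", "B", "C"], "Age": [25, 27, 22], "Weight": [180, 205, 185]},
    index=["p1", "p2", "p3"],
)

# columns= states the intent without needing axis=1
slim = df.drop(columns=["Team", "Weight"])
print(slim.columns.tolist())  # ['Age']

# index= and columns= can be combined to drop rows and columns in one call
both = df.drop(index="p2", columns="Weight")
print(both.shape)  # (2, 2)
print(both.index.tolist())  # ['p1', 'p3']
```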