Pandas Replace Multiple Values in Python
Replacing multiple values in a Pandas DataFrame or Series is a common data manipulation task. Pandas provides several methods for it, and this article walks through three of them with examples.
Replacing Multiple Values in Pandas
When working with a Pandas DataFrame or Series, you often need to replace particular values to clean the data or make it consistent. Pandas offers several ways to change values in a column, each suited to a different kind of replacement.
The ways to replace multiple values covered here are:
- Using the replace() Method
- Using the map() method for a single column
- Using the apply() method
Using the Replace Method
One of the most useful tools in Pandas is the replace() method, which substitutes specified values with new ones. It is a flexible option because it can be applied to a single column or to the full DataFrame.
In the code below, we replace two values (2 with 200 and 4 with 400) by calling replace() on the DataFrame with a dictionary. The dictionary is applied to every column; here only column ‘A’ contains 2 or 4, so only that column changes.
import pandas as pd
data = {'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
print("Before Replacing:\n", df)
df.replace({2: 200, 4: 400}, inplace=True)
print("After Replacing:\n", df)
Output:
Before Replacing:
A B
0 1 10
1 2 20
2 3 30
3 4 40
4 5 50
After Replacing:
A B
0 1 10
1 200 20
2 3 30
3 400 40
4 5 50
We can observe that the value 2 in column ‘A’ is replaced by 200 and the value 4 in column ‘A’ is replaced by 400. Because inplace=True is set, the changes are made directly in the DataFrame df instead of in a returned copy. Column ‘B’ is untouched because it contains neither 2 nor 4, and the structure of the DataFrame remains unchanged.
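To restrict the replacement to one column when other columns also contain the same values, a nested dictionary of the form {column: {old: new}} can be passed to replace(). The sketch below uses a made-up DataFrame in which column ‘B’ also holds 2 and 4, purely to illustrate the difference between the two call styles.

```python
import pandas as pd

# Illustrative data: both columns contain the values 2 and 4.
df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [2, 4, 6, 8, 10]})

# A flat dictionary is applied to every column, so 'B' changes as well.
everywhere = df.replace({2: 200, 4: 400})

# A nested dictionary limits the replacement to column 'A'.
only_a = df.replace({'A': {2: 200, 4: 400}})

print("Flat dictionary:\n", everywhere)
print("Nested dictionary:\n", only_a)
```

Neither call uses inplace=True, so df itself keeps its original values and each result is a new DataFrame.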
Using the Map Method for Single Column
Values in a particular column can be changed using the map() method in conjunction with a dictionary. For targeted replacements inside a single series, this method works well.
In this example, map() is called with a dictionary to replace the values in column ‘A’ (1 with ‘one’, 2 with ‘two’, and 3 with ‘three’).
import pandas as pd
data = {'A': [1, 2, 3]}
df = pd.DataFrame(data)
print("Before Replacing:\n", df)
df['A'] = df['A'].map({1: 'one', 2: 'two', 3: 'three'})
print("After Replacing:\n", df)
Output:
Before Replacing:
A
0 1
1 2
2 3
After Replacing:
A
0 one
1 two
2 three
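One point to keep in mind is that map() with a dictionary turns every value that has no key in the dictionary into NaN, whereas the example above happens to cover every value in the column. The sketch below uses an extra value (4) that is not in the mapping and shows one possible way to keep such values: falling back to the original value with dict.get. The names s, mapping, mapped and kept are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])
mapping = {1: 'one', 2: 'two', 3: 'three'}

# 4 has no entry in the dictionary, so it becomes NaN.
mapped = s.map(mapping)

# Looking each value up with a default keeps unmapped values as they are.
kept = s.map(lambda v: mapping.get(v, v))

print("Plain map:\n", mapped)
print("Map with fallback:\n", kept)
```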
Using the Apply Method
In Pandas, you can use the apply() method along with a custom function to replace multiple values in a DataFrame or Series.
In this example, the replace_values function is applied to each element of column ‘A’ using the apply() method. The function checks for specific values (‘apple’, ‘orange’, and ‘potato’ in this case) and replaces them with a category label; any other value is returned unchanged.
import pandas as pd
data = {'A': ['apple', 'potato', 'orange']}
df = pd.DataFrame(data)
print("Before Replacing:\n", df)
def replace_values(x):
    if x == 'apple' or x == 'orange':
        return 'fruit'
    elif x == 'potato':
        return 'vegetable'
    else:
        return x
df['A'] = df['A'].apply(replace_values)
print("After Replacing:\n", df)
Output:
Before Replacing:
A
0 apple
1 potato
2 orange
After Replacing:
A
0 fruit
1 vegetable
2 fruit
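A custom function is most useful when the rule needs some logic before the lookup. As a minimal sketch, assuming the column holds labels with inconsistent case and stray spaces (the data and the categories dictionary here are made up for illustration), the function can normalise each value and then look it up, returning the original value when there is no match.

```python
import pandas as pd

# Illustrative data with inconsistent case and whitespace.
df = pd.DataFrame({'A': ['Apple', ' potato', 'ORANGE', 'rice']})

# Hypothetical lookup table, keyed by the normalised label.
categories = {'apple': 'fruit', 'orange': 'fruit', 'potato': 'vegetable'}

def replace_values(x):
    # Normalise the label, then fall back to the original value if unknown.
    key = x.strip().lower()
    return categories.get(key, x)

df['A'] = df['A'].apply(replace_values)
print(df)
```

Keeping the replacements in a dictionary means new categories can be added without extending an if/elif chain.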
Conclusion
In conclusion, Pandas offers several methods for replacing multiple values, each suited to a different data manipulation scenario. The replace() method is the most direct choice, since it handles both DataFrame-wide and column-specific substitutions. The map() method works on a single column with a dictionary, and apply() accepts a custom function when the replacement rule needs its own logic.