How to Replace Multiple Values in a Data Frame Using dplyr
Replacing multiple values in a data frame means substituting specific values in one or more columns with new ones, typically to standardize or clean the data before analysis. In R, the dplyr package provides mutate() for creating or overwriting columns, together with case_when() for condition-based replacement and recode() for direct mapping of old values to new ones.
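As a minimal sketch of the "one or more columns" case, the same replacement can be applied to several columns at once by wrapping it in across(). The survey tibble and its columns q1 and q2 are hypothetical and exist only for this illustration.

```r
library(dplyr)

# Hypothetical data: two answer columns that share the same coding
survey <- tibble(
  q1 = c("Y", "N", "Y"),
  q2 = c("N", "N", "Y")
)

# Apply the same replacement rules to both columns
survey_clean <- survey %>%
  mutate(across(c(q1, q2), ~ case_when(
    .x == "Y" ~ "Yes",
    .x == "N" ~ "No",
    TRUE ~ .x  # Keep any other value unchanged
  )))

print(survey_clean)
```

Inside across(), .x stands for the column currently being processed, so the rules are written once and reused for every selected column.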
Replace Multiple Values Using mutate() and case_when()
library(dplyr)
# Example dataset
data <- tibble(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)
data
# Replace multiple values in 'category' column
data_replaced <- data %>%
  mutate(category = case_when(
    category == "A" ~ "Alpha",
    category == "B" ~ "Beta",
    category == "C" ~ "Gamma",
    TRUE ~ category  # Keep other values unchanged
  ))
# View the resulting dataset
print(data_replaced)
Output:
# A tibble: 5 × 3
     id category value
  <int> <chr>    <dbl>
1     1 A           10
2     2 B           15
3     3 A           20
4     4 C           25
5     5 B           30
# A tibble: 5 × 3
     id category value
  <int> <chr>    <dbl>
1     1 Alpha       10
2     2 Beta        15
3     3 Alpha       20
4     4 Gamma       25
5     5 Beta        30
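In this example, case_when() within mutate() replaces multiple values in the ‘category’ column. The conditions are evaluated in order and the first one that matches a row determines its new value; the final TRUE ~ category line acts as a fallback that leaves any unmatched value as it was.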
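Because case_when() accepts any logical condition, it is not limited to one-to-one equality checks. The sketch below reuses the example dataset to collapse several old values into one label with %in% and to combine a condition on a second column. The labels "Group 1" and "Group 2" and the threshold of 20 are made up for illustration.

```r
library(dplyr)

data <- tibble(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)

data_grouped <- data %>%
  mutate(category = case_when(
    category %in% c("A", "B") ~ "Group 1",     # Several old values, one new value
    category == "C" & value > 20 ~ "Group 2",  # Condition that also uses another column
    TRUE ~ category                            # Keep other values unchanged
  ))

print(data_grouped)
```

Since the first matching condition wins, more specific rules should be listed before more general ones.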
Replace Multiple Values Using mutate() and recode()
library(dplyr)
# Example dataset
data <- data.frame(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)
data
# Replace multiple values in 'category' column
data_replaced <- data %>%
  mutate(category = recode(category,
    "A" = "Apple",
    "B" = "Boys",
    "C" = "Cats"))
# View the resulting dataset
print(data_replaced)
Output:
  id category value
1  1        A    10
2  2        B    15
3  3        A    20
4  4        C    25
5  5        B    30
  id category value
1  1    Apple    10
2  2     Boys    15
3  3    Apple    20
4  4     Cats    25
5  5     Boys    30
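Here, recode() within mutate() replaces multiple values in the ‘category’ column by mapping each old value directly to its replacement (old = new), which is more concise than writing one condition per value. Values in a character column that are not listed in the mapping are left unchanged. Note that in dplyr 1.1.0 and later, recode() is marked as superseded in favor of case_match(), although it still works.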
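When the mapping is long or is reused in several places, it can be stored in a named vector and spliced into recode() with the !!! operator, a pattern shown in the recode() documentation. The following is a minimal sketch; the name lookup is an arbitrary choice for this example.

```r
library(dplyr)

data <- data.frame(
  id = 1:5,
  category = c("A", "B", "A", "C", "B"),
  value = c(10, 15, 20, 25, 30)
)

# Named vector: names are the old values, elements are the replacements
lookup <- c(A = "Apple", B = "Boys", C = "Cats")

# !!! unpacks the vector into individual old = new arguments
data_replaced <- data %>%
  mutate(category = recode(category, !!!lookup))

print(data_replaced)
```

Keeping the mapping in one object separates the replacement rules from the pipeline, so the rules can be edited without touching the mutate() call.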
Conclusion
With dplyr, you can replace multiple values in a data frame by calling case_when() or recode() inside mutate(). case_when() is the more flexible option because it accepts arbitrary logical conditions, while recode() is shorter when each old value maps directly to a new one. Choose the approach that best fits your requirements and coding style.