Remove All Whitespace in Each DataFrame Column in R
In this article, we will learn how to remove all whitespace from each column of a dataframe in the R programming language.
Sample dataframe in use:
            c1       c2
1 Beginner for Beginner
2           cs        f
3      r -lang        g
Method 1: Using gsub()
In this approach, we use the apply() function with margin 2 to apply a function to each column of the dataframe. The function applied is gsub(), which replaces every match of a pattern in a string. Here the pattern "\\s+" matches runs of whitespace, and each match is replaced by the empty string "", which removes the whitespace.
Note: The entire result is wrapped in as.data.frame() because apply() returns a matrix, which needs to be converted back into a dataframe.
Syntax: as.data.frame(apply(df, margin, function(x) gsub("\\s+", "", x)))
Parameters:
df: Dataframe object
margin: dimension on which the operation is to be applied (1 for rows, 2 for columns)
function(x): operation to be applied, gsub() in this case.
gsub(): replaces "\\s+" with ""
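To make the margin argument concrete, here is a small sketch comparing margin 2, which passes one column at a time, with margin 1, which passes one row at a time. The names strip_ws, by_col and by_row are illustrative and not part of the article. With margin 1 the result comes back transposed, which is one reason margin 2 is the natural choice for cleaning columns.

```r
# Same sample data as used in the article.
# stringsAsFactors = FALSE keeps the columns as character in older R versions.
df <- data.frame(c1 = c(" Beginner for", " cs", "r -lang "),
                 c2 = c("Beginner ", "f ", " g"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: remove every run of whitespace from a character vector.
strip_ws <- function(x) gsub("\\s+", "", x)

by_col <- apply(df, 2, strip_ws)  # function receives one column at a time
by_row <- apply(df, 1, strip_ws)  # function receives one row at a time

dim(by_col)  # 3 rows, 2 columns: same shape as df
dim(by_row)  # 2 rows, 3 columns: transposed relative to df
```

Both calls clean the same values; only the orientation of the returned matrix differs.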
Example: R program to remove whitespaces using gsub()
R
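# Sample dataframe with leading, trailing and inner whitespace
df <- data.frame(c1 = c(" Beginner for", " cs", "r -lang "),
                 c2 = c("Beginner ", "f ", " g"))

# Remove whitespace from every column; apply() returns a matrix,
# so convert the result back into a dataframe
df_new <- as.data.frame(apply(df, 2, function(x) gsub("\\s+", "", x)))
df_new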
Output:
c1 c2
1 Beginnerfor Beginner
2 cs f
3 r-lang g
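Because apply() first converts the dataframe to a matrix, any numeric columns would come back as character strings. As a hedged alternative in base R (not the article's method), the sketch below uses lapply() on the character columns only, which keeps the dataframe structure and leaves other column types untouched. The dataframe df_mixed and its column names are made up for illustration.

```r
# Hypothetical dataframe with one character column and one numeric column.
df_mixed <- data.frame(name = c(" a b ", "c  d"),
                       n = c(1.5, 2),
                       stringsAsFactors = FALSE)

# Logical vector marking which columns are character.
is_chr <- vapply(df_mixed, is.character, logical(1))

# Replace only the character columns; assigning into df_mixed[is_chr]
# keeps the object a dataframe, so as.data.frame() is not needed.
df_mixed[is_chr] <- lapply(df_mixed[is_chr],
                           function(x) gsub("\\s+", "", x))

str(df_mixed)  # name is character without whitespace, n is still numeric
```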
Method 2: Using str_remove_all()
We first need to install the "stringr" package using the install.packages() command and then load it using the library() function.
str_remove_all() takes two arguments: the string on which the removal is to be performed, and the pattern whose occurrences are all to be removed.
Syntax: str_remove_all(string, char_to_remove)
Parameters:
string: entire string
char_to_remove: character (or regular-expression pattern) to be removed from the string
Example: R program to remove whitespaces using str_remove_all()
R
library("stringr")

str <- " Welcome to Beginner for Beginner "
str_remove_all(str, " ")
Output:
[1] "WelcometoBeginnerforBeginner"
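One detail worth noting: the pattern " " matches only the literal space character, so tabs and newlines are left in place. The sketch below, which assumes stringr is installed and uses a made-up input string, contrasts " " with the regular-expression class "\\s", which matches any whitespace character.

```r
library(stringr)

# Hypothetical input containing a space, a tab and a newline.
x <- "a b\tc\nd"

# Pattern " " removes only the space character.
only_spaces <- str_remove_all(x, " ")

# Pattern "\\s" removes spaces, tabs and newlines alike.
all_ws <- str_remove_all(x, "\\s")

only_spaces == "ab\tc\nd"  # the tab and newline remain
all_ws == "abcd"           # all whitespace is gone
```

If the data may contain tabs or line breaks, "\\s" is the safer pattern; for plain spaces the two behave the same.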
Now that we understand the str_remove_all() function, let's move on to applying it to every column of the dataframe.
Syntax: as.data.frame(apply(df, margin, str_remove_all, " "))
Parameters:
df: Dataframe object
margin: dimension on which operation is to be applied
str_remove_all: operation to be applied
In this approach, we use the apply() function to apply a function to each column of the dataframe. The function applied is str_remove_all(). We pass the space character " " as an additional argument, so the function removes every occurrence of " " from each column.
Note: As in Method 1, the entire result is wrapped in as.data.frame() because apply() returns a matrix, which needs to be converted back into a dataframe.
Example: R program to remove whitespaces from dataframe using str_remove_all()
R
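library("stringr")

# Sample dataframe with leading, trailing and inner whitespace
df <- data.frame(c1 = c(" Beginner for", " cs", "r -lang "),
                 c2 = c("Beginner ", "f ", " g"))

# " " is passed on to str_remove_all() as its pattern argument
df_new <- as.data.frame(apply(df, 2, str_remove_all, " "))
df_new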
Output:
c1 c2
1 Beginnerfor Beginner
2 cs f
3 r-lang g
Method 3: Using str_replace_all()
str_replace_all() takes three arguments: the input string on which the operation is to be performed, the pattern to be replaced, and the replacement value. Here the pattern " " is replaced by "".
Syntax: as.data.frame(apply(df, 2, function(x) str_replace_all(string = x, pattern = " ", replacement = "")))
Parameters:
df: Dataframe object
margin: dimension on which operation is to be applied
function(x): operation to be applied, str_replace_all() in this case.
str_replace_all(): replaces all the occurrences of " " with ""
In this approach, we use the apply() function to apply a function to each column of the dataframe. The function applied is str_replace_all(), which replaces every match of a pattern in a string. Here it finds each space character " " and replaces it with the empty string "", which removes the whitespace.
Note: As in Method 1, the entire result is wrapped in as.data.frame() because apply() returns a matrix, which needs to be converted back into a dataframe.
Example: R program to remove whitespaces using str_replace_all()
R
library(stringr)

# Sample dataframe with leading, trailing and inner whitespace
df <- data.frame(c1 = c(" Beginner for", " cs", "r -lang "),
                 c2 = c("Beginner ", "f ", " g"))

# Replace every space with the empty string, column by column
df_new <- as.data.frame(apply(df, 2, function(x)
  str_replace_all(string = x, pattern = " ", replacement = "")))
df_new
Output:
c1 c2
1 Beginnerfor Beginner
2 cs f
3 r-lang g
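As a quick cross-check, the following sketch runs all three methods on the sample dataframe and compares the results cell by cell. It assumes stringr is installed; the names m1, m2 and m3 are illustrative only.

```r
library(stringr)

# Same sample data as used in the article.
df <- data.frame(c1 = c(" Beginner for", " cs", "r -lang "),
                 c2 = c("Beginner ", "f ", " g"),
                 stringsAsFactors = FALSE)

# Method 1: base R gsub()
m1 <- as.data.frame(apply(df, 2, function(x) gsub("\\s+", "", x)))

# Method 2: stringr str_remove_all()
m2 <- as.data.frame(apply(df, 2, str_remove_all, " "))

# Method 3: stringr str_replace_all()
m3 <- as.data.frame(apply(df, 2, function(x)
  str_replace_all(string = x, pattern = " ", replacement = "")))

# Cell-by-cell comparison of the cleaned values.
all(m1 == m2)
all(m2 == m3)
```

On this sample the three approaches agree because the data contains only ordinary spaces; with tabs or newlines, only the "\\s+" pattern of Method 1 would remove them.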