Select Rows with Partial String Match in R DataFrame
In this article, we’ll discuss how to select rows with a partial string match in the R programming language.
Method 1: Using stringr package
The stringr package in R is used mainly for character manipulation, locale-sensitive operations, whitespace handling, and pattern matching. Here we use its pattern-matching function str_detect() to filter rows according to a partial string match.
Syntax:
df[str_detect(df$column_name, "Pattern"), ]
Parameters:
- df: the data frame to filter.
- column_name: the column whose strings are tested.
- Pattern: the string pattern to match (str_detect treats it as a regular expression by default).
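To make the syntax above more concrete, the following minimal sketch shows what str_detect() returns on its own: a logical vector with one entry per input string, which is what the square brackets then use to keep or drop rows. The vector name words and the variable mask are illustrative assumptions, not part of the original example.

```r
# Load stringr for str_detect()
library(stringr)

# Illustrative character vector (the name `words` is an assumption)
words <- c("Hello", "this", "Hell", "Beginner", "Geek", "w3wiki")

# One TRUE/FALSE per element: does the string contain "Hel"?
mask <- str_detect(words, "Hel")
mask         # TRUE FALSE TRUE FALSE FALSE FALSE

# Indexing with the logical vector keeps only the matching elements
words[mask]  # "Hello" "Hell"
```

The same logical vector placed in the row position of df[ , ] selects the matching rows of a data frame.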
Example: This example explains how to extract rows with a partial match using the stringr package.
R
# Load library stringr
library("stringr")

# Sample data frame
data <- data.frame(names = c('Hello', 'this', 'Hell',
                             'Beginner', 'Geek', 'w3wiki'))

# Filter data with str_detect for strings
# containing "Gee"
result1 <- data[str_detect(data$names, "Gee"), ]

# Print result data
# (a character vector, because the data frame has one column)
result1

# Filter data with str_detect for strings
# containing "Hel"
result2 <- data[str_detect(data$names, "Hel"), ]

# Print result data
result2
Output:
[1] "Geek"
[1] "Hello" "Hell"
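Because the sample data frame has a single column, the result above is printed as a plain character vector rather than as rows. The sketch below is one way to keep the data frame structure, using base R's drop = FALSE argument, and to match regardless of case with stringr's regex() helper. The variable names rows and rows_ci are illustrative assumptions.

```r
library(stringr)

# Same sample data as in the example above
data <- data.frame(names = c("Hello", "this", "Hell",
                             "Beginner", "Geek", "w3wiki"))

# drop = FALSE keeps the result as a data frame even with one column
rows <- data[str_detect(data$names, "Hel"), , drop = FALSE]
rows

# Case-insensitive partial match: "hel" also matches "Hello" and "Hell"
rows_ci <- data[str_detect(data$names, regex("hel", ignore_case = TRUE)),
                , drop = FALSE]
rows_ci
```

With more than one column the result is a data frame by default, so drop = FALSE mainly matters for single-column data like this sample.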
Method 2: Using data.table package
data.table is an R package that extends data.frame. It is widely used for fast aggregation of large datasets, low-latency adding, updating, and removing of columns, fast ordered joins, and a fast file reader. Here we use its %like% operator, which tests each string against a pattern, and keep only the rows of the data frame whose value matches.
Syntax:
df[df$column_name %like% "Pattern", ]
Parameters:
- df: the data frame to filter.
- column_name: the column whose strings are tested.
- Pattern: the string pattern to match (%like% treats it as a regular expression).
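Since %like% interprets its right-hand side as a regular expression, the pattern can be anchored to the start or end of the string. The sketch below illustrates this on a plain character vector; it also compares the result with base R's grepl(), which is assumed here to give the same answer for character input. The names words, starts_hel, and ends_ek are illustrative.

```r
library(data.table)

# Illustrative character vector (the name `words` is an assumption)
words <- c("Hello", "this", "Hell", "Beginner", "Geek", "w3wiki")

# "^Hel" matches only strings that start with "Hel"
starts_hel <- words %like% "^Hel"
words[starts_hel]

# "ek$" matches only strings that end with "ek"
ends_ek <- words %like% "ek$"
words[ends_ek]

# Assumption: for character vectors the result agrees with grepl()
identical(words %like% "Hel", grepl("Hel", words))
```

Characters such as . or ( have special meaning in a regular expression, so a pattern containing them may need escaping to be matched literally.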
Example: This example explains how to extract rows with a partial match using the data.table package.
R
# Load data.table package
library("data.table")

# Sample data frame
data <- data.frame(names = c('Hello', 'this', 'Hell',
                             'Beginner', 'Geek', 'w3wiki'))

# Filter data with %like% for all strings having "Gee"
result1 <- data[data$names %like% "Gee", ]

# Print result data
result1

# Filter data with %like% for all strings having "Hel"
result2 <- data[data$names %like% "Hel", ]

# Print result data
result2
Output:
[1] "Geek"
[1] "Hello" "Hell"
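The example above applies %like% to an ordinary data frame. As a hedged sketch of the more typical data.table style, the same filter can be written on a data.table object, where columns are referenced by name inside the brackets and the result stays a table. The id column and the names dt and hel_rows are assumptions added for illustration.

```r
library(data.table)

# Hypothetical two-column data.table (the id column is an assumption)
dt <- data.table(
  id    = 1:6,
  names = c("Hello", "this", "Hell", "Beginner", "Geek", "w3wiki")
)

# Inside dt[ ], the column is referenced directly, without dt$
hel_rows <- dt[names %like% "Hel"]
hel_rows
```

Here hel_rows keeps both columns for the matching rows, which is usually what is wanted when selecting rows rather than extracting a single column.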