How to Use the filter() Method in R

The filter() method, provided by the dplyr package (part of the tidyverse), keeps the rows of a data frame whose column values satisfy one or more conditions. The result is a data frame with the same columns and, typically, fewer rows. Conditions are written with comparison and logical operators and are evaluated for each row. The data frame can be passed to filter() with the pipe operator, followed by the filter conditions.

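Syntax: filter(.data, cond1, cond2, ...)

Parameters:

.data – The data frame to filter. It is supplied implicitly when the pipe operator is used.

cond1, cond2, ... – Conditions to be applied to the data. Only rows that satisfy every condition are kept.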
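As a minimal sketch of this syntax, the data frame can be passed directly as the first argument instead of being piped in; both calls below should select the same rows. The data frame `df` and its columns `x` and `y` are made up for illustration.

```r
# Load dplyr, the tidyverse package that provides filter()
library(dplyr)

# Hypothetical data frame used only for this illustration
df <- data.frame(x = c(1, 6, 3, 9), y = c("a", "b", "c", "d"))

# Data frame passed directly as the first argument
direct <- filter(df, x > 4)

# Same call written with the pipe operator
piped <- df %>% filter(x > 4)

# Compare the two results
print(identical(direct, piped))
```

The piped form is the one used in the rest of this article; the direct form can be handy for a single, one-off call.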

The following code snippet returns the rows where the col1 value is greater than 4. All columns are kept; only the rows that satisfy the condition appear in the output data frame.

R




# Importing tidyverse
library(tidyverse)
 
# Creating a data frame
data_frame = data.frame(
  col1 = c(2,4,1,7,5,3,5,8),
  col2 = letters[1:8],
  col3 = c(0,1,1,1,0,0,0,0),
  col4 = c(9:16))
 
print("Data Frame")
 
# Printing data frame
print(data_frame)
 
# Selecting values where
# col1 value is greater than 4
arr_data_frame <- data_frame %>%
    filter(col1>4)
print("Selecting col1 >4 ")
 
# Printing data frame after
# applying filter
print(arr_data_frame)


Output:

[1] "Data Frame"
  col1 col2 col3 col4
1    2    a    0    9
2    4    b    1   10
3    1    c    1   11
4    7    d    1   12
5    5    e    0   13
6    3    f    0   14
7    5    g    0   15
8    8    h    0   16

[1] "Selecting col1 >4 "
  col1 col2 col3 col4
1    7    d    1   12
2    5    e    0   13
3    5    g    0   15
4    8    h    0   16

Explanation : 

The output data frame contains the rows of the original data frame where the col1 value is greater than 4. The rows are fetched in the order in which they occur in the original data frame.
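To make the point about row order concrete, the following sketch uses a small made-up data frame `scores` (not part of the example above). It checks that the filtered rows keep their original relative order and that base R logical subsetting selects the same values.

```r
library(dplyr)

# Hypothetical data, not taken from the example above
scores <- data.frame(id = c("p", "q", "r", "s"), value = c(7, 2, 9, 5))

# Keep rows where value is greater than 4
kept <- scores %>% filter(value > 4)

# Base R equivalent using a logical index
base_kept <- scores[scores$value > 4, ]

# The ids appear in the same relative order as in the original data frame
print(as.character(kept$id))

# Both approaches select the same values
print(all(kept$value == base_kept$value))
```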

Multiple values can also be checked with the filter() method. The candidate values are supplied as a vector created with c(). The following code snippet checks whether the col3 value is either “there” or “this”. A row matching either of these values is returned in the final output data frame.

R




# Importing tidyverse
library(tidyverse)
 
# Creating a data frame
data_frame = data.frame(
  col1 = c(2,4,1,7,5,3,5,8),
  col2 = letters[1:8],
  col3 = c("this","that","there",
"here","there","this","that","here"),
  col4 = c(9:16))
 
print("Data Frame")
 
# Printing data frame
print(data_frame)
 
# Selecting values where
# col3 value is either there or this
arr_data_frame <- data_frame %>%
  filter(col3 %in% c("there","this"))
print("Selecting col3 value is either there or this")
 
# Printing data frame after
# applying filter
print(arr_data_frame)


Output:

[1] "Data Frame"
  col1 col2  col3 col4
1    2    a  this    9
2    4    b  that   10
3    1    c there   11
4    7    d  here   12
5    5    e there   13
6    3    f  this   14
7    5    g  that   15
8    8    h  here   16

[1] "Selecting col3 value is either there or this"
  col1 col2  col3 col4
1    2    a  this    9
2    1    c there   11
3    5    e there   13
4    3    f  this   14

Explanation :

The output data frame contains the rows of the original data frame where the col3 value is either “this” or “there”. The rows are fetched in the order in which they occur in the original data frame.
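When matching against several values, the choice of operator matters. The sketch below, on a made-up data frame `words`, contrasts == with %in%: == compares element by element and recycles the shorter vector along the column, so it can miss matches, while %in% tests whether each value belongs to the set.

```r
library(dplyr)

# Hypothetical data frame for illustration
words <- data.frame(w = c("this", "that", "there", "here"))

# %in% keeps every row whose value appears in the set
by_membership <- words %>% filter(w %in% c("there", "this"))

# == recycles c("there", "this") along the column and
# compares position by position
by_equality <- words %>% filter(w == c("there", "this"))

print(nrow(by_membership))  # 2 rows: "this" and "there"
print(nrow(by_equality))    # 1 row: only "there" lines up with its position
```

For an "either of these values" check, %in% is therefore usually the safer choice.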

Multiple conditions can also be passed to the filter() method, separated by commas; a row is kept only if it satisfies all of them. For instance, the code below checks for a col3 value equal to “there” and a col1 value equal to 5. The output data frame contains the rows of the original data frame where both conditions hold, in the order in which they occur in the original data frame.

R




# Importing tidyverse
library(tidyverse)
 
# Creating a data frame
data_frame = data.frame(
  col1 = c(2,4,1,7,5,3,5,8),
  col2 = letters[1:8],
  col3 = c("this","that","there","here",
           "there","this","that","here"),
  col4 = c(9:16))
 
print("Data Frame")
print(data_frame)
 
# Selecting values where
# col3 value is there and col1 is 5
arr_data_frame <- data_frame %>%
    filter(col3=="there",col1==5)
print("Selecting col3 value is there and col1 is 5")
print(arr_data_frame)


Output:

[1] "Data Frame"
  col1 col2  col3 col4
1    2    a  this    9
2    4    b  that   10
3    1    c there   11
4    7    d  here   12
5    5    e there   13
6    3    f  this   14
7    5    g  that   15
8    8    h  here   16

[1] "Selecting col3 value is there and col1 is 5"
  col1 col2  col3 col4
1    5    e there   13
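
Conditions separated by commas are combined with a logical AND, so such a call should behave like a single condition joined with &. The sketch below, on a made-up data frame `items`, checks that equivalence and shows | for the case where satisfying either condition is enough.

```r
library(dplyr)

# Hypothetical data frame for illustration
items <- data.frame(
  n = c(5, 5, 3, 8),
  tag = c("there", "that", "there", "here")
)

# Comma-separated conditions: both must hold
with_comma <- items %>% filter(tag == "there", n == 5)

# Same filter written with an explicit logical AND
with_and <- items %>% filter(tag == "there" & n == 5)

# Logical OR: at least one condition must hold
either <- items %>% filter(tag == "there" | n == 5)

print(identical(with_comma, with_and))
print(nrow(either))  # 3 rows match at least one condition
```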
