Get difference of dataframes using dplyr in R
In this article, we will discuss how to find the difference between two dataframes using the dplyr package in the R programming language.
Set difference means extracting the rows of one dataset that are not present in the other. For this, dplyr provides a function called setdiff(). setdiff() returns the rows that are present in the first dataframe but not in the second, so the order of the arguments matters. Both dataframes should have the same columns.
Syntax:
setdiff(dataframe1,dataframe2)
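As a minimal sketch of the syntax above, the snippet below shows that swapping the arguments changes the result. The toy dataframes a and b and the variable names only_in_a and only_in_b are made up for illustration.

```r
library(dplyr)

# Hypothetical toy dataframes, used only for illustration
a <- data.frame(x = c(1, 2, 3))
b <- data.frame(x = c(2, 3, 4))

# Rows of a that are not present in b
only_in_a <- dplyr::setdiff(a, b)

# Rows of b that are not present in a
only_in_b <- dplyr::setdiff(b, a)

print(only_in_a$x)  # [1] 1
print(only_in_b$x)  # [1] 4
```

Calling the function as dplyr::setdiff() makes it explicit that the dplyr version, which works on dataframe rows, is intended rather than base R's setdiff().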
Example 1: R program to perform setdiff() operation of the second dataframe with the first dataframe
R
library(dplyr)

# create dataframe1 with college 1 data
data1 = data.frame(id = c(1, 2, 3, 4, 5),
                   name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith'))

# create dataframe2 with college 2 data
data2 = data.frame(id = c(1, 2, 3, 4, 5, 6, 7),
                   name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith', 'pinkey', 'dhanush'))

# set difference of second dataframe and first dataframe
print(setdiff(data2, data1))
Output:
The result contains the two rows that appear only in data2: id 6 (pinkey) and id 7 (dhanush).
Example 2: R program to perform setdiff() operation of the first dataframe with the second dataframe
R
library(dplyr)

# create dataframe1 with college 1 data
data1 = data.frame(id = c(1, 2, 3, 4, 5),
                   name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith'))

# create dataframe2 with college 2 data
data2 = data.frame(id = c(1, 2, 3, 4, 5, 6, 7),
                   name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith', 'pinkey', 'dhanush'))

# set difference of first dataframe and second dataframe
print(setdiff(data1, data2))
Output:
The result is an empty dataframe (0 rows), because every row of data1 is also present in data2.
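In Example 2 every row of data1 also appears in data2, so the difference has no rows. That suggests a simple containment check based on the row count of the result. The snippet below is a minimal sketch of that idea using the same data; the variable names missing_rows and is_contained are hypothetical.

```r
library(dplyr)

# Same data as in the examples above
data1 <- data.frame(id = c(1, 2, 3, 4, 5),
                    name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith'))
data2 <- data.frame(id = c(1, 2, 3, 4, 5, 6, 7),
                    name = c('sravan', 'ojaswi', 'bobby', 'gnanesh', 'rohith',
                             'pinkey', 'dhanush'))

# Rows of data1 that are missing from data2
missing_rows <- setdiff(data1, data2)

# If nothing is missing, data1 is fully contained in data2
is_contained <- nrow(missing_rows) == 0
print(is_contained)  # [1] TRUE
```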