How to Use the Reduce Method in R Language
The rows of a dataframe can be sorted in descending order of a column's values with the order() function. order() returns the row indexes in sorted order, and passing those indexes to the dataframe's row index returns the sorted dataframe.
Syntax: order(vec, decreasing = TRUE)
Arguments :
- vec – The vector to sort by, typically a dataframe column such as df$col
- decreasing – If TRUE, the values are ordered from largest to smallest
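As a minimal sketch of how the output of order() is used for row indexing, the example below sorts a small dataframe. The dataframe `scores` and its values are hypothetical and only serve as an illustration.

```r
# Hypothetical example data (not from the article)
scores <- data.frame(name = c("x", "y", "z"), value = c(10, 30, 20))

# order() returns the row positions that would sort the vector
idx <- order(scores$value, decreasing = TRUE)

# indexing the rows with those positions reorders the dataframe
scores_sorted <- scores[idx, ]
print(idx)
print(scores_sorted)
```

Note that order() does not sort the data itself. It only returns positions, which is why the result is passed back into the row index.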
The Reduce() function in base R can be combined with by() to select the top n rows from each group of a dataframe. Reduce() takes a function f of two arguments and a list or vector vec, and combines the elements of vec successively using f. Here, f is rbind(), which binds rows together into a single dataframe. The by() function applies a function to subsets of a dataframe: its first argument is the data, its second is the grouping variable that defines the subsets, and its third is the function to apply. Here, head() is passed as the third argument, with n giving the number of rows to keep from each group. Because the dataframe is sorted in descending order first, the first n rows of each group are the rows with the n highest values.
Syntax: by(df, df$col-name, FUN)
Arguments :
- df – The dataframe to apply the function on
- df$col-name – The column whose values define the groups
- FUN – The function to be applied to each group
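The following sketch shows by() on its own, before any rows are bound back together. The dataframe `df` and its columns `grp` and `val` are hypothetical names chosen for this illustration.

```r
# Hypothetical example data (not from the article), already sorted within groups
df <- data.frame(grp = c("a", "a", "a", "b", "b"), val = c(5, 4, 3, 9, 8))

# apply head(n = 2) to each subset of rows defined by df$grp
top_two <- by(df, df$grp, head, n = 2)

# the result holds one dataframe per group
first_group <- top_two[[1]]
print(first_group)
print(top_two[[2]])
```

Extra arguments such as n = 2 are passed through by() to the applied function, here head().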
The combined function application can be summarized as follows :
Reduce(rbind,by())
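One way to package the combined call is a small helper function. This is only a sketch: the function name `top_n_by_group`, its parameters, and the `sales` data are assumptions made for illustration and do not come from the article.

```r
# Hypothetical helper (name and parameters are assumptions)
top_n_by_group <- function(df, group_col, value_col, n) {
  # sort all rows by the value column, largest first
  sorted <- df[order(df[[value_col]], decreasing = TRUE), ]
  # keep the first n rows of each group
  pieces <- by(sorted, sorted[[group_col]], head, n = n)
  # stack the per-group dataframes into one dataframe
  Reduce(rbind, pieces)
}

# Hypothetical example data
sales <- data.frame(region = c("n", "s", "n", "s", "n"),
                    amount = c(3, 7, 9, 1, 5))
result <- top_n_by_group(sales, "region", "amount", n = 2)
print(result)
```

Reduce() applies rbind() pairwise: it binds the first two group dataframes, then binds that result with the third, and so on. do.call(rbind, pieces) is a common alternative that passes all pieces to rbind() in a single call.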
Code:
R
# creating dataframe
data_frame <- data.frame(col1 = rep(letters[1:4], each = 5),
                         col2 = 1:20,
                         col3 = 20:39)
print("Original DataFrame")
print(data_frame)

# sorting the data by the column
# required in descending order
data_sorted <- data_frame[order(data_frame$col2, decreasing = TRUE), ]

# select top 3 values from each group
data_mod <- Reduce(rbind, by(data_sorted, data_sorted["col1"], head, n = 3))
print("Modified DataFrame")
print(data_mod)
Output:
Select Top N Highest Values by Group in R
In this article, we are going to see how to select the top N highest values by group in the R language.