How to use the group_by() and summarise() methods in R

In dplyr, the group_by() and summarise() methods are used together. group_by() splits the rows of a data frame into groups based on the values of one or more columns, and summarise() reduces each group to a single row of summary values. The grouping columns are passed by name to group_by(); aggregate functions such as n(), mean() or sum() are then applied to each group inside summarise(). The result is a compact table that is easier to read than the raw data.

Syntax: group_by(col-name) 

Syntax: group_by(col, ...) %>% summarise(action)

In the example below, the rows of the data frame are grouped by the value of col3. The count column gives the number of rows in each group; for instance, five rows have col3 = 0. The mean of col1 is then calculated over the rows of each group and stored in mean_col1.

R

# Importing dplyr
library(dplyr)

# Creating a data frame
data_frame <- data.frame(
  col1 = c(2, 4, 1, 7, 5, 3, 5, 8),
  col2 = letters[1:8],
  col3 = c(0, 1, 1, 1, 0, 0, 0, 0),
  col4 = 9:16
)

print("Data Frame")
print(data_frame)

# Group the rows by col3, then compute the
# row count and the mean of col1 per group
data_frame_summary <- data_frame %>%
  group_by(col3) %>%
  summarise(
    count = n(),
    mean_col1 = mean(col1)
  )
print("Summarised Data Frame")
print(data_frame_summary)


Output:

[1] "Data Frame"
  col1 col2 col3 col4
1    2    a    0    9
2    4    b    1   10
3    1    c    1   11
4    7    d    1   12
5    5    e    0   13
6    3    f    0   14
7    5    g    0   15
8    8    h    0   16

[1] "Summarised Data Frame"
# A tibble: 2 x 3
   col3 count mean_col1
  <dbl> <int>     <dbl>
1     0     5       4.6
2     1     3       4  
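
The second syntax form accepts more than one grouping column. The following is a minimal sketch of that case, not part of the original example. It assumes dplyr 1.0.0 or later (for the .groups argument), and the sales data frame and its column names are made up purely for illustration. Each distinct combination of region and year becomes one row of the result.

```r
library(dplyr)

# Hypothetical data, invented for this illustration
sales <- data.frame(
  region = c("east", "east", "east", "west", "west", "west"),
  year   = c(2020, 2020, 2021, 2020, 2021, 2021),
  amount = c(10, 20, 30, 40, 50, 60)
)

# Group by two columns, then compute two summaries per group.
# .groups = "drop" returns an ungrouped tibble (assumes dplyr >= 1.0.0).
sales_summary <- sales %>%
  group_by(region, year) %>%
  summarise(
    count = n(),
    total_amount = sum(amount),
    .groups = "drop"
  )

print(sales_summary)
```

Dropping the grouping at the end is a design choice: without .groups = "drop", summarise() keeps the result grouped by region, which can affect later operations on it.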
