How to Use the data.table Package in R

The data.table package in R is used to store and manipulate tabular data in a well-organized and efficient manner. The package can be downloaded from CRAN and installed into the R library using the following command:

install.packages("data.table")

To get the highest values of each group, the rows are first arranged in descending order of their values using the order() function. order() returns row indexes rather than the sorted data, so its output is used to index the dataframe and obtain the rows in sorted order.

Syntax: order(vec, decreasing = TRUE)

Arguments :

vec – The dataframe column (a vector) whose values determine the ordering

decreasing – The flag that sets the ordering; TRUE arranges the values in descending order
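As a small illustration of the syntax above, the sketch below sorts a dataframe by one column using only base R. The dataframe `scores` and its column names are made up for this example and are not part of the article's data.

```r
# Hypothetical sample data; the names are assumptions for illustration
scores <- data.frame(name = c("p", "q", "r", "s"),
                     value = c(12, 45, 7, 30))

# order() returns the row positions that sort `value` from high to low
idx <- order(scores$value, decreasing = TRUE)
print(idx)

# indexing the rows with those positions yields the sorted dataframe
sorted_scores <- scores[idx, ]
print(sorted_scores)
```

Here `idx` is 2 4 1 3, because the largest value (45) sits in row 2, the next largest (30) in row 4, and so on.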

The dataframe can then be converted into a data table using the data.table() function, passing the grouping column name through the key argument. This has the same effect as calling setkey() on the table afterwards: the table is sorted by the key column, so the rows of each group are stored together.

data.table(df, key = )
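A minimal sketch of the conversion, assuming a small made-up dataframe `df` with a grouping column `grp` (both names are hypothetical):

```r
library(data.table)

# Hypothetical sample data; the names are assumptions for illustration
df <- data.frame(grp = c("b", "a", "b", "a"),
                 val = c(1, 2, 3, 4))

# convert to a data.table and set `grp` as the key in one step
dt <- data.table(df, key = "grp")

# key() reports the key column(s) of the table
print(key(dt))

# the rows are now sorted by the key, so each group is contiguous
print(dt)
```

Setting the key reorders the rows by `grp`, while the rows inside each group keep their original relative order.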

Now, head() together with the special symbol .SD can be used to access the top n rows of each group. .SD holds the subset of the data belonging to the current group, and the by argument names the grouping column. head() takes .SD and an integer value n as arguments.

df[ , head(.SD, 3), by =]
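The grouping step can be sketched on its own as follows. The table `dt` and its columns are invented for this example, and the rows are already in descending order within each group.

```r
library(data.table)

# Hypothetical sample data, already sorted from high to low in each group
dt <- data.table(grp = c("x", "x", "x", "y", "y"),
                 val = c(9, 8, 7, 6, 5))

# for every value of `grp`, .SD is that group's rows; keep the first 2
top2 <- dt[, head(.SD, 2), by = grp]
print(top2)
```

Because head() only takes the first rows it is given, the result contains the highest values only when the data was sorted in descending order beforehand.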

Code:

library("data.table")
 
# creating dataframe
data_frame <- data.frame(col1 = rep(letters[1:4], each = 5),
                         col2 = 1:20,
                         col3 = 20:39)
print("Original DataFrame")
print(data_frame)
 
# sorting the rows by col2 in descending order, so that the
# highest values come first
data_mod <- data_frame[order(data_frame$col2, decreasing = TRUE), ]

# organising the data by group: the key sorts the rows by col1
# and keeps the descending order of col2 inside each group
data_mod <- data.table(data_mod, key = "col1")

# getting the top 2 rows of each group
data_mod <- data_mod[ , head(.SD, 2), by = col1]

# printing the modified data table
print("Modified DataFrame")
print(data_mod)


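Output:

The modified table contains the two rows with the highest col2 values in each col1 group: 5 and 4 for group a, 10 and 9 for group b, 15 and 14 for group c, and 20 and 19 for group d.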



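The same result can also be reached in a single data.table expression. The following is a sketch of one alternative, not the approach used in the code above: it sorts inside the i argument with order(-col2) and skips the key. The variable `n` is a name introduced here for illustration.

```r
library(data.table)

# same data as in the example above, created directly as a data.table
dt <- data.table(col1 = rep(letters[1:4], each = 5),
                 col2 = 1:20,
                 col3 = 20:39)

# number of rows to keep per group (hypothetical helper variable)
n <- 2

# sort by col2 descending, then take the first n rows of each group
top_n <- dt[order(-col2), head(.SD, n), by = col1]
print(top_n)
```

Since no key is set here, the groups appear in the order they are first met in the sorted data (d, c, b, a) rather than alphabetically.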