How to Use Equivalent of SAS PROC FREQ in R

PROC FREQ is the SAS procedure for frequency analysis, providing counts and percentages of the unique values in a dataset. R has no procedure of that name, but we can achieve similar functionality by using different functions and packages. In this article, we’ll explore different methods to replicate PROC FREQ in R.

Below are some ways we can replicate the functionality of SAS PROC FREQ in R.

  1. Using table() function
  2. Using dplyr package
  3. Using data.table package

Method 1: Using table() function

In this example, we use the table() function to create a frequency table, counting the occurrences of each unique value in a vector. It is a simple and quick way to obtain frequency counts for categorical data.

R
# Sample data
data <- c("A", "B", "A", "C", "A", "B", "A", "C", "D")

# Using table() function to get frequency counts
freq_table <- table(data)
print(freq_table)

Output:

data
A B C D
4 2 2 1
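
PROC FREQ's default one-way table also reports percentages and cumulative totals, which table() alone does not print. The sketch below is one possible way to add them in base R using prop.table() and cumsum(); the name freq_df and its column names are chosen here for illustration and are not part of any standard.

```r
# Sample data (same vector as above)
data <- c("A", "B", "A", "C", "A", "B", "A", "C", "D")

# Counts per unique value, then each count as a percentage of the total
freq <- table(data)
pct <- prop.table(freq) * 100

# Assemble a PROC FREQ-style summary (column names are illustrative)
freq_df <- data.frame(
  value       = names(freq),
  count       = as.vector(freq),
  percent     = round(as.vector(pct), 2),
  cum_count   = cumsum(as.vector(freq)),
  cum_percent = round(cumsum(as.vector(pct)), 2)
)
print(freq_df)
```

The last row's cumulative count equals the number of observations, which is a quick check that no values were dropped.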

Method 2: Using dplyr package

In this example, we use the dplyr package to group the data by a specific variable and then count the number of rows in each group.

R
# Load the dplyr package
library(dplyr)

# Sample data frame
df <- data.frame(
  ID = c(101, 102, 103, 101, 104),
  Gender = c("Male", "Female", "Male", "Male", "Female")
)

# Using dplyr package to count occurrences of each gender
df_freq <- df %>% group_by(Gender) %>% summarise(count = n())
print(df_freq)

Output:

# A tibble: 2 × 2
  Gender count
  <chr>  <int>
1 Female     2
2 Male       3
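
The same result can be extended with percentages, closer to what PROC FREQ prints. The following is a minimal sketch, assuming a reasonably recent dplyr in which count() accepts the name argument; the columns percent, cum_count and cum_percent are names introduced here for illustration.

```r
# Load the dplyr package
library(dplyr)

# Sample data frame (same as above)
df <- data.frame(
  ID = c(101, 102, 103, 101, 104),
  Gender = c("Male", "Female", "Male", "Male", "Female")
)

# count() is shorthand for group_by() + summarise(n());
# mutate() then adds percentage and cumulative columns
df_freq <- df %>%
  count(Gender, name = "count") %>%
  mutate(
    percent     = 100 * count / sum(count),
    cum_count   = cumsum(count),
    cum_percent = cumsum(percent)
  )
print(df_freq)
```

Because the counts are already collapsed to one row per group before mutate() runs, sum(count) is the total number of rows in df.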

Method 3: Using data.table package

In this example, we use the data.table package and its special symbol .N, which holds the number of rows in each group, to count the occurrences of each unique value in a data table. The result is similar to that of the table() function, but data.table is optimized for larger datasets.

R
# Load the data.table package
library(data.table)

# Sample data
scores <- c(85, 92, 78, 85, 90, 78, 92, 85, 78, 90)

# Convert data to data.table
dt <- data.table(scores)

# Using data.table package to count occurrences of each score
dt_freq <- dt[, .N, by = .(scores)]
print(dt_freq)

Output:

   scores N
1:     85 3
2:     92 2
3:     78 3
4:     90 2
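
Note that by returns groups in order of first appearance rather than sorted. As a hedged sketch of a more PROC FREQ-like result, keyby can be used instead to sort by the grouping variable, and := can add percentage columns by reference; the column names percent and cum_percent are assumptions made for this illustration.

```r
# Load the data.table package
library(data.table)

# Sample data (same as above)
scores <- c(85, 92, 78, 85, 90, 78, 92, 85, 78, 90)
dt <- data.table(scores)

# keyby groups like by, but also sorts the result by the grouping column
dt_freq <- dt[, .N, keyby = scores]

# Add percentage and cumulative percentage columns in place
dt_freq[, percent := 100 * N / sum(N)]
dt_freq[, cum_percent := cumsum(percent)]
print(dt_freq)
```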

Conclusion

In R, the functionality of SAS PROC FREQ can be replicated through several methods and packages. Whether using the base R table() function or packages like dplyr and data.table, R offers flexibility and efficiency for frequency analysis. With these methods, data analysts can explore the distribution of their data and derive meaningful insights from their datasets.