How to Use the Equivalent of SAS PROC FREQ in R
In SAS, PROC FREQ is used for frequency analysis, providing counts and percentages of the unique values in a dataset. R has no procedure by that name, but similar functionality is available through several functions and packages. In this article, we'll explore different methods to replicate SAS PROC FREQ in R.
Below are some ways we can replicate the functionality of SAS PROC FREQ in R.
- Using table() function
- Using dplyr package
- Using data.table package
Method 1: Using table() function
In this example, we use the table() function to create a frequency table, counting the occurrences of each unique value in a vector. It is a simple and quick way to obtain frequency counts for categorical data.
# Sample data
data <- c("A", "B", "A", "C", "A", "B", "A", "C", "D")
# Using table() function to get frequency counts
freq_table <- table(data)
print(freq_table)
Output:
data
A B C D
4 2 2 1
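PROC FREQ reports percentages alongside counts, while table() on its own returns only counts. The following is a minimal sketch of one way to add them in base R with prop.table(), reusing the same sample vector. The column names value, count, percent and cum_percent are our own choices for illustration, not a fixed convention.

```r
# Sample data (same vector as in the example above)
data <- c("A", "B", "A", "C", "A", "B", "A", "C", "D")

# Frequency counts
freq_table <- table(data)

# Convert counts to percentages of the total
pct_table <- round(100 * prop.table(freq_table), 2)

# Combine counts, percentages and cumulative percentages in one data frame
freq_df <- data.frame(
  value = names(freq_table),
  count = as.vector(freq_table),
  percent = as.vector(pct_table),
  cum_percent = round(100 * cumsum(as.vector(freq_table)) / sum(freq_table), 2)
)
print(freq_df)
```

The percentages are rounded for display only, so they may not sum to exactly 100 after rounding.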
Method 2: Using dplyr package
In this method, we use the dplyr package to group the data by a specific variable and then count the number of rows in each group.
# Load the dplyr package
library(dplyr)
# Sample data frame
df <- data.frame(
ID = c(101, 102, 103, 101, 104),
Gender = c("Male", "Female", "Male", "Male", "Female")
)
# Using dplyr package to count occurrences of each gender
df_freq <- df %>% group_by(Gender) %>% summarise(count = n())
print(df_freq)
Output:
# A tibble: 2 × 2
Gender count
<chr> <int>
1 Female 2
2 Male 3
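As a possible variation, dplyr's count() combines the grouping and counting steps into one call, and mutate() can then add a percentage column similar to what PROC FREQ reports. This is a sketch on the same sample data frame; the column names count and percent are chosen here for illustration.

```r
# Load the dplyr package
library(dplyr)

# Sample data frame (same as in the example above)
df <- data.frame(
  ID = c(101, 102, 103, 101, 104),
  Gender = c("Male", "Female", "Male", "Male", "Female")
)

# count() groups by Gender and counts rows in one step;
# mutate() then adds each group's share of the total as a percentage
df_freq <- df %>%
  count(Gender, name = "count") %>%
  mutate(percent = round(100 * count / sum(count), 1))
print(df_freq)
```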
Method 3: Using data.table package
In this method, we use the data.table package and its .N special symbol, which holds the number of rows in each group, to count the occurrences of each unique value in a data table. The result is similar to that of the table() function, but data.table is optimized for larger datasets.
# Load the data.table package
library(data.table)
# Sample data
scores <- c(85, 92, 78, 85, 90, 78, 92, 85, 78, 90)
# Convert data to data.table
dt <- data.table(scores)
# Using data.table package to count occurrences of each score
dt_freq <- dt[, .N, by = .(scores)]
print(dt_freq)
Output:
scores N
1: 85 3
2: 92 2
3: 78 3
4: 90 2
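Note that the groups above appear in order of first appearance rather than sorted by value. The sketch below shows one way to sort the result and add a percentage column, assuming the same sample scores; the column name percent is our own choice.

```r
# Load the data.table package
library(data.table)

# Sample data (same as in the example above)
scores <- c(85, 92, 78, 85, 90, 78, 92, 85, 78, 90)
dt <- data.table(scores)

# Count each score, then chain a second [ ] to sort by score value
dt_freq <- dt[, .N, by = scores][order(scores)]

# Add a percentage column by reference with :=
dt_freq[, percent := 100 * N / sum(N)]
print(dt_freq)
```

Because := modifies dt_freq in place, no reassignment is needed after adding the column.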
Conclusion
In R, the functionality of SAS PROC FREQ can be replicated through various methods and packages. Whether using base R functions like table() and summary(), or packages like dplyr and data.table, R offers flexibility and efficiency for frequency analysis. With these methods, data analysts can explore the distribution of their data and derive meaningful insights from their datasets.