Read Large JSON files in R using read_json()

read_json() is a function from the jsonlite package that reads a JSON file from disk and converts it into an R object. It is a thin wrapper around fromJSON() that returns a nested list by default. Note that read_json() parses the whole file in one call, so the result has to fit in memory; for line-by-line processing of newline-delimited JSON, jsonlite provides stream_in() instead.

Install the jsonlite package and load it

jsonlite is one of the most popular R packages for reading JSON files. It provides a simple and efficient way to parse JSON data and convert it into an R object. To install and load jsonlite, you can use the following commands:

install.packages("jsonlite")
library(jsonlite)
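If the script is run repeatedly, it can be convenient to install the package only when it is missing. A minimal sketch of that pattern, assuming a CRAN mirror is already configured for the session:

```r
# Install jsonlite only if it is not already available, then load it.
if (!requireNamespace("jsonlite", quietly = TRUE)) {
  install.packages("jsonlite")
}
library(jsonlite)

# Print the installed version as a quick sanity check.
print(packageVersion("jsonlite"))
```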

Creating a Random Dataset

Here we create our own dataset. You can create your own as well, or use any large JSON dataset available online.

R




library(jsonlite)
 
# generate random id
generate_id <- function() paste0(sample(c(letters,
                                LETTERS, 0:9), 10,
                                replace=TRUE),
                                 collapse="")
 
# real first names of people
first_names <- c("John", "Jane", "Michael",
                 "Emily", "William", "Ashley",
                 "David", "Jessica", "Andrew",
                 "Jennifer",
                 "Matthew", "Sarah", "Daniel",
                 "Amanda", "Christopher", "Elizabeth",
                 "Nicholas", "Megan", "Robert",
                 "Lauren", "Joseph", "Ava", "Jacob",
                 "Sophia", "Jonathan", "Natalie", "Ryan",
                 "Madison", "Adam", "Chloe")
 
# real last names of people
last_names <- c("Smith", "Johnson", "Williams", "Jones",
                "Brown", "Davis", "Miller", "Wilson",
                "Moore", "Taylor",
                "Anderson", "Thomas", "Jackson", "White",
                "Harris", "Martin", "Thompson", "Garcia",
                "Martinez", "Robinson",
                "Clark", "Rodriguez", "Lewis", "Lee", "Walker",
                "Hall", "Allen", "King", "Wright", "Scott")
 
# education qualifications
qualifications <- c("Primary Education", "Secondary Education",
                    "High School", "Undergraduate", "Postgraduate")
 
# create a data frame
df <- data.frame(ID = sapply(1:1000000,
                       function(i) generate_id()),
                 First_Name = sample(first_names,
                            1000000, replace = TRUE),
                 Last_Name = sample(last_names,
                             1000000, replace = TRUE),
                 Age = sample(18:30, 1000000,
                              replace = TRUE),
                 Highest_qualification =
                 sample(qualifications, 1000000,
                        replace = TRUE),
                 stringsAsFactors = FALSE)
 
# write the data frame to a JSON file
write_json(df, "people.json")
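The script above calls generate_id() once per row through sapply(), which can take a while for one million rows. The following is a sketch of a vectorised alternative, shown on a smaller sample; the names generate_ids, small_df and small_path, the seed, and the row count of 1000 are illustrative assumptions, not part of the original script.

```r
library(jsonlite)

# Hypothetical vectorised helper: draw all characters at once, arrange them
# in an n x len matrix, and paste the columns together row by row.
generate_ids <- function(n, len = 10) {
  chars <- c(letters, LETTERS, 0:9)
  m <- matrix(sample(chars, n * len, replace = TRUE), nrow = n)
  do.call(paste0, lapply(seq_len(len), function(j) m[, j]))
}

set.seed(42)  # assumed seed, only to make the sample reproducible
n <- 1000     # small illustrative size; raise it to build a larger file

small_df <- data.frame(
  ID = generate_ids(n),
  Age = sample(18:30, n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Write the sample to a temporary JSON file.
small_path <- tempfile(fileext = ".json")
write_json(small_df, small_path)
```

Because the sampling happens in a single call, the cost no longer grows with one function call per row.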


You can check the size of the file, in bytes, using the following code. The exact value varies slightly between runs because the data is generated randomly.

R




file.info("people.json")$size


Output:

113428352
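A raw byte count is hard to read at a glance. As a small sketch, the value above can be converted to mebibytes with plain arithmetic; bytes_to_mib is a hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper: convert a size in bytes to mebibytes (1 MiB = 1024^2 bytes).
bytes_to_mib <- function(bytes) bytes / 1024^2

# The size reported above, rounded to one decimal place.
size_mib <- round(bytes_to_mib(113428352), 1)
print(size_mib)  # 108.2
```

The same helper can be applied directly to file.size("people.json").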

Read the JSON file into R

The read_json() function parses the JSON file and converts it into an R object. By default it returns a nested list; with simplifyVector = TRUE, an array of records such as the one in people.json is converted into a data frame. Once you have the data in an R object, you can use all the standard R functions and packages to manipulate and analyze it.
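Whether the result is a list or a data frame depends on the simplifyVector argument of read_json(). A minimal sketch, assuming a tiny hand-written JSON file (the temporary path and the two records are made up for illustration):

```r
library(jsonlite)

# Write a two-record JSON array to a temporary file.
path <- tempfile(fileext = ".json")
writeLines('[{"ID":"a1","Age":21},{"ID":"b2","Age":25}]', path)

# Default behaviour: a nested list with one element per record.
as_list <- read_json(path)

# With simplification: the array of records becomes a data frame.
as_df <- read_json(path, simplifyVector = TRUE)

str(as_list[[1]])
str(as_df)
```

For tabular data the data frame form is usually easier to work with, while the list form preserves the original nesting exactly.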

You can use the read_json() function to read a JSON file into R. For example, to read the people.json file created above from your working directory, you would use the following code:

R

# read the whole file into a nested list
data <- jsonlite::read_json("people.json")
# show the first three records
head(data, 3)


How to Read a Large JSON File in R

First, it is important to understand that JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. JSON files are often used for data transmission between a server and a web application and can be quite large.

In this article, we’ll cover the basics of using read_json and split to read large JSON files in R. We’ll also explore some advanced techniques for optimizing performance and reducing memory usage. Whether you’re a seasoned R programmer or a beginner, this article will provide you with the knowledge and skills you need to read large JSON files in R with confidence.
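One way to reduce memory usage is to stream the data instead of parsing it in one call. The sketch below assumes the data is stored as newline-delimited JSON (one record per line), which jsonlite writes with stream_out() and reads page by page with stream_in(). The data frame, the temporary path, the page size of 2 and the running totals are illustrative assumptions.

```r
library(jsonlite)

# Small illustrative data set; in practice this would be the large data frame.
people <- data.frame(
  ID = c("a1", "b2", "c3", "d4", "e5"),
  Age = c(21L, 25L, 30L, 19L, 28L),
  stringsAsFactors = FALSE
)

# Write one JSON record per line (NDJSON) to a temporary file.
ndjson_path <- tempfile(fileext = ".ndjson")
out_con <- file(ndjson_path, open = "w")
stream_out(people, out_con, verbose = FALSE)
close(out_con)

# Read the file back two records at a time. The handler receives each page
# as a data frame, so only the running totals stay in memory.
total_rows <- 0
age_sum <- 0
stream_in(
  file(ndjson_path),
  handler = function(page) {
    total_rows <<- total_rows + nrow(page)
    age_sum <<- age_sum + sum(page$Age)
  },
  pagesize = 2,
  verbose = FALSE
)

mean_age <- age_sum / total_rows
print(mean_age)  # 24.6
```

Note that this only works for NDJSON; a single JSON array such as the one produced by write_json() still has to be parsed as a whole.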
