Fetching Data from API using urllib Library

The code below imports the pandas, urllib.request, and json libraries, along with the files helper from google.colab, and initializes an empty pandas DataFrame called df. A for loop then walks through pages 1 to 399 of the TMDb API’s top rated movies endpoint. On each iteration, the code constructs a URL that specifies the API key, the language, and the page number to retrieve. It sends a GET request to that URL using urllib.request.urlopen() and stores the result in a variable called response. The json library parses the response body into a dictionary called data.

The code then creates a temporary DataFrame temp_df from the list stored under the ‘results’ key of the data dictionary, keeping only the columns ‘id’, ‘title’, ‘overview’, ‘release_date’, ‘popularity’, ‘vote_average’, and ‘vote_count’. The temporary DataFrame is appended to the final DataFrame df. After the loop completes, the code prints the shape of df and its first five rows, and saves the DataFrame as a CSV file. Finally, it calls files.download() to download the CSV file to the local machine; this function is part of Google Colab, so that last step only works when the code runs in a Colab notebook.

Note: The API key used in this code is an example and might not work. To use this code, you will need to obtain a valid API key from TMDb and use that in the URL.
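One way to follow the note above without pasting a key into the source is to read it from the environment and let urllib.parse.urlencode assemble the query string. The sketch below is only an illustration: the helper name build_url and the environment variable name TMDB_API_KEY are assumptions made for this example, not part of the original code or of the TMDb API.

```python
import os
from urllib.parse import urlencode

# Endpoint taken from the article's code
BASE_URL = "https://api.themoviedb.org/3/movie/top_rated"


def build_url(page, api_key=None):
    """Build the request URL for one page of top rated movies."""
    # TMDB_API_KEY is a hypothetical variable name chosen for this sketch
    key = api_key or os.environ.get("TMDB_API_KEY", "")
    if not key:
        raise ValueError("No API key supplied; obtain one from TMDb first")
    # urlencode escapes the values and joins them with '&'
    query = urlencode({"api_key": key, "language": "en-US", "page": page})
    return f"{BASE_URL}?{query}"
```

With a helper like this, the loop could call build_url(i) instead of formatting the URL string by hand.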

Python3




# Importing required libraries
# (google.colab is only available when running inside Google Colab)
from google.colab import files
import pandas as pd
import urllib.request
import json
 
# Creating an empty DataFrame to store movie data
df = pd.DataFrame()
 
# Looping through pages of movie data
for i in range(1, 400):
    # Constructing the API url with page number
    # (adjacent string literals are joined, so no stray whitespace ends up in the URL)
    url = ('https://api.themoviedb.org/3/movie/top_rated'
           '?api_key=aaa7de53dcab3a19afed86880f364e54'
           '&language=en-US&page={}').format(i)
    # Making a request to the API
    response = urllib.request.urlopen(url)
    # Loading the API response into a dictionary
    data = json.loads(response.read().decode())
    # Creating a DataFrame from the 'results' key in the API response
    temp_df = pd.DataFrame(data['results'])[
        ['id', 'title', 'overview', 'release_date',
         'popularity', 'vote_average', 'vote_count']]
    # Appending the temporary DataFrame to the main DataFrame
    # (pd.concat is used because DataFrame.append was removed in pandas 2.0)
    df = pd.concat([df, temp_df], ignore_index=True)
 
# Printing the shape of the final DataFrame
print(df.shape)
# Printing the first five rows of the final DataFrame
print(df.head(5))
# Saving the final DataFrame as a CSV file
df.to_csv('movie_example2.csv', index=False)
# Downloading the final CSV file to the local machine
files.download('movie_example2.csv')
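The loop above hard-codes 399 pages and stops with an exception if any single request fails. As a hedged alternative, the sketch below wraps the request in basic error handling and stops once the last page reported by the API has been read. It assumes the response carries a ‘total_pages’ field next to ‘results’; the function names fetch_page and collect_results and the make_url parameter are hypothetical names introduced for this example.

```python
import json
import urllib.request


def fetch_page(url):
    """Return the parsed JSON body for one page, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return json.loads(response.read().decode())
    except OSError as err:
        # URLError, HTTPError and socket timeouts are all OSError subclasses
        print(f"Request failed for {url}: {err}")
        return None


def collect_results(make_url, fetch=fetch_page, max_pages=399):
    """Gather the 'results' rows page by page, stopping at the last page."""
    rows = []
    page = 1
    while page <= max_pages:
        data = fetch(make_url(page))
        if data is None:
            break  # give up on the first failed request
        rows.extend(data.get("results", []))
        # Assumption: the response reports the number of pages in 'total_pages'
        if page >= data.get("total_pages", max_pages):
            break
        page += 1
    return rows
```

The returned list of dictionaries can be passed to pd.DataFrame() once, which also avoids growing a DataFrame inside the loop. Here make_url is any function that maps a page number to a request URL.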


Save API data into CSV format using Python

In this article, we will see how to fetch data from an API and save it as a CSV file, which can then be used for tasks such as data analysis or training a machine learning model. Sometimes we want to fetch data from our own database API to train a model; pulling the data this way lets us retrain on up-to-date records, which helps keep the model’s predictions accurate. Here we use the requests library in Python to fetch data from our API.
