Creating a PRAW Instance
To connect to Reddit, we need to create a PRAW instance. There are two types of PRAW instances:
- Read-only Instance: With a read-only instance, we can only retrieve publicly available information from Reddit. For example, retrieving the top 5 posts from a particular subreddit.
- Authorized Instance: With an authorized instance, we can do everything we can do from our own Reddit account, such as upvoting, posting, and commenting.
Python3
import praw

# Read-only instance
reddit_read_only = praw.Reddit(
    client_id="",      # your client id
    client_secret="",  # your client secret
    user_agent="",     # your user agent
)

# Authorized instance
reddit_authorized = praw.Reddit(
    client_id="",      # your client id
    client_secret="",  # your client secret
    user_agent="",     # your user agent
    username="",       # your reddit username
    password="",       # your reddit password
)
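As a quick sanity check, it can help to confirm which kind of instance a given object is. The sketch below relies on the `read_only` attribute that PRAW exposes on `praw.Reddit` objects. The helper name `instance_kind` is hypothetical and not part of PRAW.

```python
def instance_kind(reddit):
    """Return a label for the kind of PRAW instance passed in.

    `reddit` is assumed to be a praw.Reddit object (or anything exposing
    a boolean `read_only` attribute). This helper is illustrative only.
    """
    return "read-only" if reddit.read_only else "authorized"


# Example usage (assumes the instances were created with valid credentials):
# print(instance_kind(reddit_read_only))
# print(instance_kind(reddit_authorized))
```

If credentials are missing or wrong, PRAW may only raise an error once a request is actually made, so this check says which mode the instance is in, not whether the credentials are valid.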
Now that we have created an instance, we can use Reddit's API to extract data. In this tutorial, we will use only the read-only instance.
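As an illustration of what a read-only instance can do, here is a minimal sketch that collects the titles of the top posts in a subreddit, along the lines of the "top 5 posts" example mentioned earlier. The function name `top_post_titles` and the subreddit name in the usage comment are assumptions made for this example. The calls `reddit.subreddit(...)`, `subreddit.top(limit=...)` and the `title` attribute come from PRAW.

```python
def top_post_titles(reddit, subreddit_name, limit=5):
    """Return the titles of the top `limit` posts in a subreddit.

    `reddit` is assumed to be a praw.Reddit instance; a read-only
    instance is enough because only public data is read.
    """
    subreddit = reddit.subreddit(subreddit_name)
    # subreddit.top() yields submissions lazily; limit caps how many are fetched
    return [post.title for post in subreddit.top(limit=limit)]


# Example usage (requires valid credentials and network access):
# for title in top_post_titles(reddit_read_only, "Python", limit=5):
#     print(title)
```

Keeping the instance as a parameter means the same helper works with either a read-only or an authorized instance.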
Scraping Reddit using Python
In this article, we are going to see how to scrape Reddit using Python. We will use Python's PRAW (Python Reddit API Wrapper) module, which lets Python scripts access the Reddit API.