Extract title from a webpage using Python

In this article, we will write Python scripts that extract the title of a webpage from its URL.

Method 1: In this method, we will use the bs4 and requests modules. Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files. It is not part of the Python standard library. To install it, run the command below in the terminal.

pip install bs4

The requests module lets you send HTTP/1.1 requests with very little code. It is also not part of the Python standard library. To install it, run the command below in the terminal.

pip install requests

Approach:

  • Import the modules.
  • Fetch the page with requests.get(), passing in the URL.
  • Pass the response text to the BeautifulSoup() constructor.
  • Find every 'title' tag with find_all('title') and print its text.

Code:

# importing the modules
import requests
from bs4 import BeautifulSoup
 
# target url (placeholder; replace with the page you want to read)
url = "https://example.com"
 
# fetching the page
reqs = requests.get(url)
 
# using the BeautifulSoup module
soup = BeautifulSoup(reqs.text, 'html.parser')
 
# displaying the title
print("Title of the website is : ")
for title in soup.find_all('title'):
    print(title.get_text())


Output:

Title of the website is : 
w3wiki | A computer science portal for Beginner
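
The script above assumes that the request succeeds and that the page contains a <title> tag. A minimal sketch of a more defensive variant is shown below. The helper names title_from_html and fetch_title and the 10-second timeout are illustrative assumptions, not part of the original script.

```python
import requests
from bs4 import BeautifulSoup


def title_from_html(html):
    """Return the text of the first <title> tag, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    if soup.title is None:
        return None
    # strip=True removes surrounding whitespace and newlines from the title
    return soup.title.get_text(strip=True)


def fetch_title(url, timeout=10):
    """Download a page and return its title (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    # raise an exception on 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()
    return title_from_html(response.text)
```

Calling fetch_title(url) with the target URL would return the title as a string, or None when the page has no <title> tag.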

Method 2: In this method, we will use the urllib and BeautifulSoup modules to extract the title of the website. urllib is a standard-library package for opening URLs and reading webpages from within a program.

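Installation:

urllib ships with the Python standard library, so it needs no separate installation. Only bs4 has to be installed:

pip install bs4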

Approach:

  • Import the modules.
  • Open the URL with urllib.request.urlopen(URL).
  • Parse the HTML document and read its title with soup.title.

Implementation:

# importing the modules
from urllib.request import urlopen
from bs4 import BeautifulSoup
 
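# target url (placeholder; replace with the page you want to read)
url = "https://example.com"
 
# using the BeautifulSoup module (naming the parser avoids a bs4 warning)
soup = BeautifulSoup(urlopen(url), 'html.parser')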
 
# displaying the title
print("Title of the website is : ")
print(soup.title.get_text())


Output:

Title of the website is : 
w3wiki | A computer science portal for Beginner
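
Because urllib is part of the standard library, the title can also be read without any third-party package by pairing it with the standard html.parser module. This is a different technique from the bs4 approach above, shown only as a minimal sketch. The names TitleParser, parse_title, and fetch_title_stdlib, the 10-second timeout, and the UTF-8 fallback are illustrative assumptions.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # only the first <title> is recorded
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        # text may arrive in several chunks, so collect all of them
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        text = "".join(self._parts).strip()
        return text or None


def parse_title(html):
    """Return the page title, or None if it is missing or empty."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


def fetch_title_stdlib(url, timeout=10):
    """Download a page with urllib and return its title (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        # use the charset from the Content-Type header, falling back to UTF-8
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return parse_title(html)
```

This trades the convenience of soup.title for zero extra dependencies; Beautiful Soup is still the more forgiving choice for badly formed HTML.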

Method 3: In this method, we will use the mechanize module, which provides stateful programmatic web browsing in Python. It can open pages, fill in HTML forms, and follow links from a script.

Installation:

pip install mechanize

Approach:

  • Import the module.
  • Create a Browser() instance.
  • Retrieve the webpage content with Browser.open().
  • Display the title with Browser.title().

Implementation:

# importing the module
from mechanize import Browser
 
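# target url (placeholder; replace with the page you want to read)
url = "https://example.com"
 
# creating a Browser instance and opening the page
br = Browser()
br.open(url)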
 
# displaying the title
print("Title of the website is : ")
print(br.title())


Output:

Title of the website is : 
w3wiki | A computer science portal for Beginner