Python program to extract Email-id from URL text file

Prerequisite: Pattern Matching with Python Regex
Given a text file hosted at a URL, the task is to extract all the email-ids from that file and print them. The urllib.request library can be used to handle the URL-related work.

Example :

Input : 
Hello
This is w3wiki
review-team@w3wiki.net
review-team@w3wiki.net
GfG is a portal for Beginner
feedback@w3wiki.net
careers@w3wiki.net

Output :
[]
[]
['review-team@w3wiki.net']
['review-team@w3wiki.net']
[]
['feedback@w3wiki.net']
['careers@w3wiki.net']

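A text file at a URL can be opened using urllib.request, and the re library can be used to extract the emails with a regular expression. The program reads the file line by line and prints the list of matches for each line, so a line with no email-id prints an empty list. For more details on regular expressions, see the prerequisite listed above.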

# library that handles the URL stuff 
import urllib.request 
  
# Importing module required for 
# regular expressions 
import re 
  
# Open the URL; the returned response object 
# can be iterated line by line like a file 
fhand = urllib.request.urlopen(
    'https://media.w3wiki.net/wp-content/uploads/e-mail-1.txt') 
  
for line in fhand: 
    # Getting the text file 
    # content line by line. 
    s = line.decode().strip() 
  
    # regex for extracting all email-ids 
    # from the text file 
    reg = re.findall(r"[A-Za-z0-9._%+-]+"
                     r"@[A-Za-z0-9.-]+"
                     r"\.[A-Za-z]{2,4}", s) 
  
    # printing the list output 
    print(reg) 

Output :

[]
[]
['review-team@w3wiki.net']
['review-team@w3wiki.net']
[]
['feedback@w3wiki.net']
['careers@w3wiki.net']
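
The program above prints one list per line, so duplicates and empty lists appear in the output. As a variation, the following is a minimal sketch that gathers the unique email-ids from the whole file into a single list. The helper name collect_emails and the in-memory sample list are assumptions made for illustration; the sample stands in for the byte lines that urllib.request.urlopen would yield, so the sketch runs without network access. The regular expression is the same one used above.

```python
import re

# Same pattern as in the program above, compiled once for reuse.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}")


def collect_emails(lines):
    """Return unique email-ids from an iterable of byte lines,
    in the order they are first seen. (Hypothetical helper.)"""
    seen = {}
    for raw in lines:
        # Decode each byte line, replacing any undecodable bytes.
        text = raw.decode("utf-8", errors="replace").strip()
        for email in EMAIL_RE.findall(text):
            # A dict preserves insertion order and drops repeats.
            seen.setdefault(email, None)
    return list(seen)


# Stand-in for the response object returned by urlopen (assumed data,
# copied from the example input above).
sample = [
    b"Hello\n",
    b"This is w3wiki\n",
    b"review-team@w3wiki.net\n",
    b"review-team@w3wiki.net\n",
    b"GfG is a portal for Beginner\n",
    b"feedback@w3wiki.net\n",
    b"careers@w3wiki.net\n",
]

print(collect_emails(sample))
# ['review-team@w3wiki.net', 'feedback@w3wiki.net', 'careers@w3wiki.net']
```

To run it against the real file, the response object from urllib.request.urlopen can be passed in place of sample, since it also yields byte lines when iterated. Note that the {2,4} quantifier in the pattern limits the final domain label to four letters, so addresses with longer top-level domains would be cut short.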
