Python program to extract Email-id from URL text file
Prerequisite : Pattern Matching with Python Regex
Given the URL of a text file, the task is to extract all the email-ids from that text file and print them. The urllib.request
library can be used to handle all the URL-related work.
Example :
Input :
Hello
This is w3wiki
review-team@w3wiki.net
review-team@w3wiki.net
GfG is a portal for Beginner
feedback@w3wiki.net
careers@w3wiki.net

Output :
[]
[]
['review-team@w3wiki.net']
['review-team@w3wiki.net']
[]
['feedback@w3wiki.net']
['careers@w3wiki.net']
The URL text file can be opened and read using urllib.request. To extract the emails with regular expressions, the re library can be used. For more details on regular expressions, refer to the prerequisite listed above.
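Before fetching anything over the network, the regular expression can be tried on in-memory strings. The following is a minimal sketch: the name EMAIL_PATTERN and the sample lines are assumptions made for illustration, and only the pattern itself comes from this article. Note that the {2,4} quantifier limits the top-level domain to four letters, so longer ones would be cut short.

```python
import re

# Same pattern as the one used in this article, compiled once for reuse.
# EMAIL_PATTERN is a hypothetical name chosen for this sketch.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}")

# Illustrative sample lines (not the contents of the real URL text file).
sample_lines = [
    "Hello",
    "review-team@w3wiki.net",
    "Contact feedback@w3wiki.net or careers@w3wiki.net",
]

# findall returns a list of every non-overlapping match in each line.
results = [EMAIL_PATTERN.findall(line) for line in sample_lines]
for found in results:
    print(found)
```

A line without an email-id yields an empty list, and a line with several email-ids yields all of them in one list.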
# library that handles the URL stuff
import urllib.request

# Importing module required for
# regular expressions
import re

# Assign urlopen to a file object variable
fhand = urllib.request.urlopen(
    'https://media.w3wiki.net/wp-content/uploads/e-mail-1.txt')

for line in fhand:

    # Getting the text file
    # content line by line.
    # Each line arrives as bytes, so decode it first.
    s = line.decode().strip()

    # regex for extracting all email-ids
    # from the text file
    reg = re.findall(r"[A-Za-z0-9._%+-]+"
                     r"@[A-Za-z0-9.-]+"
                     r"\.[A-Za-z]{2,4}", s)

    # printing the list output
    print(reg)
Output :
[]
[]
['review-team@w3wiki.net']
['review-team@w3wiki.net']
[]
['feedback@w3wiki.net']
['careers@w3wiki.net']
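The output above prints one list per line, so empty lists and repeated email-ids show up as well. If a single de-duplicated list is preferred, one possible variation is sketched below. The helper name extract_unique_emails and the stand-in data fake_response are hypothetical and not part of the original program; the helper accepts any iterable of bytes lines, which is what iterating over the object returned by urllib.request.urlopen produces.

```python
import re

# Same pattern as in the program above.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}")


def extract_unique_emails(lines):
    """Return the unique email-ids, in first-seen order, from an iterable of bytes lines."""
    seen = []
    for raw in lines:
        # Decode each bytes line to text before matching.
        text = raw.decode().strip()
        for email in EMAIL_PATTERN.findall(text):
            if email not in seen:
                seen.append(email)
    return seen


# Stand-in for a network response: a list of bytes lines (illustrative data only).
fake_response = [
    b"Hello\n",
    b"review-team@w3wiki.net\n",
    b"review-team@w3wiki.net\n",
    b"feedback@w3wiki.net\n",
]

unique = extract_unique_emails(fake_response)
print(unique)
```

With a real URL, the same helper could be called as extract_unique_emails(fhand) inside a "with urllib.request.urlopen(url) as fhand:" block, which also closes the connection once reading is done.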