Python | Split URL from Query Parameters

Sometimes, during web development, we need to separate a URL from its query parameters, which begin at the ‘?’ character. This task comes up in web development as well as in other domains that involve URLs. Let's discuss several ways in which it can be performed.

Method #1 : Using split() 
This is one of the ways in which we can solve this problem. We split the string on ‘?’ and take the first part of the split as the result.
 

Python3




# Python3 code to demonstrate working of
# Split URL from Query Parameters
# Using split()
 
# initializing string
test_str = 'www.w3wiki.net?is = best'
 
# printing original string
print("The original string is : " + str(test_str))
 
# Split URL from Query Parameters
# Using split()
res = test_str.split('?')[0]
 
# printing result
print("The base URL is : " + res)


Output : 

The original string is : www.w3wiki.net?is = best
The base URL is : www.w3wiki.net

 

Time Complexity: O(n), where n is the length of the string (the cost of split())

Auxiliary Space: O(n)
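
When the query string is needed as well as the base URL, a single call can return both halves. The following is a minimal sketch rather than part of the original example: split_url is a hypothetical helper name, and it uses str.partition(), which splits only at the first ‘?’ and still works when no ‘?’ is present.

```python
def split_url(url):
    """Return (base, query), split at the first '?'.

    split_url is a hypothetical helper name used only for illustration.
    """
    # partition() returns (before, separator, after); if '?' is absent,
    # the whole string comes back as `before` and the other two are empty.
    base, _, query = url.partition('?')
    return base, query


base, query = split_url('www.w3wiki.net?is=best')
print("The base URL is : " + base)
print("The query string is : " + query)
```

Because partition() stops at the first separator, any further ‘?’ characters stay inside the query part.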

 
Method #2 : Using rfind() 
This is another way in which we can perform this task. Here, we use rfind() to locate the last occurrence of ‘?’ (the first one found when searching from the right) and slice the string up to that index.
 

Python3




# Python3 code to demonstrate working of
# Split URL from Query Parameters
# Using rfind()
 
# initializing string
test_str = 'www.w3wiki.net?is = best'
 
# printing original string
print("The original string is : " + str(test_str))
 
# Split URL from Query Parameters
# Using rfind()
res = test_str[:test_str.rfind('?')]
 
# printing result
print("The base URL is : " + res)


Output : 

The original string is : www.w3wiki.net?is = best
The base URL is : www.w3wiki.net

 

Time Complexity: O(n)

Auxiliary Space: O(n)
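
One case worth checking with this approach is a URL that has no ‘?’ at all. Both rfind() and find() return -1 when nothing is found, and slicing with -1 silently drops the last character. The sketch below illustrates the problem and one possible guard. base_url is a hypothetical helper name, and it uses find() instead of rfind() on the assumption that the query should start at the first ‘?’.

```python
def base_url(url):
    """Return the part of `url` before the first '?', or `url` unchanged.

    base_url is a hypothetical helper name used only for illustration.
    """
    idx = url.find('?')          # -1 when '?' is not present
    return url if idx == -1 else url[:idx]


no_query = 'www.w3wiki.net'

# Unguarded slice: rfind() returns -1, so the final character is lost
print(no_query[:no_query.rfind('?')])

# Guarded version returns the string unchanged
print(base_url(no_query))
```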

Method #3 : Using index()
Here, we find the index of ‘?’ using index() and then use string slicing to extract the part before it.

Python3




# Python3 code to demonstrate working of
# Split URL from Query Parameters
# Using index()
 
# initializing string
test_str = 'www.w3wiki.net?is = best'
 
# printing original string
print("The original string is : " + str(test_str))
 
# Split URL from Query Parameters
# Using index()
res = test_str[0:test_str.index('?')]
 
# printing result
print("The base URL is : " + res)


Output

The original string is : www.w3wiki.net?is = best
The base URL is : www.w3wiki.net

Time Complexity: O(n)

Auxiliary Space: O(n)
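
Unlike find(), index() raises a ValueError when the substring is missing, so a URL without a query string makes the code above fail. A minimal sketch of one way to handle that case follows. strip_query is a hypothetical helper name, and returning the input unchanged is an assumption about the desired behaviour.

```python
def strip_query(url):
    """Return `url` without its query string.

    strip_query is a hypothetical helper name used only for illustration.
    """
    try:
        # index() raises ValueError if '?' does not occur in the string
        return url[:url.index('?')]
    except ValueError:
        # Assumed behaviour: a URL with no query is returned as-is
        return url


print(strip_query('www.w3wiki.net?is = best'))
print(strip_query('www.w3wiki.net'))
```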

Method #4 : Using operator.getitem() and index()

Approach 

  1. Find the index of ‘?’ using the index() method.
  2. Use operator.getitem() with slice() to extract the substring from the start (0) up to that index, and assign it to the res variable.
  3. Display the res variable.

Python3




# Python3 code to demonstrate working of
# Split URL from Query Parameters
# Using operator.getitem() and index()
import operator

# initializing string
test_str = 'www.w3wiki.net?is = best'

# printing original string
print("The original string is : " + str(test_str))

# Split URL from Query Parameters
# Using operator.getitem() with a slice up to the index of '?'
res = operator.getitem(test_str, slice(0, test_str.index('?')))
 
# printing result
print("The base URL is : " + res)


Output

The original string is : www.w3wiki.net?is = best
The base URL is : www.w3wiki.net

Time Complexity: O(n)

Auxiliary Space: O(n)
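
To see how this method relates to ordinary slicing, the short sketch below checks that operator.getitem() with a slice object gives the same result as the bracket syntax. The variable names are chosen here for illustration only.

```python
import operator

url = 'www.w3wiki.net?is = best'
end = url.index('?')

# A slice object is what the [start:stop] syntax builds internally
via_operator = operator.getitem(url, slice(0, end))
via_brackets = url[0:end]

print(via_operator == via_brackets)
```

The functional form is mainly useful when the indexing step has to be passed around as a callable, for example to map().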

Method #5 : Using urlparse()

1. Import the urlparse function from the urllib.parse module.
2. Define the input URL string as test_str.
3. Use urlparse() to parse test_str into a ParseResult object.
4. Use the _replace() method to create a new ParseResult object with the query component set to None.
5. Use the geturl() method to build a new URL string from the modified ParseResult object.
6. Print the new URL string.

Python3




# Importing the urlparse function from the urllib.parse module
from urllib.parse import urlparse
 
# Defining the input URL string
test_str = 'http://www.w3wiki.net?is=best'
 
# Using the urlparse function to parse the input URL into its component parts
parsed_url = urlparse(test_str)
# printing original string
print("The original string is : " + str(test_str))
 
# Using the _replace method to create a new parsed URL object with the query parameter set to None
# This effectively removes the query parameter from the URL
new_parsed_url = parsed_url._replace(query=None)
 
# Using the geturl method to generate a new URL string from the modified parsed URL object
new_url_str = new_parsed_url.geturl()
 
# Printing the new URL string
print(new_url_str)


Output

The original string is : http://www.w3wiki.net?is=best
http://www.w3wiki.net

Time complexity:

Parsing the URL using the urlparse function has a time complexity of O(n), where n is the length of the input string.
Using the _replace method has a time complexity of O(1), as it simply creates a new ParseResult object with a modified query parameter.
Using the geturl method has a time complexity of O(n), where n is the length of the output URL string.
Overall, the time complexity of this code is O(n), where n is the length of the input and output strings.

Auxiliary Space:

The space complexity of this code is O(n), where n is the length of the input and output strings.
This is because the urlparse function creates a new ParseResult object that stores the various components of the URL (such as the scheme, netloc, path, query, and fragment).
The _replace method creates a new ParseResult object with a modified query parameter, and the geturl method generates a new URL string from this object.
Thus, the amount of space required by this code is proportional to the length of the input and output strings.
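
An advantage of parsing the URL rather than searching for ‘?’ is that the query string can also be decoded into individual parameters. The sketch below is one possible variation, not the method shown above: it uses urlsplit(), urlunsplit() and parse_qs() from the same urllib.parse module, and the variable names are chosen here for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs

url = 'http://www.w3wiki.net?is=best'
parts = urlsplit(url)

# Rebuild the URL with empty query and fragment components
base = urlunsplit((parts.scheme, parts.netloc, parts.path, '', ''))

# parse_qs() maps each parameter name to a list of its values
params = parse_qs(parts.query)

print("The base URL is : " + base)
print("The query parameters are : " + str(params))
```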

Method #6: Using re.split()

  • Import the re module.
  • Define a regular expression pattern that matches the literal ? character, which marks the start of the query parameters. It is written as \? because ? is a regex metacharacter.
  • Use the re.split() function to split the URL using the regular expression pattern.
  • The first element of the resulting list will be the base URL.

Python3




# Python3 code to demonstrate working of
# Split URL from Query Parameters
# Using re.split()
 
# import re module
import re
 
# initializing string
test_str = 'www.w3wiki.net?is = best'
 
# printing original string
print("The original string is : " + str(test_str))
 
# Split URL from Query Parameters
# Using re.split()
pattern = r'\?'  # escaped '?' so that it matches the literal character
res = re.split(pattern, test_str)[0]
 
# printing result
print("The base URL is : " + res)


Output

The original string is : www.w3wiki.net?is = best
The base URL is : www.w3wiki.net

Time complexity: The time complexity of this method is O(n), where n is the length of the input string.
Auxiliary space: The space complexity of this method is O(n), where n is the length of the input string.
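
If the query itself may contain further ‘?’ characters, re.split() can be limited to a single split with its maxsplit argument, so that everything after the first ‘?’ stays together. The following is a small sketch under that assumption. The sample URL and variable names are made up for illustration.

```python
import re

# Sample URL whose query value contains a second '?'
url = 'www.w3wiki.net?is=best&q=a?b'

# maxsplit=1 stops after the first match, giving at most two parts
parts = re.split(r'\?', url, maxsplit=1)

base = parts[0]
# A URL without '?' yields a one-element list, so guard the lookup
query = parts[1] if len(parts) > 1 else ''

print("The base URL is : " + base)
print("The query string is : " + query)
```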