Convert Unicode to Bytes in Python
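Unicode is a standard for text encoding, closely aligned with the Universal Character Set (UCS) defined in ISO/IEC 10646. Its primary objective is to provide a universal character set that can represent text in any language or writing system. It assigns each character a unique number, called a code point; an encoding such as UTF-8 or UTF-16 then defines how those code points are stored as bytes. In Python 3, `str` objects hold Unicode text and `bytes` objects hold raw bytes, so converting from one to the other means choosing an encoding.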
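The difference between a code point and its byte representation can be seen in a short sketch. The sample string and variable names here are illustrative assumptions, not part of the original examples.

```python
# "\u00e9" is é and "\u4f60" is 你; escapes are used so the source file's own encoding does not matter.
text = "\u00e9\u4f60"

# Each character maps to one Unicode code point (an integer).
code_points = [ord(ch) for ch in text]
print(code_points)  # [233, 20320]

# UTF-8 stores those code points as a variable number of bytes.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)  # b'\xc3\xa9\xe4\xbd\xa0'

# Two characters, five bytes.
print(len(text), len(utf8_bytes))  # 2 5
```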
Convert Unicode to Byte in Python
Below are the ways to convert a Unicode string to a byte string in Python.
Convert A Unicode to Byte Using encode() with UTF-8
In this example, the Unicode string “Hello, w3wiki” is encoded into bytes using the UTF-8 encoding with the `encode()` method. The resulting `bytes_representation` is a sequence of bytes representing the UTF-8 encoded version of the original string.
Python3
unicode_string = "Hello, w3wiki"
bytes_representation = unicode_string.encode('utf-8')
print(bytes_representation)
b'Hello, w3wiki'
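As a quick check on the result, the bytes can be decoded back to the original string. The sketch below reuses the names from the example above; `round_tripped` is an added, illustrative name.

```python
unicode_string = "Hello, w3wiki"
bytes_representation = unicode_string.encode("utf-8")

# The result is a bytes object, not a str.
print(type(bytes_representation).__name__)  # bytes

# Indexing or iterating over bytes yields integers (the byte values).
print(list(bytes_representation[:5]))  # [72, 101, 108, 108, 111]

# Decoding with the same encoding recovers the original text.
round_tripped = bytes_representation.decode("utf-8")
print(round_tripped == unicode_string)  # True
```

For pure ASCII text such as this, the UTF-8 bytes look identical to the characters, which is why the printed output reads like the original string with a `b` prefix.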
Unicode To Byte Using encode() with a Different Encoding
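In this example, the Unicode string is encoded into a byte string using the UTF-16 encoding with the `encode()` method. The resulting `byte_string_utf16` contains the UTF-16 encoded representation of the original string, which is then printed to the console. The leading `\xff\xfe` in the output is the byte order mark (BOM) that the `'utf-16'` codec prepends; here it indicates little-endian byte order, and each ASCII character takes two bytes.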
Python3
unicode_string = "Hello, Noida"
byte_string_utf16 = unicode_string.encode('utf-16')

# Displaying the byte string
print(byte_string_utf16)
b'\xff\xfeH\x00e\x00l\x00l\x00o\x00,\x00 \x00N\x00o\x00i\x00d\x00a\x00'
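The first two bytes of that output are a byte order mark. A minimal sketch, assuming you want to control the byte order explicitly, compares the `'utf-16'` codec with the `'utf-16-le'` and `'utf-16-be'` variants, which do not write a BOM. The variable names are illustrative.

```python
import codecs

unicode_string = "Hello, Noida"

# 'utf-16' prepends a BOM and uses the platform's native byte order.
with_bom = unicode_string.encode("utf-16")

# The explicit-endian codecs write no BOM.
little_endian = unicode_string.encode("utf-16-le")
big_endian = unicode_string.encode("utf-16-be")

print(little_endian[:4])  # b'H\x00e\x00'
print(big_endian[:4])     # b'\x00H\x00e'

# The BOM accounts for exactly two extra bytes.
print(len(with_bom) - len(little_endian))  # 2
print(with_bom[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))  # True
```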
Convert A Unicode String To Byte String Using bytes() Constructor
In this example, the Unicode string is converted to a byte string using the bytes() constructor with UTF-8 encoding. The resulting `byte_string_bytes` represents the UTF-8 encoded byte sequence of the mixed-language Unicode string.
Python3
unicode_string = "Hello, 你好"
byte_string_bytes = bytes(unicode_string, 'utf-8')

# Displaying the byte string
print(byte_string_bytes)
b'Hello, \xe4\xbd\xa0\xe5\xa5\xbd'
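One detail worth checking when using the constructor: for a `str` argument, `bytes()` needs an encoding, and it produces the same result as `encode()`. The sketch below illustrates both points; `via_constructor`, `via_method`, and `raised` are illustrative names.

```python
unicode_string = "Hello, 你好"

# bytes(str, encoding) and str.encode(encoding) give the same bytes.
via_constructor = bytes(unicode_string, "utf-8")
via_method = unicode_string.encode("utf-8")
print(via_constructor == via_method)  # True

# Each of the two Chinese characters takes three bytes in UTF-8: 7 + 3 + 3.
print(len(via_constructor))  # 13

# Omitting the encoding for a str argument raises TypeError.
raised = False
try:
    bytes(unicode_string)
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```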
Python Unicode String to Byte String Using str.encode() Method
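In this example, the Unicode string is transformed into a byte string by calling `str.encode()` on the class, passing the string as the first argument and the UTF-8 encoding as the second. This is the same method as in the first example, only invoked as `str.encode(unicode_string, 'utf-8')` instead of `unicode_string.encode('utf-8')`. The resulting `byte_string_str_encode` holds the UTF-8 encoded byte sequence of the string.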
Python3
unicode_string = "Hello, Shivang"
byte_string_str_encode = str.encode(unicode_string, 'utf-8')

# Displaying the byte string
print(byte_string_str_encode)
b'Hello, Shivang'
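To confirm that the class-level call is just another spelling of the instance method, the following sketch compares the two forms, along with a call that relies on the default encoding (UTF-8 in Python 3). The variable names are illustrative.

```python
unicode_string = "Hello, Shivang"

# Calling the method on the class, with the string passed explicitly.
via_class = str.encode(unicode_string, "utf-8")

# Calling the same method on the instance.
via_instance = unicode_string.encode("utf-8")

# With no argument, encode() defaults to UTF-8.
via_default = unicode_string.encode()

print(via_class == via_instance == via_default)  # True
```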