Standard charsets
Standard Charsets | Description |
---|---|
US-ASCII | Seven-bit ASCII. Covers the basic English alphabet, digits, punctuation, and some control characters. |
ISO-8859-1 | ISO Latin Alphabet No. 1. Covers the Latin script and some common symbols. |
UTF-8 | Eight-bit UCS Transformation Format. A variable-width encoding (one to four bytes per character) that can represent every Unicode character, and therefore text in any language. |
UTF-16BE | Sixteen-bit UCS Transformation Format in which characters are encoded using big-endian byte order. |
UTF-16LE | Sixteen-bit UCS Transformation Format in which characters are encoded using little-endian byte order. |
UTF-16 | Sixteen-bit UCS Transformation Format in which the byte order is identified by an optional byte-order mark. Often used for internal text processing. |
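To make the differences in the table concrete, here is a minimal sketch that encodes the single character é (U+00E9) with each of the six standard charsets and prints the resulting bytes in hex. The class name `StandardCharsetDemo` and the helper `hex` are illustrative names, not part of the Java API; the charset constants come from `java.nio.charset.StandardCharsets`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StandardCharsetDemo {

    // Hypothetical helper: encodes text with the given charset and
    // returns the bytes as space-separated, two-digit uppercase hex.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset[] charsets = {
            StandardCharsets.US_ASCII,
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE,
            StandardCharsets.UTF_16LE,
            StandardCharsets.UTF_16
        };
        String text = "\u00E9"; // the character é
        for (Charset cs : charsets) {
            System.out.println(cs.name() + ": " + hex(text, cs));
        }
    }
}
```

US-ASCII cannot represent é, so `getBytes` substitutes the replacement byte `3F` (a question mark). ISO-8859-1 yields the single byte `E9`, UTF-8 yields `C3 A9`, and the two fixed-order UTF-16 variants yield `00 E9` and `E9 00`. When encoding, Java's UTF-16 charset writes a big-endian byte-order mark first, giving `FE FF 00 E9`.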
Using a charset
Encoding a string into a sequence of bytes
The getBytes method of String encodes the string into a sequence of bytes using the given charset and stores the result in a new byte array.
public byte[] getBytes(Charset charset);
Example:
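// "US-ASCII" is the canonical name; "ASCII" is an accepted alias for it.
java.nio.charset.Charset charset = java.nio.charset.Charset.forName("US-ASCII");
// Encode "Hi" into a new byte array: {72, 105}.
byte[] byteArray = "Hi".getBytes(charset);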
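As a complement to the example above, the following sketch encodes a string and then decodes the bytes back with the same charset. It assumes the `StandardCharsets.US_ASCII` constant is used instead of a name lookup, which avoids the possibility of a misspelled charset name; `EncodeRoundTrip`, `encode`, and `decode` are illustrative names only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodeRoundTrip {

    // Encode text into bytes using US-ASCII.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    // Decode US-ASCII bytes back into a String.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] bytes = encode("Hi");
        System.out.println(Arrays.toString(bytes)); // prints [72, 105]
        System.out.println(decode(bytes));          // prints Hi
    }
}
```

The round trip only preserves the text when every character is representable in the chosen charset. Characters outside US-ASCII would be replaced during encoding and could not be recovered on decoding.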
java.nio.charset.Charset Class in Java
In Java, a Charset is a named mapping between sequences of 16-bit Unicode code units and sequences of bytes. It is used to encode string data into bytes, and to decode bytes back into strings, in a particular character encoding. The class belongs to the java.nio.charset package.
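A charset name must begin with a letter or a digit. Every charset supports decoding, and nearly all support encoding; the canEncode() method reports whether a given charset does. The static method Charset.availableCharsets() builds a sorted map from canonical names to every charset available in the current JVM (Java Virtual Machine).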
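The lookup facilities described above can be sketched as follows. The example uses `Charset.availableCharsets()`, `Charset.isSupported`, `Charset.forName`, `aliases()`, and `canEncode()` from the standard library; the class `CharsetLookup` and the helper `isUsable` are hypothetical names introduced for illustration. The number of charsets printed depends on the JVM, so no specific count is assumed.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.SortedMap;

class CharsetLookup {

    // Hypothetical helper: true only if the name is legal AND the
    // charset is available in this JVM. Illegal names (for example,
    // ones that do not start with a letter or digit) count as unusable.
    static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Map from canonical charset name to Charset object.
        SortedMap<String, Charset> all = Charset.availableCharsets();
        System.out.println("Charsets available: " + all.size());

        Charset utf8 = Charset.forName("UTF-8");
        System.out.println("Name: " + utf8.name());
        System.out.println("Aliases: " + utf8.aliases());
        System.out.println("Can encode: " + utf8.canEncode());

        System.out.println(isUsable("UTF-8"));               // prints true
        System.out.println(isUsable("no-such-charset-xyz")); // prints false
        System.out.println(isUsable("-bad"));                // prints false
    }
}
```

Checking with `isSupported` before calling `Charset.forName` is one way to avoid an `UnsupportedCharsetException` when the charset name comes from user input or configuration.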