java.nio.charset.CharsetEncoder Class in Java


For character encoding and decoding, Java offers a number of classes in the ‘java.nio.charset’ package. The ‘CharsetEncoder’ class of this package performs the encoding. In this article, we look at this class, its syntax, its methods, and some examples of error handling and optimization techniques.

What is a CharsetEncoder?

The ‘CharsetEncoder’ class belongs to the ‘java.nio.charset’ package.

The basic function of the class is to convert a sequence of characters into a sequence of bytes in a particular character encoding, known as a Charset. It is commonly used for activities such as writing textual data to files, transmitting data over the network, and converting text from one character encoding to another (together with its counterpart, CharsetDecoder).

CharsetEncoder translates character input to byte output. Java's internal character representation, UTF-16, is converted into the byte representation of the chosen character encoding (e.g. UTF-8).
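
To make the character-to-byte translation concrete, the sketch below encodes the same two-character string with two different charsets and compares the number of bytes produced. The class name EncodingComparison and its encode helper are hypothetical names chosen for this illustration; they are not part of the JDK.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class used only for this illustration.
class EncodingComparison {

    // Encodes the text with a fresh encoder for the given charset and
    // returns exactly the bytes that the encoder produced.
    static byte[] encode(String text, Charset charset) {
        try {
            ByteBuffer encoded = charset.newEncoder().encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[encoded.remaining()];
            encoded.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            // Rethrown unchecked to keep the sketch short
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 'A' followed by U+00E9 (e with acute accent)
        String text = "A\u00e9";

        // UTF-8 uses 1 byte for 'A' and 2 bytes for U+00E9
        System.out.println("UTF-8 bytes:    " + encode(text, StandardCharsets.UTF_8).length);

        // UTF-16BE uses 2 bytes for each of these characters
        System.out.println("UTF-16BE bytes: " + encode(text, StandardCharsets.UTF_16BE).length);
    }
}
```

The same characters yield 3 bytes in UTF-8 and 4 bytes in UTF-16BE, which is why the charset has to be chosen explicitly whenever text leaves the program.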

Syntax of CharsetEncoder

public abstract class CharsetEncoder extends Object

Constructors of CharsetEncoder

The constructors of CharsetEncoder and their descriptions. Both constructors are protected, so application code normally obtains an encoder from Charset.newEncoder() instead of calling them directly.

Constructor	Modifier	Description
CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar)	protected	Initializes a new encoder for the given Charset with the specified average and maximum number of bytes per character.
CharsetEncoder(Charset cs, float averageBytesPerChar, float maxBytesPerChar, byte[] replacement)	protected	Initializes a new encoder for the given Charset with the specified average and maximum number of bytes per character, and with the given replacement byte sequence for characters that cannot be mapped.
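
Because the constructors are protected, a typical program never calls them. As a minimal sketch, the snippet below obtains an encoder through Charset.newEncoder() and prints the two sizing values that the constructor received. The class name EncoderFactoryDemo and the method utf8Encoder are hypothetical, and the printed values depend on the charset implementation shipped with your JDK.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class used only for this illustration.
class EncoderFactoryDemo {

    // Obtains a UTF-8 encoder through the factory method on Charset.
    static CharsetEncoder utf8Encoder() {
        return StandardCharsets.UTF_8.newEncoder();
    }

    public static void main(String[] args) {
        CharsetEncoder encoder = utf8Encoder();

        // The charset that created this encoder
        System.out.println("Charset: " + encoder.charset());

        // Sizing values passed to the protected constructor by the
        // charset implementation; exact values vary by implementation
        System.out.println("Average bytes per char: " + encoder.averageBytesPerChar());
        System.out.println("Maximum bytes per char: " + encoder.maxBytesPerChar());
    }
}
```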

Methods of CharsetEncoder

The methods of CharsetEncoder and their descriptions.

Modifier and Type	Method	Description
float	averageBytesPerChar()	Returns the average number of bytes that will be generated for every input character.
boolean	canEncode(char c)	Indicates if the specified character can be encoded by this encoder.
boolean	canEncode(CharSequence cs)	Indicates if the provided character sequence can be encoded by this encoder.
Charset	charset()	Returns the charset that created this encoder.
ByteBuffer	encode(CharBuffer in)	Encodes the remaining content of a single input character buffer into a newly allocated byte buffer.
CoderResult	encode(CharBuffer in, ByteBuffer out, boolean endOfInput)	Writes the results to the specified output buffer after encoding as many characters as possible from the provided input buffer.
protected abstract CoderResult	encodeLoop(CharBuffer in, ByteBuffer out)	Encodes one or more characters into one or more bytes.
CoderResult	flush(ByteBuffer out)	Flushes the encoder.
protected CoderResult	implFlush(ByteBuffer out)	Flushes this encoder; charset implementations override this method to write any final bytes.
protected void	implReset()	Resets this encoder, clearing any charset-specific internal state.
boolean	isLegalReplacement(byte[] repl)	Indicates if the provided byte array is a valid replacement value for this encoder.
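float	maxBytesPerChar()	Returns the maximum number of bytes that can be generated for each input character.
CharsetEncoder	onMalformedInput(CodingErrorAction newAction)	Changes this encoder's action for malformed-input errors.
CharsetEncoder	onUnmappableCharacter(CodingErrorAction newAction)	Changes this encoder's action for unmappable-character errors.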
CharsetEncoder	reset()	Resets the encoder, clearing any internal state.
byte[]	replacement()	Returns the replacement value for this encoder.
CharsetEncoder	replaceWith(byte[] newReplacement)	Modifies the replacement value of this encoder.
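
Several of these methods are meant to be used together. The sketch below first checks a character with canEncode(), then drives the three-argument encode() and flush() with a deliberately small output buffer, so the OVERFLOW result shows up and has to be handled. The class name StepwiseEncoding, the methods encodeInChunks and drain, and the minimum chunk size of 4 bytes are assumptions made for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class used only for this illustration.
class StepwiseEncoding {

    // Encodes text as UTF-8 through an output buffer of chunkSize bytes.
    static byte[] encodeInChunks(String text, int chunkSize) {
        // Assumption: 4 bytes is enough room for any single UTF-8 sequence,
        // so every call to encode() can make progress
        if (chunkSize < 4) {
            throw new IllegalArgumentException("chunkSize must be at least 4");
        }
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer in = CharBuffer.wrap(text);
        ByteBuffer out = ByteBuffer.allocate(chunkSize);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            CoderResult result;
            do {
                // endOfInput is true because 'in' holds the whole text
                result = encoder.encode(in, out, true);
                if (result.isError()) {
                    result.throwException();
                }
                drain(out, sink);
            } while (result.isOverflow()); // output was full: encode again

            do {
                // Let the encoder write any final bytes it still holds
                result = encoder.flush(out);
                drain(out, sink);
            } while (result.isOverflow());
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Copies the bytes written so far into the sink and empties the buffer.
    static void drain(ByteBuffer out, ByteArrayOutputStream sink) {
        out.flip();
        sink.write(out.array(), out.arrayOffset() + out.position(), out.remaining());
        out.clear();
    }

    public static void main(String[] args) {
        // U+03A9 is the Greek capital letter omega
        System.out.println(StandardCharsets.US_ASCII.newEncoder().canEncode('\u03a9'));
        System.out.println(StandardCharsets.UTF_8.newEncoder().canEncode('\u03a9'));
        System.out.println(encodeInChunks("h\u00e9llo", 4).length + " bytes");
    }
}
```

The single-argument encode(CharBuffer) runs a similar loop internally and grows its own buffer, so the three-argument form is mainly useful when you want to control the output buffer yourself.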

Inherited Methods

The CharsetEncoder class inherits the methods of java.lang.Object.

Examples of CharsetEncoder Class

Example 1: Basic use of CharsetEncoder

In this example, the input string is encoded into bytes using a CharsetEncoder with the UTF-8 character encoding.

It shows how to obtain a CharsetEncoder, wrap the input text in a CharBuffer, encode the characters, and print the encoded data. It includes basic error handling for any issues that come up during encoding.

Java

// Java Program to construct a
// CharsetEncoder using CharBuffer
import java.nio.*;
import java.nio.charset.*;

// Driver class
public class Main {

    // Main method
    public static void main(String[] args) {

        // Create a Charset
        Charset ch = StandardCharsets.UTF_8;

        // Initialize a CharsetEncoder
        CharsetEncoder ec = ch.newEncoder();

        // Input string
        String str = "CharsetEncoder Example";

        // Wrap the input text in a CharBuffer
        CharBuffer charBuffer = CharBuffer.wrap(str);

        try {
            // Encode the characters
            ByteBuffer bf = ec.encode(charBuffer);

            // Decode only the bytes that were written. The backing
            // array can be larger than the encoded data, so
            // new String(bf.array()) may append stray characters.
            String ans = ch.decode(bf).toString();
            System.out.println(ans);
        }
        catch (CharacterCodingException e) {
            // Handle the exception
            e.printStackTrace();
        }
    }
}

Output:

CharsetEncoder Example

Example 2: Error Handling

Not every charset can represent every character. UTF-8 can encode any Unicode character, but a smaller charset such as US-ASCII cannot. When an encoder meets a character that its charset cannot represent, the error has to be handled. In the example below, the input string contains the symbol ‘Ω’, which is not mappable in US-ASCII. We call ‘onUnmappableCharacter‘ with ‘CodingErrorAction.REPLACE‘ so that unmappable characters are replaced with the encoder's replacement bytes, which we set with ‘replaceWith‘.

In the code below, every ‘Ω’ is replaced by ‘?‘, which serves as the fallback character.

Java

// Java Program for Error handling
// Using onUnmappableCharacter
import java.nio.*;
import java.nio.charset.*;

// Driver Class
public class Main {

    // Main method
    public static void main(String[] args) {

        // Create a Charset that cannot represent Ω
        Charset ch = StandardCharsets.US_ASCII;

        // Initialize a CharsetEncoder
        CharsetEncoder ec = ch.newEncoder();

        // Input string; the escape is Ω, an unmappable character
        String str = "Charset \u03A9 Encoder";

        // Handle the error by replacing the unmappable
        // character with a question mark
        ec.onUnmappableCharacter(CodingErrorAction.REPLACE);
        ec.replaceWith(new byte[] { '?' });

        // Wrap the string into a CharBuffer
        CharBuffer cb = CharBuffer.wrap(str);

        try {
            // Encode the characters
            ByteBuffer bf = ec.encode(cb);

            // Decode only the bytes that were written
            String ans = ch.decode(bf).toString();
            System.out.println("Encoded String: " + ans);
        }
        catch (CharacterCodingException e) {
            // Handle the exception
            System.err.println("Error: " + e.getMessage());
        }
    }
}

Output:

Encoded String: Charset ? Encoder
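
Replacement is not the only option. When silently losing a character is unacceptable, the encoder can report the problem instead. The sketch below assumes a US-ASCII encoder, which cannot represent ‘Ω’, configured with CodingErrorAction.REPORT so that encode() throws an UnmappableCharacterException. The class name StrictEncoding and the method encodesCleanly are hypothetical names for this illustration.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnmappableCharacterException;

// Hypothetical demo class used only for this illustration.
class StrictEncoding {

    // Returns true only if the whole text can be encoded as US-ASCII.
    static boolean encodesCleanly(String text) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            encoder.encode(CharBuffer.wrap(text));
            return true;
        } catch (UnmappableCharacterException e) {
            // A valid character has no representation in this charset
            System.err.println("Unmappable input, length " + e.getInputLength());
            return false;
        } catch (CharacterCodingException e) {
            // Any other coding error, such as malformed input
            System.err.println("Coding error: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(encodesCleanly("Charset Encoder"));
        // U+03A9 is the Greek capital letter omega
        System.out.println(encodesCleanly("Charset \u03a9 Encoder"));
    }
}
```

The first call prints true and the second prints false. CodingErrorAction also offers IGNORE, which drops the offending input without writing any replacement.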

How to Optimize the Encoding?

Now that we have seen how encoding works with the CharsetEncoder class, it is worth looking at how to improve efficiency and performance when dealing with larger volumes of data.

Buffer Management: Allocate the CharBuffer and ByteBuffer yourself and size them to hold the expected data, which avoids frequent reallocations.

Reuse Buffers: Instead of creating new instances of CharBuffer and ByteBuffer every time, consider reusing them across encoding operations. This reduces memory allocation.

Bulk Encoding: Pass encode() a CharBuffer that contains all the characters to be encoded, rather than encoding a few characters at a time. This keeps the number of encoding calls to a minimum.

Precompute Buffer Size: If you know the approximate size of the encoded data in bytes, allocate the ByteBuffer with that capacity or slightly more to prevent unnecessary resizing. The encoder's maxBytesPerChar() gives an upper bound on the bytes needed per input character.
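
The techniques above can be combined in one small class. This is a minimal sketch, assuming a single-threaded caller and UTF-8 output: it keeps one encoder and one ByteBuffer, sizes the buffer from maxBytesPerChar(), and encodes each string in a single call. The class name ReusableEncoder, the methods encodeInto and buffer, and the initial capacity of 64 bytes are hypothetical choices for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not thread-safe, because the encoder
// and the output buffer are shared between calls.
class ReusableEncoder {

    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
    private ByteBuffer out = ByteBuffer.allocate(64);

    // Encodes the text into the shared buffer and returns the number of
    // bytes written. The buffer content is valid until the next call.
    int encodeInto(String text) {
        // Precompute the worst-case size so the buffer never overflows
        int needed = (int) Math.ceil(text.length() * (double) encoder.maxBytesPerChar());
        if (out.capacity() < needed) {
            out = ByteBuffer.allocate(needed); // grow only when required
        }
        out.clear();
        encoder.reset(); // make the encoder ready for a new operation

        // Bulk encoding: the whole text goes through one encode() call
        CoderResult result = encoder.encode(CharBuffer.wrap(text), out, true);
        if (!result.isUnderflow()) {
            throw new IllegalArgumentException("Cannot encode input: " + result);
        }
        encoder.flush(out);
        out.flip();
        return out.remaining();
    }

    // Gives read access to the most recently encoded bytes.
    ByteBuffer buffer() {
        return out.asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        ReusableEncoder reusable = new ReusableEncoder();
        for (String line : new String[] { "first line", "second line", "\u03a9" }) {
            System.out.println(reusable.encodeInto(line) + " bytes");
        }
    }
}
```

Sizing by maxBytesPerChar() trades some memory for the guarantee that a single encode() call finishes without an OVERFLOW result. Whether that trade is worthwhile depends on the size of your inputs.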

In this article, we covered the main methods and some best practices related to the CharsetEncoder class. From syntax and constructors to error handling and optimization techniques, we explored how to use this class for character encoding tasks in Java applications.
