OutputStreamWriter and InputStreamReader

The concrete OutputStreamWriter class (a Writer subclass) is a bridge between an incoming sequence of characters and an outgoing stream of bytes. Characters written to this writer are encoded into bytes according to the default or specified character encoding.

NOTE: The default character encoding is accessible via the file.encoding system property.

Each call to an OutputStreamWriter write() method causes an encoder to be called on the given character(s). The resulting bytes are accumulated in a buffer before being written to the underlying output stream. The characters passed to the write() methods are not buffered.
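The following sketch is not one of the book's listings; it only illustrates the buffering behavior just described. It assumes an in-memory ByteArrayOutputStream as the underlying stream, and the FlushDemo class and writeAndFlush() method are hypothetical names. How many bytes appear before the flush depends on the writer's internal buffer, so I make no claim about that number.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;

class FlushDemo {
    // Writes text through an OutputStreamWriter and returns the bytes that
    // have reached the underlying stream once flush() has been called.
    static byte[] writeAndFlush(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStreamWriter osw = new OutputStreamWriter(sink, "UTF-8")) {
            osw.write(text);
            // The encoded bytes may still be held in the writer's byte buffer.
            System.out.println("Before flush: " + sink.size() + " byte(s)");
            osw.flush();
            // After flush(), every encoded byte has been handed to the sink.
            System.out.println("After flush:  " + sink.size() + " byte(s)");
            return sink.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        writeAndFlush("hello");
    }
}
```

Calling flush() (or close(), which flushes first) is what guarantees that the encoded bytes reach the underlying stream.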

OutputStreamWriter declares four constructors, including the following:

■ OutputStreamWriter(OutputStream out) creates a bridge between an incoming sequence of characters (passed to OutputStreamWriter via its append() and write() methods) and underlying output stream out. The default character encoding is used to encode characters into bytes.

■ OutputStreamWriter(OutputStream out, String charsetName) creates a bridge between an incoming sequence of characters (passed to OutputStreamWriter via its append() and write() methods) and underlying output stream out. charsetName identifies the character encoding used to encode characters into bytes. This constructor throws UnsupportedEncodingException when the named character encoding is not supported.
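As a minimal sketch of the second constructor (not taken from the book), the following code encodes the Polish letter ą (U+0105) with two named encodings and shows how an unsupported name is reported. The WriterConstructors class and its methods are hypothetical names, and I assume that the runtime supports ISO-8859-2.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;

class WriterConstructors {
    // Encodes text with the named character encoding and returns the bytes.
    static byte[] encode(String text, String charsetName) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStreamWriter osw = new OutputStreamWriter(sink, charsetName)) {
            osw.write(text);
        } // close() flushes the encoded bytes to sink
        return sink.toByteArray();
    }

    // Returns false when the constructor rejects the encoding name.
    static boolean isSupportedByWriter(String charsetName) throws IOException {
        try (OutputStreamWriter osw =
                 new OutputStreamWriter(new ByteArrayOutputStream(), charsetName)) {
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // ISO-8859-2 needs one byte for U+0105; UTF-8 needs two.
        System.out.println(encode("\u0105", "ISO-8859-2").length);
        System.out.println(encode("\u0105", "UTF-8").length);
        System.out.println(isSupportedByWriter("no-such-charset"));
    }
}
```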

NOTE: OutputStreamWriter depends on the abstract java.nio.charset.Charset and java.nio.charset.CharsetEncoder classes to perform character encoding.

Listing 10-25 uses the second constructor to create a bridge to an underlying file output stream so that Polish text can be written to an ISO/IEC 8859-2-encoded file.

Listing 10-25. Outputting Polish text

FileOutputStream fos = new FileOutputStream("polish.txt");
OutputStreamWriter osw = new OutputStreamWriter(fos, "8859_2");

The concrete InputStreamReader class (a Reader subclass) is a bridge between an incoming stream of bytes and an outgoing sequence of characters. Characters read from this reader are decoded from bytes according to the default or specified character encoding.

Each call to an InputStreamReader read() method may cause one or more bytes to be read from the underlying input stream. To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation.
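To observe the read-ahead behavior, the following sketch (my own, with the hypothetical names ReadAheadDemo and readOneChar()) reads a single character and then asks the underlying stream how many bytes it still holds. The exact count depends on the reader's implementation, so treat the printed value as an observation rather than a guarantee.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class ReadAheadDemo {
    // Reads one character and returns { the character, bytes still unread
    // in the underlying stream }.
    static int[] readOneChar(byte[] data) throws IOException {
        ByteArrayInputStream source = new ByteArrayInputStream(data);
        try (InputStreamReader isr = new InputStreamReader(source, "US-ASCII")) {
            int ch = isr.read();
            return new int[] { ch, source.available() };
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "abcdef".getBytes(StandardCharsets.US_ASCII);
        int[] result = readOneChar(data);
        System.out.println("Character read: " + (char) result[0]);
        // Fewer than 5 remaining bytes means the reader consumed more than
        // the single byte it needed for the first character.
        System.out.println("Bytes left in source: " + result[1]);
    }
}
```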

InputStreamReader declares four constructors, including the following:

■ InputStreamReader(InputStream in) creates a bridge between underlying input stream in and an outgoing sequence of characters (returned from InputStreamReader via its read() methods). The default character encoding is used to decode bytes into characters.

■ InputStreamReader(InputStream in, String charsetName) creates a bridge between underlying input stream in and an outgoing sequence of characters (returned from InputStreamReader via its read() methods). charsetName identifies the character encoding used to decode bytes into characters. This constructor throws UnsupportedEncodingException when the named character encoding is not supported.
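The choice of encoding determines which characters the same bytes become. The following sketch (not from the book; ReaderConstructors and decode() are hypothetical names) decodes the single byte 0xB1 with two encodings, assuming that the runtime supports ISO-8859-2.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

class ReaderConstructors {
    // Decodes bytes with the named character encoding and returns the text.
    static String decode(byte[] bytes, String charsetName) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (InputStreamReader isr =
                 new InputStreamReader(new ByteArrayInputStream(bytes), charsetName)) {
            int ch;
            while ((ch = isr.read()) != -1) // read() returns -1 at end of stream
                sb.append((char) ch);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = { (byte) 0xB1 };
        // 0xB1 is U+0105 (a with ogonek) in ISO-8859-2 ...
        System.out.println((int) decode(bytes, "ISO-8859-2").charAt(0));
        // ... but U+00B1 (plus-minus sign) in ISO-8859-1.
        System.out.println((int) decode(bytes, "ISO-8859-1").charAt(0));
    }
}
```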

NOTE: InputStreamReader depends on the abstract Charset and java.nio.charset.CharsetDecoder classes to perform character decoding.

Listing 10-26 uses the second constructor to create a bridge to an underlying file input stream so that Polish text can be read from an ISO/IEC 8859-2-encoded file.

Listing 10-26. Inputting Polish text

FileInputStream fis = new FileInputStream("polish.txt");
InputStreamReader isr = new InputStreamReader(fis, "8859_2");
int ch = isr.read(); // the next character as an int, or -1 at end of stream
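Putting the two bridges together, the following sketch writes Polish text to a file and reads it back with the same encoding. It is my own illustration rather than one of the book's listings: the PolishRoundTrip class, the roundTrip() method, and the temporary file are all assumptions, and try-with-resources is used so that both streams are closed.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;

class PolishRoundTrip {
    // Writes text to a scratch file and reads it back, using the same
    // named character encoding in both directions.
    static String roundTrip(String text, String charsetName) throws IOException {
        Path file = Files.createTempFile("polish", ".txt"); // hypothetical scratch file
        try {
            try (Writer w = new OutputStreamWriter(
                     new FileOutputStream(file.toFile()), charsetName)) {
                w.write(text);
            }
            StringBuilder sb = new StringBuilder();
            try (Reader r = new InputStreamReader(
                     new FileInputStream(file.toFile()), charsetName)) {
                int ch;
                while ((ch = r.read()) != -1)
                    sb.append((char) ch);
            }
            return sb.toString();
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "Dzi\u0119kuj\u0119"; // "Thank you" in Polish
        System.out.println(text.equals(roundTrip(text, "8859_2")));
    }
}
```

Reading the file with a different encoding than the one used to write it would not fail; it would silently produce the wrong characters, which is why both sides must agree on the encoding.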

NOTE: OutputStreamWriter and InputStreamReader declare a String getEncoding() method that returns the name of the character encoding in use. If the encoding has a historical name, that name is returned; otherwise, the encoding's canonical name is returned.
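A short sketch of getEncoding() follows. The EncodingNames class and writerEncoding() method are hypothetical names of my own. On the JDK implementations I am aware of, UTF-8 has the historical name UTF8; the name reported for other encodings may vary, so the code simply prints it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;

class EncodingNames {
    // Returns the name that an open OutputStreamWriter reports for the
    // requested character encoding.
    static String writerEncoding(String charsetName) throws IOException {
        try (OutputStreamWriter osw =
                 new OutputStreamWriter(new ByteArrayOutputStream(), charsetName)) {
            return osw.getEncoding();
        }
    }

    public static void main(String[] args) throws IOException {
        for (String name : new String[] { "UTF-8", "ISO-8859-2" }) {
            // Compare the writer's reported name with the canonical name.
            System.out.println(name + ": getEncoding() = " + writerEncoding(name)
                               + ", canonical = " + Charset.forName(name).name());
        }
    }
}
```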

You may not be aware of all the character encodings supported by your Java virtual machine. However, you can use the Charset class to find out. Listing 10-27 presents a DumpEncodings application that shows you how to accomplish this task.

Listing 10-27. Dumping the default and all supported character encodings to standard output

import java.nio.charset.Charset;

import java.util.Iterator;
import java.util.Set;
import java.util.SortedMap;

public class DumpEncodings {
   public static void main(String[] args) {
      System.out.println("Default file encoding = " +
                         System.getProperty("file.encoding"));
      SortedMap<String, Charset> map = Charset.availableCharsets();
      Set<String> keys = map.keySet();
      System.out.println("==============================================");
      System.out.printf("%-20s %-20s %-5s%n", "Canonical name",
                        "Display name", "Encode?");
      System.out.println("==============================================");
      Iterator<String> iter = keys.iterator();
      // Visit each canonical charset name in sorted order.
      while (iter.hasNext()) {
         String canonicalName = iter.next();
         Charset charset = map.get(canonicalName);
         String displayName = charset.displayName();
         boolean canEncode = charset.canEncode();
         System.out.printf("%-20s %-20s %-5b%n", canonicalName, displayName,
                           canEncode);
         Set<String> aliases = charset.aliases();
         Iterator<String> iter2 = aliases.iterator();
         System.out.println("ALIASES");
         // Output each alias on its own line.
         while (iter2.hasNext())
            System.out.println("- " + iter2.next());
         System.out.println("----------------------------------------------");
      }
   }
}

After outputting file.encoding's value, main() obtains a sorted map from canonical (standard) charset names to Charset objects by calling Charset's static SortedMap<String,Charset> availableCharsets() method.

NOTE: An instance of a concrete Charset subclass is an implementation of a character encoding and is often referred to as a charset. In addition to providing methods that return useful information about the charset, the instance provides methods to obtain an encoder and a decoder associated with the charset.
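As a rough sketch of what the preceding note describes, the following code obtains an encoder and a decoder from a Charset and uses them directly, without any stream classes. CodecDemo and roundTrip() are hypothetical names, and ISO-8859-2 is assumed to be available.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;

class CodecDemo {
    // Encodes text to bytes and decodes those bytes back to text using the
    // encoder and decoder supplied by the named charset.
    static String roundTrip(String text, String charsetName)
            throws CharacterCodingException {
        Charset cs = Charset.forName(charsetName);
        CharsetEncoder encoder = cs.newEncoder();
        CharsetDecoder decoder = cs.newDecoder();
        // encode() throws CharacterCodingException when a character cannot
        // be represented in this charset.
        ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
        return decoder.decode(bytes).toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        String text = "Dzi\u0119kuj\u0119";
        System.out.println(text.equals(roundTrip(text, "ISO-8859-2")));
    }
}
```

Unlike OutputStreamWriter, which substitutes a replacement for characters that it cannot encode, a freshly created encoder reports such characters by throwing an exception.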

main() next calls the sorted map's keySet() method to return a set of canonical charset name keys. After calling this set's iterator() method to return an Iterator instance for looping over the set of names, main() performs this iteration.

For each iteration, main() uses the returned canonical name to obtain its associated Charset object from the map. It then calls Charset's String displayName() method on this object to return this charset's human-readable name for the default locale.

NOTE: The intent of displayName() is to provide a localized version of the character encoding name for display to the user. The default (Charset) implementation of this method returns the nonlocalized canonical name, which is also returned from Charset's String name() method.
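The relationship between displayName() and name() can be checked with a few lines. This sketch is my own (DisplayNameDemo and sameAsCanonical() are hypothetical names) and assumes a charset that relies on Charset's default displayName() implementation, as I believe the JDK's built-in UTF-8 charset does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class DisplayNameDemo {
    // Returns true when the charset's display names (default locale and
    // an explicit locale) are the same as its canonical name.
    static boolean sameAsCanonical(Charset cs) {
        return cs.displayName().equals(cs.name())
            && cs.displayName(Locale.ROOT).equals(cs.name());
    }

    public static void main(String[] args) {
        Charset utf8 = StandardCharsets.UTF_8;
        System.out.println(utf8.name() + " / " + utf8.displayName());
        System.out.println(sameAsCanonical(utf8));
    }
}
```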

After obtaining the display name, main() calls Charset's boolean canEncode() method to find out if this charset supports encoding. Most charsets support encoding, so this method usually returns true. However, it returns false for auto-detect charsets.

NOTE: An auto-detect charset is a charset whose decoder can determine which of several possible encoding schemes is in use by examining the input byte sequence. Such a charset does not support encoding because there is no way to determine which encoding should be used on output.
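To find out which charsets on a given virtual machine do not support encoding, you could filter the available charsets by canEncode(). The following sketch is an illustration of mine (DecodeOnlyCharsets and decodeOnly() are hypothetical names). The result depends on which charsets are installed and may be empty.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

class DecodeOnlyCharsets {
    // Collects the canonical names of all available charsets whose
    // canEncode() method returns false.
    static List<String> decodeOnly() {
        List<String> names = new ArrayList<>();
        for (Charset cs : Charset.availableCharsets().values())
            if (!cs.canEncode())
                names.add(cs.name());
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Charsets that cannot encode: " + decodeOnly());
    }
}
```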

After outputting canEncode()'s value, main() calls Charset's Set<String> aliases() method to return a nonnull (but possibly empty) set of strings that serve as aliases for the canonical name. It then iterates over this set, outputting each alias.
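The explicit iterators in DumpEncodings can also be replaced by enhanced for loops. The following compact variant is only a sketch of that alternative, not the book's listing; ListCharsets and aliasesByName() are hypothetical names.

```java
import java.nio.charset.Charset;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class ListCharsets {
    // Maps each canonical charset name to its (possibly empty) alias set.
    static Map<String, Set<String>> aliasesByName() {
        Map<String, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, Charset> entry : Charset.availableCharsets().entrySet())
            result.put(entry.getKey(), entry.getValue().aliases());
        return result;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Set<String>> entry : aliasesByName().entrySet()) {
            System.out.println(entry.getKey());
            for (String alias : entry.getValue())
                System.out.println("- " + alias);
        }
    }
}
```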

When I run this application on my Windows XP platform, it generates the following output (which I have abbreviated):

Default file encoding = Cp1252
Canonical name       Display name         Encode?
Big5                 Big5                 true
ALIASES
- csBig5
Big5-HKSCS           Big5-HKSCS           true
ALIASES
- big5-hkscs:unicode3.0
- Big5_HKSCS
- big5-hkscs
- big5hkscs
- big5hk
EUC-JP               EUC-JP               true
ALIASES
- eucjis
- Extended_UNIX_Code_Packed_Format_for_Japanese
- eucjp
- csEUCPkdFmtjapanese

NOTE: You can pass a charset's canonical name or alias to the aforementioned OutputStreamWriter or InputStreamReader constructors that present charsetName parameters.
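The following sketch checks that claim for one charset: it passes every alias of a charset to the InputStreamReader constructor and confirms that the alias resolves to the same Charset. It is my own illustration (AliasDemo and allAliasesAccepted() are hypothetical names) and assumes that ISO-8859-2 is available.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

class AliasDemo {
    // Returns true when every alias of the named charset is accepted by
    // InputStreamReader and resolves to that same charset.
    static boolean allAliasesAccepted(String canonicalName) throws IOException {
        Charset cs = Charset.forName(canonicalName);
        for (String alias : cs.aliases()) {
            // The constructor throws UnsupportedEncodingException (an
            // IOException) if it does not recognize the alias.
            try (InputStreamReader isr = new InputStreamReader(
                     new ByteArrayInputStream(new byte[0]), alias)) {
                if (!Charset.forName(alias).equals(cs))
                    return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Charset.forName("ISO-8859-2").aliases());
        System.out.println(allAliasesAccepted("ISO-8859-2"));
    }
}
```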
