Introduction to Java I/O

 

  One of modern mankind's most intriguing challenges is to easily explain Java I/O to beginners!


That's because Java I/O is based on an abstract concept - the concept of streams. You can easily use and apply Java I/O facilities.  But you cannot understand Java I/O without understanding streams.

 

  To start, let us imagine a giant pipe, open at either end.  Just sitting there. A big one. As big, say, as a piece of the Alaska Pipeline! Now, since our piece of pipe is completely open at either end (which of course the actual Alaska Pipeline is not), you probably think it would be hard to have the pipe filled with anything.  But that's where the Alaska Pipeline example comes in handy again.  Our pipe is indeed filled - even though it is open at both ends.  Let's say it is filled with cold, sludgy, semi-solidified crude oil!  Our oil is like sludge.  It won't flow by itself.  It just stays stationary inside the pipe even though the two ends of the pipe are open.  Nothing comes out.

 

  To make the oil actually move inside the pipe, you would have to do one of two things.  One is to apply a push or pump at one end.  This will push more oil in.  It will eject oil from the opposite end.  Another approach is to pull some out from the other end, again with a pump. 

 

  If you pump oil in one end, oil is going to come out the other end.  So you'd need a place to put the oil that comes out, right?  You might dump it into a vat, a tank, or a sink (although not an actual kitchen sink for our crude oil!). But let's call it a sink anyway. A sink holds what comes out. 

 

 If you used the approach to pump or pull oil out one end, you'd need a vat or tank or source at the other end to supply additional oil to get sucked in, right? So a source supplies the pipe. A sink receives what comes out.

 

By Jove you've got it!

 

  If you remember the big open pipe filled with non-moving oil, and the need for a pump to make the oil move, you can easily understand Java I/O streams.  Here's how the pipe analogy matches up to Java I/O streams:

 

            Streams and Data:

·         A Java I/O stream is like your pipe.  It doesn't do anything without some help from a pump. Nothing actually moves in it by itself.

·         Data is your sludgy crude oil.  Data fills the pipe.

 

            For Output:

·         An output method, like write( ), is a pump. It pushes data into the pipe or stream.

·         A source is where the data comes from.

·         A sink is at the end toward which the pushing pump pushes the data.  The written data comes out into the sink. When you are writing data, a file in the file system is a good example of a sink.

 

            For Input:

·         An input method, like read( ), is a pump which sucks data out of the pipe.

·         A source is the end of the pipe from where the sucking pump sucks up more data.  In other words, for input, the source is the origin end.  Again, a file in the file system can also be a good example of a source.

 

  There's one last twist.  Imagine that our pipes can only accept certain kinds of pumps.  Let's say some kinds of pipes accept only pushing pumps. Never any pulling or sucking pumps.  And let's say another kind of pipe accepts only pulling or sucking pumps.  Never any pushing pumps. 

 

For pipes accepting only pushing pumps, which push oil out one end, you'd always need a sink there to catch it.  The sink would hold what got pushed out.  A sink is associated with pushing output, or writing.

 

For pipes accepting only sucking or pulling pumps, you'd need a source to supply more oil at the beginning end.  A source is associated with pulling in, or reading.

 

  Switching back to Java, there is a direct analogy:

·         Writer and OutputStream stream classes only accept pushing pumps in the form of write(..) methods. If you use them, you have to think about providing a sink to hold what gets pushed out.  write( ) methods are the pushing pumps themselves.

·         Reader and InputStream stream classes only accept sucking or pulling pumps in the form of read(..) methods.  If you use these, you have to think about providing a source to supply what gets pulled in.  read( ) methods are the pulling pumps themselves. 
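Here is a minimal sketch of the two kinds of pumps at work. It is only an illustration: the class name PumpDemo and its helper methods are made up for this example, and it uses in-memory StringWriter and StringReader streams as stand-ins for a file so that it can run anywhere.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical demo class; the names are assumptions made for illustration.
class PumpDemo {
    // Pushes each character of text into the Writer with write(..), the "pushing pump".
    static void pushAll(Writer sinkStream, String text) throws IOException {
        for (int i = 0; i < text.length(); i++) {
            sinkStream.write(text.charAt(i));
        }
    }

    // Pulls characters out of the Reader with read(), the "pulling pump", until it returns -1.
    static String pullAll(Reader sourceStream) throws IOException {
        StringBuilder collected = new StringBuilder();
        int c;
        while ((c = sourceStream.read()) != -1) {
            collected.append((char) c);
        }
        return collected.toString();
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();        // in-memory sink, standing in for a file
        pushAll(out, "crude oil");
        StringReader in = new StringReader(out.toString()); // in-memory source
        System.out.println(pullAll(in));              // prints "crude oil"
    }
}
```

Notice that a Writer offers no read( ) method and a Reader offers no write( ) method. Each kind of pipe accepts only its own kind of pump.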

 

  It's time for a very simple example.  Let's use Java I/O instead of crude oil. 

Suppose you have a file from which you wish to extract data.  Your thoughts would turn to pulling pumps, or read( )  methods, right?  Your source would be the file itself.  Your pipe or stream would be a stream that only accepts pulling pumps. (We'll select the FileReader stream class as our input stream class here - but we'll cover I/O classes themselves shortly.)

The pump for your FileReader stream would be some read( ) method. Your sink would be where the data would go at the other end after it transited the stream. You have two choices there. That data could go into something in your code (like a variable or an array element) or, as we'll see shortly below, it can go directly into another pipe to which you have connected.

 

  In some Java code we will now set up a stream to read a series of characters from a file.  Each call to read( ) returns one character, delivered as an int.  When the returned int equals minus one, signifying the end of the source file, we will quit. Study the statement comments. They identify your stream, source, sink, and pump:

 

 import java.io.*;                        // needed for the java.io stream classes

  . .

 FileReader fr;                           // the FileReader object named  fr  is your * stream *
 fr = new FileReader( "myfile" );         // the file named myfile is your * source * - where data is coming from
 int x;                                   // the int named x is your * sink * - where the data is going
 while ( true ) {
     x = fr.read( );                      // the input method named read( ) is your * pump *
     if ( x == -1 ) break;                // -1 means the end of the file has been reached
     // here you would do something with the contents of x, char by char, as it comes out
 }
 fr.close( );                             // release the file when you are done with the stream
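A complete, compilable version of that loop might look like the sketch below. Treat it as one possible arrangement rather than the only one: the class name ReadLoopDemo is made up, the sample file is a temporary file created just so the example can run by itself, and try-with-resources is used so that the stream gets closed even if read( ) throws an IOException.

```java
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the names are assumptions made for illustration.
class ReadLoopDemo {
    // Reads every character of the named file, one read() call per character.
    static String readAllChars(String fileName) throws IOException {
        StringBuilder contents = new StringBuilder();
        try (FileReader fr = new FileReader(fileName)) { // the stream; the file is the source
            int x;                                       // the sink for each value pumped out
            while ((x = fr.read()) != -1) {              // read() is the pump; -1 means end of file
                contents.append((char) x);
            }
        }                                                // fr is closed automatically here
        return contents.toString();
    }

    public static void main(String[] args) throws IOException {
        // Sample file created only so that this sketch is self-contained.
        Path sample = Files.createTempFile("myfile", ".txt");
        Files.write(sample, "pipe".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readAllChars(sample.toString())); // prints "pipe"
        Files.delete(sample);
    }
}
```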

 

  Now it's time to talk about connecting pipes and streams.  Have you ever seen those flanges which allow actual pipes of the same diameter to be connected to each other?  Like pipes with these matching connecting flanges, Java I/O streams are built to be connected to each other. 

When you connect two plumbing pipes, their contents easily and automatically run from one pipe into the other.  Similarly, when you connect two Java I/O streams, data easily and automatically runs from one stream into the other. 

 

  If you make such a connection with pipes, what would normally be a sink situation at the (output) end of a pipe becomes something else.  It matches up to the source (input) end of the other pipe to which it is connected.  This makes sense, because what goes out one pipe goes directly into the next one.

 

Now, if you do this, you would not need two pumps any more! One pump will always do! Located in one of the pipes, it will push or pull contents through both connected pipes.

 

  Let's say there was a plumbers' rule that said, if you connected real pipes like this, the single pump must always go on the left hand pipe.  Well, Java actually has a rule like this. When you connect streams in Java I/O, the rule is that  you always put the pump on the stream at the far left-hand end (That is, at the left hand end of your combined constructor statement). 

 

  And remember we said your methods were your pumps?  Well this means your program will use methods from the stream class located at the farthest left of all the pipes you connect together.  That method will push or pull data through all the associated pipes automatically.  We'll see examples of this shortly.

 

  There's more.  Java pipes (streams!), being software themselves, can be intelligent.  They can do things normal plumbing pipes can't. 

When something goes into a normal plumbing pipe, it's not modified along the way.  It comes out as the same "stuff" which went in.  In Java that's not always true.

 

Some (but not all!) Java streams can modify data as it passes through. They still use the same pumping and connection rules which we learned about.  But these streams do something extra to data as it goes through them.  Here are a few simple examples:

 

·         InputStreamReader automatically converts incoming bytes into Java's Unicode characters, using a character encoding (charset) such as ASCII or UTF-8.

·         PushbackInputStream can be told to push back and re-offer to the read(..) method the last incoming byte(s) which it previously took in from its source.

·         LineNumberReader can keep track of how many lines of input are read.

·         BufferedWriter and BufferedOutputStream automatically hold up the outbound flow until they have a buffer's-worth of data to write out to their sinks.

 

Stream classes like this are sometimes called "decorator" classes because they decorate or enhance their data.  You will meet all the decorator classes shortly.
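Two of these "intelligent" pipes can be watched in action with a small sketch like the one below. The class name DecoratorDemo and its helper methods are made up for this illustration, and in-memory streams stand in for files. The first helper lets LineNumberReader do the counting; the second shows that a small write to a BufferedWriter does not reach the sink until flush( ) is called.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.StringWriter;

// Hypothetical demo class; the names are assumptions made for illustration.
class DecoratorDemo {
    // Counts lines by letting LineNumberReader track them as the text is pulled through.
    static int countLines(String text) throws IOException {
        try (LineNumberReader lnr = new LineNumberReader(new StringReader(text))) {
            while (lnr.readLine() != null) {
                // nothing to do here; the stream itself keeps the count
            }
            return lnr.getLineNumber();
        }
    }

    // Returns what has reached the sink before flush() and after flush().
    static String[] sinkBeforeAndAfterFlush(String text) throws IOException {
        StringWriter sink = new StringWriter();
        BufferedWriter bw = new BufferedWriter(sink);
        bw.write(text);                  // held in the buffer while the text is smaller than it
        String before = sink.toString();
        bw.flush();                      // forces the buffered characters out to the sink
        String after = sink.toString();
        bw.close();
        return new String[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLines("one\ntwo\nthree\n"));   // prints 3
        String[] result = sinkBeforeAndAfterFlush("hello");
        System.out.println("[" + result[0] + "] then [" + result[1] + "]"); // prints "[] then [hello]"
    }
}
```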

 

  One last point involves the difference between characters and bytes.  Java has entire groups of stream classes dedicated to each.  In the right circumstances, there are advantages to using files with each type of format.  For instance, byte files cannot easily be changed for internationalization, so Unicode characters might be better. On the other hand, Unicode character classes can't depict numeric fields efficiently and compactly, so bytes might be better for numbers used in calculations.
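To make the compactness point concrete, here is a small sketch that writes the same int twice, once as binary bytes and once as text, and compares the sizes. The class name SizeDemo and its methods are made up for this illustration, and the choice of UTF-8 for the text form is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are assumptions made for illustration.
class SizeDemo {
    // Number of bytes used when the int is written in binary form.
    static int binarySize(int value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(sink)) {
            dos.writeInt(value);                 // an int always takes 4 bytes
        }
        return sink.size();
    }

    // Number of bytes used when the int is written as decimal text in UTF-8.
    static int textSize(int value) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            w.write(Integer.toString(value));    // one byte per digit (and sign) in UTF-8
        }
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(binarySize(1234567890)); // prints 4
        System.out.println(textSize(1234567890));   // prints 10
    }
}
```

The text form has the advantage that a human can open the file and read it. The binary form is fixed-size and needs no parsing before a calculation.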

 

Java's character and byte classes are like pipes of different diameters.  Meaning they don't match up and you cannot connect them together.  (Except in one instance.  The bridge classes InputStreamReader and OutputStreamWriter convert bytes to characters and vice versa.  You'll meet them shortly.)
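As a rough sketch of how a byte pipe and a character pipe can be joined, the example below uses InputStreamReader and OutputStreamWriter from java.io as the adapters. The class name BridgeDemo and its methods are made up for this illustration, the byte arrays stand in for files, and UTF-8 is simply the encoding chosen here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are assumptions made for illustration.
class BridgeDemo {
    // Characters go in on the left; encoded bytes land in the byte sink on the right.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream byteSink = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(byteSink, StandardCharsets.UTF_8)) {
            w.write(text);
        }
        return byteSink.toByteArray();
    }

    // Bytes come from the source on the right; decoded characters come out on the left.
    static String decode(byte[] bytes) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            int c;
            while ((c = r.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = encode("caf\u00e9");       // four characters, the last one accented
        System.out.println(bytes.length);         // prints 5: the accented letter needs 2 bytes
        System.out.println(decode(bytes).length()); // prints 4
    }
}
```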

 

The two types of classes are Reader and Writer classes for handling characters, and InputStream and OutputStream classes for handling bytes.  The basic differentiation is that, if humans are going to be reading the data, you should use Reader and Writer classes.  If it's just binary data fields being exchanged between programs, you should use InputStream and OutputStream classes.

 

  One final point is about how you connect the Java streams together when you wish to do so.  You add new pipes to the left, in a Java combined constructor code statement which creates the streams.  And remember, you will use the methods of the stream that ends up located on the far left hand end.

 

 Turning on a pump means using the stream's methods in your Java code. Connecting a pipe to another pipe on its left means not using the right-hand pipe's methods any more. You use the methods from the new pipe - the one on the left.  In Java, this connecting, where the connected pipe (now on the right) just streams its data into the new pipe (on the left) is called "stream chaining." 

 

For instance, if you were to connect a BufferedReader stream to a FileReader stream, here's how the combined constructor statement would look in your Java code.  BufferedReader is the class on the "left" and FileReader is the class on the "right":

 BufferedReader br = new BufferedReader( new FileReader( "myfile" ) );

 

In this example you would then use the methods from the class on the left - which is BufferedReader.  Data would enter the FileReader stream from its source on the far right.  That source is the file named myfile. The data would automatically exit the FileReader stream to the left, streaming directly into the BufferedReader stream which is connected there.  (We'll see this example explained more fully in a constructor statement below!)

 

  So that's it! You are now ready to put it all together!  Let's begin with a short example and a little chart. 


The chart shows several things: two character stream classes for reading characters, the sources and sinks for those streams, plus what kind of pumps (if any) they provide.  It also tells what other streams they can connect to.  Study the chart.  The little arrows indicate where stream chaining can occur, if you visualize a combined constructor statement.

 

Remember, this is for INPUT, so the data will “move” across the Java constructor statement from right to left. 

 

| Sends to what sink? | CHARACTER STREAM INPUT CLASS | Gets from what source? | Primary Function of the Class | Does it provide any unique methods beyond those in Reader? |
| --- | --- | --- | --- | --- |
| CHAINABLE SINK: If chained to a sink stream, it provides a Reader stream of the characters from the file. If not chained, its read(..) methods can be used to simply return the characters. | FileReader | SOURCE NOT CHAINABLE: Its source is always a file from the file system. | Reads a file in the file system. | Use the 9 methods from Reader. Adds nothing. |
| CHAINABLE SINK: If chained to a sink stream, it provides a Reader stream of buffered characters to that stream. If not chained, its methods can be used to return the characters. You may also want to use its readLine( ) method. | BufferedReader | CHAINABLE SOURCE: Its source can be any Reader stream. | Provides buffered reading. It is usually chained to a FileReader as its source stream. | It adds the handy readLine( ) method. readLine( ) reads until encountering a \n, \r, or \r\n and then stops, returning the whole line in a String. |

 

Let's use this sample chart to do some stream chaining.  Let's say we wish to accomplish two things. Namely to (1) read from a character file named "myfile", and (2) buffer those reads for efficiency.

 

The chart tells us, in its Primary Function column, that FileReader "Reads a file in the file system."  So that satisfies our first need - to read a file.

 

The next row in the table tells us that BufferedReader "Provides buffered reading." That satisfies our second need - the buffering.  We can now go ahead and chain the two stream classes together.

 

FileReader's methods are not going to be used because it is going to be connected to another stream on its left.  BufferedReader will go on FileReader's left in the combined constructor statement. 

 

The table's first column says FileReader puts "A Reader stream" out to a sink.  And BufferedReader seems to accept the same thing - "A Reader stream" - as its source. Voila!  All we have to do is put a FileReader stream on the right, and connect it to a BufferedReader on its left, and then use the methods from BufferedReader on the left.  The resultant stream chaining (or connecting of the pipes!) will look like this.  FileReader is now chained directly to BufferedReader:

 

values returned  <---  use BufferedReader's read(..) methods  <---  BufferedReader <---  FileReader <---  "myfile" source

 

The combined constructor statement in Java mirrors that right-to-left sequence:

BufferedReader br = new BufferedReader( new FileReader( "myfile" ) );
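Put to work, that chained pair might be used as in the sketch below. The class name ChainDemo and the method readLines are made up for this illustration, and the sample file is a temporary file created only so the example can run by itself. All the pumping is done through br, the stream on the left.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are assumptions made for illustration.
class ChainDemo {
    // Pulls the file through FileReader and BufferedReader, one line per readLine() call.
    static List<String> readLines(String fileName) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = br.readLine()) != null) {   // readLine() returns null at end of file
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Sample file created only so that this sketch is self-contained.
        Path sample = Files.createTempFile("myfile", ".txt");
        Files.write(sample, "first\nsecond\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readLines(sample.toString())); // prints [first, second]
        Files.delete(sample);
    }
}
```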

 

Let's do another one to see how to put together the syntax of this layering code for stream chaining.

 

[Figure: layering and chaining]

 

  Using the combined constructor statement as a sort of a pattern here, let's recap:

 

·         The leftmost layered constructor provides the handle and the methods.

 

·         The rightmost (innermost) constructor provides the base functionality (usually reading a file itself).

 

·         Each new stream's constructor is added to the left as it adds its own functionality to the streamed data.

 

·         Closing the leftmost stream class closes them all.
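The last point in that recap can be checked with a small experiment. In the sketch below, TrackedReader is a made-up subclass of StringReader that simply records whether close( ) ever reached it; the class name CloseDemo is likewise an assumption for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical demo class; the names are assumptions made for illustration.
class CloseDemo {
    // An innermost stream that records whether close() reached it.
    static class TrackedReader extends StringReader {
        boolean closed = false;

        TrackedReader(String s) {
            super(s);
        }

        @Override
        public void close() {
            closed = true;
            super.close();
        }
    }

    // Closes only the leftmost (outer) stream and reports whether the inner one was closed too.
    static boolean innerClosedByOuter() throws IOException {
        TrackedReader inner = new TrackedReader("data");
        BufferedReader outer = new BufferedReader(inner);
        outer.close();
        return inner.closed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(innerClosedByOuter()); // prints true
    }
}
```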

 

  In the next example let's say we want to do three things: (1) to read a file containing consecutive 4-byte int primitives, (2) to buffer the read operations, and (3) to have the different primitives (which are all run together in the actual file) returned to us individually from separate read(..) method calls.  That's three steps.  It is going to require chaining three streams together.  Here's a sample table portion showing the three streams.

 

| Sends its output to what sink? | BYTE INPUTSTREAM CLASSES | Gets its input from what source? | Primary Function | Does it provide any unique methods beyond those in InputStream? |
| --- | --- | --- | --- | --- |
| If connected - sends to an InputStream of bytes. If not connected - sends to a variable in the code. | FileInputStream | From a byte file in the file system | Reads raw bytes from a file into a byte stream | None needed here - just the read(..) methods from InputStream. |
| If connected - sends to an InputStream of bytes. If not - to a variable in the code. | BufferedInputStream | From any byte InputStream | Holds and buffers the stream's bytes | None - just the methods from InputStream. |
| If connected - sends to an InputStream of bytes. If not - to a variable in the code. | DataInputStream | From any byte InputStream | Converts incoming bytes into Java primitives | readInt( ), readDouble( ), readFloat( ) and similar methods, each of which returns one primitive per call. |

 

  Since the first thing we want to do is read primitive data types and buffer them, it means we'll need two levels of streaming to start.  It will be just like the ones we just did.   We'll need a FileInputStream (on the right) to actually read the file data and a BufferedInputStream stream, connected or chained or layered to its left, to get that data buffered.

 

i.e. Just as before:

BufferedInputStream bis = new BufferedInputStream( new FileInputStream( "myfile" ) );

 

  Next we want to see nice int primitives returned to us individually. BufferedInputStream can't do that.  But the DataInputStream class has various read(..) methods which can handily return primitives one by one.  They offer up each int or double or float from the file, one at a time. So let's chain DataInputStream onto the left of the BufferedInputStream stream.  Then we'll use the readInt( ) method of DataInputStream on the handle dis. i.e.

 

DataInputStream dis = new DataInputStream( new BufferedInputStream( new FileInputStream( "myfile" ) ) );
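Here is a sketch of that three-stream chain in use. The class name IntFileDemo and its methods are made up for this illustration, the sample file is a temporary file, and the matching output chain (DataOutputStream, BufferedOutputStream, FileOutputStream) is used only to create some data to read back. One thing to keep in mind: readInt( ) cannot use -1 to signal the end of the file, since -1 is a perfectly good int. It throws an EOFException instead, so this sketch reads a known count of values.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical demo class; the names are assumptions made for illustration.
class IntFileDemo {
    // Writes the ints as consecutive 4-byte values through a chained output stream.
    static void writeInts(String fileName, int[] values) throws IOException {
        try (DataOutputStream dos = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(fileName)))) {
            for (int v : values) {
                dos.writeInt(v);
            }
        }
    }

    // Reads back 'count' ints, one per readInt() call, through the chained input stream.
    static int[] readInts(String fileName, int count) throws IOException {
        int[] values = new int[count];
        try (DataInputStream dis = new DataInputStream(
                new BufferedInputStream(new FileInputStream(fileName)))) {
            for (int i = 0; i < count; i++) {
                values[i] = dis.readInt();
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Sample file created only so that this sketch is self-contained.
        Path sample = Files.createTempFile("myfile", ".bin");
        writeInts(sample.toString(), new int[] { 1, -1, 1000000 });
        System.out.println(Arrays.toString(readInts(sample.toString(), 3))); // prints [1, -1, 1000000]
        Files.delete(sample);
    }
}
```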

 

  If you wanted to add a fourth functionality - say, the additional ability to peek at and push back individual bytes - then it would all look like this:

 

PushbackInputStream pbis = new PushbackInputStream( new DataInputStream( new BufferedInputStream( new FileInputStream( "myfile" ))));

 

You would then use PushbackInputStream's methods on the handle pbis.
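What does "peek at and push back" look like in code? A minimal sketch follows. The class name PeekDemo and its helper methods are made up for this illustration, and a ByteArrayInputStream stands in for the chained file streams. The peek is built from two real PushbackInputStream calls: read( ) to take a byte and unread(..) to put it back.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

// Hypothetical demo class; the names are assumptions made for illustration.
class PeekDemo {
    // Looks at the next byte without consuming it; returns -1 at the end of the stream.
    static int peek(PushbackInputStream in) throws IOException {
        int b = in.read();
        if (b != -1) {
            in.unread(b);        // push the byte back so the next read() sees it again
        }
        return b;
    }

    // Returns { peeked byte, first byte read, second byte read } for the given data.
    static int[] peekThenRead(byte[] data) throws IOException {
        try (PushbackInputStream pbis =
                new PushbackInputStream(new ByteArrayInputStream(data))) {
            int peeked = peek(pbis);
            int first = pbis.read();
            int second = pbis.read();
            return new int[] { peeked, first, second };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] r = peekThenRead(new byte[] { 10, 20 });
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "10 10 20"
    }
}
```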

 

  That long combined constructor statement above could also appear this way:

 

PushbackInputStream pbis =

               new PushbackInputStream (

                  new DataInputStream (

                     new BufferedInputStream (

                        new FileInputStream ( "myfile" )

                        )

                     )

                  );

 

 What we are doing in that long statement above is the same as this:

 

FileInputStream fis = new FileInputStream( "myfile" );

BufferedInputStream bis = new BufferedInputStream( fis );

DataInputStream dis = new DataInputStream( bis );

PushbackInputStream pbis = new PushbackInputStream( dis );

 

The combined constructor approach just explained is the basis for all the handy tables on this site.  You should now be ready to use them!