Bill Roper's Journal
We May See Murder Yet 
7th-Jan-2016 06:25 pm
Yesterday's intractable problem at work certainly appears to be a Java bug.

Take, for example, the case where you have wrapped an InputStream inside an InflaterInputStream -- in this case, an InputStream that is actually an OracleBlobInputStream, although it fails equally well should you be connecting to SQL Server. This should, in theory, allow you to read a compressed BLOB out of a database table. And as long as you are calling the read() method with no arguments and reading the stream one byte at a time, it does.

Reading the stream one byte at a time is not the most efficient way of handling it, though, especially when you know that the next thing on the stream is a byte array of a particular length. In that case, you would like to use the read( b, off, len ) method on the stream so that you can read multiple bytes from the stream in one operation. In fact, you should be able to set the offset to 0 and the length to the length of the array and do this in one beautiful call.

Most of the time.

There's something that's busted inside the native code implementation of Inflater.inflateBytes so that it can't manage to make this work all of the time. I strongly suspect that it hits a boundary between blocks that the Deflater wrote out and just stops in mid-read so that a call that should return, say, 53 bytes of data only returns 10 bytes. The rest of the data is there, but your read( b, off, len ) method simply isn't going to let you have it.

I apparently ran into this problem in June of 2014 and coded around it at the time by reading the byte array one byte at a time. But then I forgot about it and wrote some more code that gleefully tries to read the byte array in one operation.

And it works.

Most of the time.

Sadly, most of the time is not nearly good enough.

*mutter* *mutter*
8th-Jan-2016 12:40 am (UTC)
The most painful bugs are the ones that work most of the time.
8th-Jan-2016 01:03 am (UTC)
Looking really carefully at the documentation, it appears that this perverse behavior is intentional. Apparently, I should be writing my own loop to keep re-reading until I get the full length of the array.

The default behavior of InputStream does what I want, but no derived implementation is guaranteed to behave that way.
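
For what it's worth, the loop described above might look something like the sketch below. The class name ReadFullyExample and the helper readFully are made up for illustration, and the round trip through DeflaterOutputStream and a ByteArrayInputStream is only a stand-in for the real compressed BLOB, which isn't available here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class ReadFullyExample {

    /**
     * Hypothetical helper: keeps calling read(b, off, len) until exactly
     * len bytes have arrived, or throws if the stream ends first.
     */
    static void readFully(InputStream in, byte[] b, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            // read() may legitimately hand back fewer bytes than requested.
            int n = in.read(b, off + total, len - total);
            if (n < 0) {
                throw new EOFException("stream ended after " + total + " of " + len + " bytes");
            }
            total += n;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data: 53 bytes, compressed in memory instead of read from a BLOB.
        byte[] original = new byte[53];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) i;
        }
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(compressed)) {
            out.write(original);
        }

        byte[] restored = new byte[original.length];
        try (InputStream in = new InflaterInputStream(
                new ByteArrayInputStream(compressed.toByteArray()))) {
            readFully(in, restored, 0, restored.length);
        }
        System.out.println(Arrays.equals(original, restored));
    }
}
```

The JDK already ships something equivalent: wrapping the stream in a java.io.DataInputStream and calling its readFully( b, off, len ) method does the same looping, so a hand-rolled helper is only needed if you'd rather not add the wrapper.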
8th-Jan-2016 11:04 am (UTC)
Is it one of those cases where the InputStream only returns some of the bytes on the first try and tells you how many it returned? That's a legitimate behavior for an InputStream, but one that's easy to overlook.
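
A small sketch of that behavior, for anyone who hasn't been bitten by it yet. ChunkedInputStream is a made-up class that never hands out more than a fixed number of bytes per call; it only imitates the symptom from the post (ask for 53, get 10) and makes no claim about what InflaterInputStream does internally.

```java
import java.io.IOException;
import java.io.InputStream;

public class ShortReadDemo {

    /** Hypothetical stream that returns at most `chunk` bytes per bulk read. */
    static class ChunkedInputStream extends InputStream {
        private final byte[] data;
        private final int chunk;
        private int pos = 0;

        ChunkedInputStream(byte[] data, int chunk) {
            this.data = data;
            this.chunk = chunk;
        }

        @Override
        public int read() {
            // Single-byte reads always work until the data runs out.
            return pos < data.length ? (data[pos++] & 0xFF) : -1;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) {
                return 0;
            }
            if (pos >= data.length) {
                return -1;
            }
            // Legal under the InputStream contract: return fewer bytes than
            // requested and report the count to the caller.
            int n = Math.min(Math.min(len, chunk), data.length - pos);
            System.arraycopy(data, pos, b, off, n);
            pos += n;
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ChunkedInputStream(new byte[53], 10);
        byte[] buf = new byte[53];
        int n = in.read(buf, 0, buf.length);
        // prints "asked for 53, got 10"
        System.out.println("asked for " + buf.length + ", got " + n);
    }
}
```

The return value is the only signal that the read came up short, which is why code that ignores it works right up until the stream decides to split the data.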
8th-Jan-2016 04:36 pm (UTC)
Given the documentation, I would have to agree that it's easy to overlook. It's made worse by the fact that the default behavior for an InputStream is to return all the bytes that you asked for if they're available.

I'm sure that not specifying the default behavior as required makes things easier for people developing variant InputStream implementations, but it's not clear to me that it improves life for someone writing code that uses an InputStream.
8th-Jan-2016 01:56 am (UTC)
You've stumbled on my dad's biggest gripe about Object Oriented systems (or a big part of it). OO results in using lots of libraries that you really don't have any control over, or any real understanding of what they do.