June 29th, 2014

Hashing It Out

Thinking some more about yesterday's Java problem:

In almost every case, for an "ID"-type class, two different instances that contain the same identifiers should return true when compared with equals() and should return identical values from hashCode(). The problem is that the standard equals() and hashCode() methods inherited from the Object class treat them as different objects. If an ID is supposed to uniquely identify some object (for example, a PersonID that uniquely identifies a Person), only the original instance of the ID will work as a key in a HashMap or the like, because a different instance, although conceptually equal, will return false in an equals() comparison and will almost certainly generate a different hash code.
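To make the failure concrete, here is a small sketch. PlainPersonID and its "value" field are made-up names for illustration; the class deliberately overrides nothing, so it inherits equals() and hashCode() from Object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical ID class (name and field are assumptions for illustration).
// It overrides nothing, so equals() and hashCode() come from Object.
final class PlainPersonID {
    private final String value;

    PlainPersonID(String value) {
        this.value = value;
    }
}

class PlainPersonIdDemo {
    public static void main(String[] args) {
        Map<PlainPersonID, String> people = new HashMap<>();
        PlainPersonID original = new PlainPersonID("p-42");
        people.put(original, "Ada");

        // The very same instance finds the entry.
        System.out.println(people.get(original));                       // Ada

        // A second instance with the same identifier does not.
        PlainPersonID copy = new PlainPersonID("p-42");
        System.out.println(people.get(copy));                           // null
        System.out.println(original.equals(copy));                      // false
    }
}
```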

The solution, then, is to override the equals() and hashCode() methods for the ID classes so that IDs that are conceptually equal will return identical hash code values and will return true in an equals() comparison. This is, in fact, the case by default if the ID is simply a String.
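A minimal sketch of what such an override might look like, assuming the ID wraps a single String. PersonID is the example name from above; the "value" field and the demo class are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical ID class; assumes the identifier is a single String.
final class PersonID {
    private final String value;

    PersonID(String value) {
        this.value = Objects.requireNonNull(value);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof PersonID)) {
            return false;
        }
        // Two IDs are equal when their identifiers are equal.
        return value.equals(((PersonID) other).value);
    }

    @Override
    public int hashCode() {
        // Equal identifiers must produce equal hash codes.
        return value.hashCode();
    }
}

class PersonIdDemo {
    public static void main(String[] args) {
        Map<PersonID, String> people = new HashMap<>();
        people.put(new PersonID("p-42"), "Ada");

        // A different instance with the same identifier now finds the entry.
        System.out.println(people.get(new PersonID("p-42"))); // Ada
    }
}
```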

So let's override the hashCode() and equals() methods for the ID classes. Life is good.

Except now equals() and hashCode() can no longer tell different instances of your ID object apart. And when you're writing a serialized stream that may contain duplicate references to the same object, you want to know whether you've already written an object to the stream, so you can write the object once and restore the duplicate references to it when reading the stream back into memory. Normally, you could use a HashMap with the object as a key to keep track of this, but now two different instances of the ID return the same hash code and return true for equals(), so different instances are interpreted as being the same instance.

It looks like the right answer is to define an interface for all of the serializable objects whose methods, by default, call back to Object's implementations of hashCode() and equals() (let's call the new methods objectHashCode() and objectEquals()), and then to implement an ObjectHashMap that calls objectHashCode() and objectEquals() instead of hashCode() and equals().
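Here is a rough sketch of that idea, not a finished implementation. It assumes Java 8 or later (for default interface methods) and uses System.identityHashCode() and == to get back the behavior Object would have provided. The names ObjectIdentity and StreamID are made up for illustration, and this ObjectHashMap is a bare-bones chained map with a fixed bucket count and only put() and get().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: exposes Object's identity semantics under new names,
// leaving equals() and hashCode() free to be overridden for value equality.
interface ObjectIdentity {
    default int objectHashCode() {
        // The hash code Object.hashCode() would return if it were not overridden.
        return System.identityHashCode(this);
    }

    default boolean objectEquals(Object other) {
        // Reference comparison, which is what Object.equals() does.
        return this == other;
    }
}

// Minimal map keyed on instance identity. Simplifications (assumptions):
// fixed bucket count greater than zero, no resizing, no remove().
final class ObjectHashMap<K extends ObjectIdentity, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();

    ObjectHashMap(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    private List<Entry<K, V>> bucketFor(K key) {
        return buckets.get(Math.floorMod(key.objectHashCode(), buckets.size()));
    }

    // Stores the value and returns the previous one for this instance, or null.
    V put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> entry : bucket) {
            if (entry.key.objectEquals(key)) {
                V old = entry.value;
                entry.value = value;
                return old;
            }
        }
        bucket.add(new Entry<>(key, value));
        return null;
    }

    // Returns the value stored for this exact instance, or null.
    V get(K key) {
        for (Entry<K, V> entry : bucketFor(key)) {
            if (entry.key.objectEquals(key)) {
                return entry.value;
            }
        }
        return null;
    }
}

// Hypothetical ID with value-based equals()/hashCode() that also keeps identity.
final class StreamID implements ObjectIdentity {
    private final String value;

    StreamID(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof StreamID && value.equals(((StreamID) o).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

class ObjectHashMapDemo {
    public static void main(String[] args) {
        StreamID first = new StreamID("p-42");
        StreamID second = new StreamID("p-42");
        ObjectHashMap<StreamID, Integer> written = new ObjectHashMap<>(16);
        written.put(first, 0);

        System.out.println(first.equals(second));  // true
        System.out.println(written.get(first));    // 0
        System.out.println(written.get(second));   // null
    }
}
```

The JDK's java.util.IdentityHashMap applies the same reference-equality idea to keys of any type, so it may be worth comparing against before hand-rolling a map like this.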


This is an annoying amount of work to do...