Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write?  
Author Message
Chris Uppal





PostPosted: 2006-9-15 23:39:00 Top

java-programmer, Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? Patricia Shanahan wrote:

[me:]

> > /Anything/ would be better than that. Throw a runtime exception -- at
> > least that would guarantee that you weren't depending on having a
> > working hashCode() without realising it.
>
> It is a functionally correct hashCode.

Only in the minimal sense that it abides by its contract. It not, however,
useable for anything; therefore it only exists (or should only exist) as a
placeholder for a method which the author assumes will not be needed. But
that's a major risk -- sooner or later someone else is likely[*] to start using
these things in HashTables, and then they get poor performance without knowing
why (or perhaps even without realising that they are getting unecessarily poor
performance). If the assumption is that the method is only a placeholder, then
it would be much safer (as I said) to throw an unchecked exception.

If any programmer /I/ worked with was in the habit of leaving these little
timebombs ticking away in the codebase, then I'd be inclined to throw a Very
Serious Wobbly.

What you think of a programmer who habitually wrote:

void toString() { return ""; }

? Yet that's a damn sight safer (and no less "reasonable") than always
returning zero from hashCode().


> I assume that if Mark found poor hash table performance during
> profiling,

As an aside, profiling is not a good tool for recognising or diagnosing poor
hashing functions. And, although it is a reasonable tool for confirming that a
non-too-special hash function is "good enough", even that is somewhat unsafe
unless you know that your test data is representative of real world data /in
the way it interacts with the hash/.

-- chris

[*] "likely" assuming normal operation of Sod's Law.


 
Chris Uppal





PostPosted: 2006-9-15 23:39:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? Patricia Shanahan wrote:

[me:]

> > /Anything/ would be better than that. Throw a runtime exception -- at
> > least that would guarantee that you weren't depending on having a
> > working hashCode() without realising it.
>
> It is a functionally correct hashCode.

Only in the minimal sense that it abides by its contract. It not, however,
useable for anything; therefore it only exists (or should only exist) as a
placeholder for a method which the author assumes will not be needed. But
that's a major risk -- sooner or later someone else is likely[*] to start using
these things in HashTables, and then they get poor performance without knowing
why (or perhaps even without realising that they are getting unecessarily poor
performance). If the assumption is that the method is only a placeholder, then
it would be much safer (as I said) to throw an unchecked exception.

If any programmer /I/ worked with was in the habit of leaving these little
timebombs ticking away in the codebase, then I'd be inclined to throw a Very
Serious Wobbly.

What you think of a programmer who habitually wrote:

void toString() { return ""; }

? Yet that's a damn sight safer (and no less "reasonable") than always
returning zero from hashCode().


> I assume that if Mark found poor hash table performance during
> profiling,

As an aside, profiling is not a good tool for recognising or diagnosing poor
hashing functions. And, although it is a reasonable tool for confirming that a
non-too-special hash function is "good enough", even that is somewhat unsafe
unless you know that your test data is representative of real world data /in
the way it interacts with the hash/.

-- chris

[*] "likely" assuming normal operation of Sod's Law.


 
Patricia Shanahan





PostPosted: 2006-9-15 23:44:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? Thomas Hawtin wrote:
> Matt Humphrey wrote:
>>
>> My question is, what is a good first implementation of compound hashCode?
>> For objects that have simple fields a, b, c I tend to use
>>
>> a.hashCode ^ b.hashCode ^ c.hashCode
>
> That doesn't work well if the hash codes of a, b and c are often equal,
> a, b and c are often permutations, or whatever.
>
> You could take you lead from String and do something like:
>
> @Override
> public int hashCode() {
> int h = a.hashCode();
> h = h*31 + b.hashCode();
> h = h*31 + c.hashCode();
> return h;
> }

That is what I usually do.

Patricia
 
 
Chris Uppal





PostPosted: 2006-9-16 0:20:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? Matt Humphrey wrote:

> My question is, what is a good first implementation of compound hashCode?
> For objects that have simple fields a, b, c I tend to use
>
> a.hashCode ^ b.hashCode ^ c.hashCode
>
> And for compositions, accumulating hashCodes of elements over ^.

Multiplication isn't such a good idea if any of the elements has a tendency to
be zero. Xor is also not such a good idea unless it is combined with
bit-shifting to mix the bits up (and also to make the hash sensitive to the
order of elements in the collection -- which is normally, but not always, what
you want).

A couple of approaches:

(a * P + b) * P + c (where P is some prime)
((a << S) ^ b) << S) ^ c (where S is some small integer)

Either can be generalised to iterating over collections. But the second should
be modified to catch the bits as they are shifted off the high end and xor-ing
then back in at the low end, otherwise only the last few elements in the
collection will contribute to the hash. For example:

int
hash(int[] data)
{
int hash = 0
for (int i : data)
{
int wrapBit = hash >>> 31;
hash = (hash << 1) ^ i ^ wrapBit;
}
}

There are lots of possible modifications.

-- chris



 
 
Tor Iver Wilhelmsen





PostPosted: 2006-9-16 1:47:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? "Danno" <email***@***.com> writes:

> Just curious

toString(): Yes, for evey case when a bean is used in a context where
a textual representation is needed.

hashCode()/equals(): Yes, for every bean used in e.g. Hibernate
mappings, esp. composite key classes which are mandated by Hibernate
to override them.
 
 
Richard Wheeldon





PostPosted: 2006-9-16 2:03:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? Danno wrote:
> Just curious
>

toString() : frequently, usually for debug information and for objects
which will be used directly in a GUI (e.g. nodes in trees, entries in
lists, comboboxes, etc.)

equals() : Usually only for business data objects, which count for a
fairly small number of classes compared to those related to the gui,
reports, database access, etc. I always implement equals when
implementing Comparable though.

hashCode() : Most of the times when I implement equals().

Fwiw, in my (rather old) copy of Sun's JDK source code I found 455
toString()s, 377 equals()s and 265 hashCode()s out of around 4000 source
files.

Richard
 
 
Mark Jeffcoat





PostPosted: 2006-9-16 2:08:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? "Chris Uppal" <email***@***.com> writes:

>
> As an aside, profiling is not a good tool for recognising or diagnosing poor
> hashing functions. And, although it is a reasonable tool for confirming that a
> non-too-special hash function is "good enough", even that is somewhat unsafe
> unless you know that your test data is representative of real world data /in
> the way it interacts with the hash/.
>


Since the constant hashing function always shows up in
the same way in the profiler (an unusually large number
of equals() comparisons, as your HashMap degrades into
a list), I think the profiler is a perfectly good tool.

It's also perfectly safe; you'll never get the wrong
answer this way.

The real win comes from two places: first, that you haven't
wasted any time or mental effort solving a problem that
wouldn't have done you any good anyway. Second, when you
do have a need for a real hash function, the performance
will be so miserable that you can't miss it, and you'll
exactly where to concentrate your effort. If you'd started
with something that's pertty good, that bright red flag
won't ever be there, and you may have a harder time spotting
your opportunity.

In my recent experience, I need a real hashCode about 5% of
the time. For the other 95%, I'd be more productive just
setting $20 bills on fire.

(Oooh: just thought of a third way the constant hash function
wins: it's really, really hard to screw it up. God help you
if you write a hashCode() that's inconsistent with equals().)


--
Mark Jeffcoat
Austin, TX
 
 
richardsosborn@gmail.com





PostPosted: 2006-9-16 2:29:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? Not only that but test harness, junit class, class diagram and javadocs
before ANY methods are created.

;) (not really)

 
 
AndrewMcDonagh





PostPosted: 2006-9-16 5:45:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? Mark Jeffcoat wrote:
> "Danno" <email***@***.com> writes:
>
>> Just curious
>>
>
> You'll know when you need equals()... most things
> don't get compared, so there's usually no reason to
> bother.
>
> Once you've put in your own equals() method, you really
> need to override hashCode(), too. But pre-mature optimization
> is evil, so here's the one I always use pre-profiling:
>
> public int hashCode() {
> return 0;
> }
>
>
> I'm not kidding.
>
>

If thats your over ridden version - your application would be better
served not doing anything. At least that way you may get some
distribution of the objects within a Hash. With your current approach
your Hashes would always only contain one bucket full of every instance.

You are in effect causing a performance problem - not optimizing a slow one.
 
 
Patricia Shanahan





PostPosted: 2006-9-16 5:52:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? AndrewMcDonagh wrote:
> Mark Jeffcoat wrote:
>> "Danno" <email***@***.com> writes:
>>
>>> Just curious
>>>
>>
>> You'll know when you need equals()... most things
>> don't get compared, so there's usually no reason to bother.
>> Once you've put in your own equals() method, you really
>> need to override hashCode(), too. But pre-mature optimization
>> is evil, so here's the one I always use pre-profiling:
>>
>> public int hashCode() {
>> return 0;
>> }
>>
>>
>> I'm not kidding.
>>
>>
>
> If thats your over ridden version - your application would be better
> served not doing anything. At least that way you may get some
> distribution of the objects within a Hash. With your current approach
> your Hashes would always only contain one bucket full of every instance.

Distributing the objects within the hash is very, very bad unless you
ensure that any pair of equal objects have the same hash code.

>
> You are in effect causing a performance problem - not optimizing a slow
> one.

He is not trying to optimize. He is trying to maintain program
correctness, and do optimization later, if it is needed.

Patricia

 
 
Jeffrey Schwab





PostPosted: 2006-9-16 7:01:00 Top

java-programmer >> Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write? AndrewMcDonagh wrote:
> Mark Jeffcoat wrote:
>> "Danno" <email***@***.com> writes:
>>
>>> Just curious
>>>
>>
>> You'll know when you need equals()... most things
>> don't get compared, so there's usually no reason to bother.
>> Once you've put in your own equals() method, you really
>> need to override hashCode(), too. But pre-mature optimization
>> is evil, so here's the one I always use pre-profiling:
>>
>> public int hashCode() {
>> return 0;
>> }
>>
>>
>> I'm not kidding.
>>
>>
>
> If thats your over ridden version - your application would be better
> served not doing anything.

If you override equals(), you're supposed to override hashCode() as
well, to ensure that equal objects have equal hash codes.

> At least that way you may get some
> distribution of the objects within a Hash. With your current approach
> your Hashes would always only contain one bucket full of every instance.

He knows.

> You are in effect causing a performance problem - not optimizing a slow
> one.

This is pre-profiling. He's effectively letting his hash use just one
bucket, trading performance (from near-constant to O(n) complexity) for
simplicity. I'm not sure I like it, but I appreciate the logic.