Improving hashCode() to match equals()  
Author Message
Marco





PostPosted: 2003-12-19 1:06:00 Top

java-programmer, Improving hashCode() to match equals() In my NamedBitField class I define equals() to mean that 2
NamedBitFields are equal if all 3 of their fields are equal.

Any suggestions for improving my hashCode() definition? I
can improve its performance by caching the computed hashCode
in a transient field, sure, but can I improve the _way_ it's
computed?


class NamedBitField {

String name; // name of bit field
int startIndex; // where it starts
int length; // how many bits it occupies

pubic boolean equals(Object obj) {
if (this == obj)
return true;
if ( obj == null || this.getClass() != obj.getClass() )
return false;
NamedBitField other = (NamedBitField) obj;
return
this.name.equals(other.name) &&
this.startIndex == other.startIndex &&
this.length == other.length;
}

public int hashCode() {

// to make this symmetric with equals() I'd prefer to
// involve all 3 fields (name, startIndex and length)
// in the computation of the hash. But how?
//
// Here I'm simply reusing String.hashCode(), but the
// hashed String at least incorporates the other
// fields
return ((name + startIndex) + length).hashCode();
}
}


Marco
----------------------------------------------------
Please remove digits from e-mail address (tr/0-9//d)

 
VisionSet





PostPosted: 2003-12-19 1:13:00 Top

java-programmer >> Improving hashCode() to match equals()
"Marco" <email***@***.com> wrote in message
news:t2lEb.557$email***@***.com...
> In my NamedBitField class I define equals() to mean that 2
> NamedBitFields are equal if all 3 of their fields are equal.
>
> Any suggestions for improving my hashCode() definition? I
> can improve its performance by caching the computed hashCode
> in a transient field, sure, but can I improve the _way_ it's
> computed?
>
>
> class NamedBitField {
>
> String name; // name of bit field
> int startIndex; // where it starts
> int length; // how many bits it occupies
>
> pubic boolean equals(Object obj) {
> if (this == obj)
> return true;
> if ( obj == null || this.getClass() != obj.getClass() )
> return false;
> NamedBitField other = (NamedBitField) obj;
> return
> this.name.equals(other.name) &&
> this.startIndex == other.startIndex &&
> this.length == other.length;
> }
>
> public int hashCode() {
>
> // to make this symmetric with equals() I'd prefer to
> // involve all 3 fields (name, startIndex and length)
> // in the computation of the hash. But how?
> //
> // Here I'm simply reusing String.hashCode(), but the
> // hashed String at least incorporates the other
> // fields
> return ((name + startIndex) + length).hashCode();
> }
> }

public int hashCode() {

int const = 17;
int h = 13;
h = h * startIndex + const;
h = h * length + const;
h = h * name.hashCode() + cost;

return h;
}

Something like that, I believe, where 17 & 13 are different prime numbers,
but it isn't that important. It doesn't matter if h overflows.

--
Mike W


 
Chris Smith





PostPosted: 2003-12-19 2:31:00 Top

java-programmer >> Improving hashCode() to match equals() VisionSet wrote:
> public int hashCode() {
>
> int const = 17;
> int h = 13;
> h = h * startIndex + const;
> h = h * length + const;
> h = h * name.hashCode() + cost;
>
> return h;
> }
>
> Something like that, I believe, where 17 & 13 are different prime numbers,
> but it isn't that important. It doesn't matter if h overflows.

Sure, something like that. Except that, IIRC, const is an unused
reserved word in Java.

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
 
Marco





PostPosted: 2003-12-19 6:01:00 Top

java-programmer >> Improving hashCode() to match equals() Chris Smith wrote:
> VisionSet wrote:
>
>>public int hashCode() {
>>
>> int const = 17;
>> int h = 13;
>> h = h * startIndex + const;
>> h = h * length + const;
>> h = h * name.hashCode() + cost;
>>
>> return h;
>>}
>>
>>Something like that, I believe, where 17 & 13 are different prime numbers,
>>but it isn't that important. It doesn't matter if h overflows.

That's great, thanks. This algorithm makes the following 2
objects would return different hash codes despite their
similarity...

p.name = "foobar"; p.startIndex = 2; p.length = 5;
q.name = "foobar"; q.startIndex = 5; q.length = 2;

I can see now that adding 17 ('const' variable) to the end
of each cumulative multiplication is essential. Looks fast
too. I may even benchmark it on my little 486 :-)

I did have crazy thoughts about implementing hashCode() in
C and calling it through JNI. I guess the overhead in
dynamically linking and then marshalling arguments to a
native stack just isn't worth it for such a small piece of
code though.

> IIRC, const is an unused reserved word in Java.

Thanks Chris. You're right. const is reserved.

Marco
----------------------------------------------------
Please remove digits from e-mail address (tr/0-9//d)

 
 
greg_finch2





PostPosted: 2003-12-19 13:28:00 Top

java-programmer >> Improving hashCode() to match equals() "VisionSet" <email***@***.com> wrote in message news:<nalEb.5824$email***@***.com>...
> "Marco" <email***@***.com> wrote in message
> news:t2lEb.557$email***@***.com...
> > In my NamedBitField class I define equals() to mean that 2
> > NamedBitFields are equal if all 3 of their fields are equal.
> >
> > Any suggestions for improving my hashCode() definition? I
> > can improve its performance by caching the computed hashCode
> > in a transient field, sure, but can I improve the _way_ it's
> > computed?
> >
> >
> > class NamedBitField {
> >
> > String name; // name of bit field
> > int startIndex; // where it starts
> > int length; // how many bits it occupies
> >
> > pubic boolean equals(Object obj) {
> > if (this == obj)
> > return true;
> > if ( obj == null || this.getClass() != obj.getClass() )
> > return false;
> > NamedBitField other = (NamedBitField) obj;
> > return
> > this.name.equals(other.name) &&
> > this.startIndex == other.startIndex &&
> > this.length == other.length;
> > }
> >
> > public int hashCode() {
> >
> > // to make this symmetric with equals() I'd prefer to
> > // involve all 3 fields (name, startIndex and length)
> > // in the computation of the hash. But how?
> > //
> > // Here I'm simply reusing String.hashCode(), but the
> > // hashed String at least incorporates the other
> > // fields
> > return ((name + startIndex) + length).hashCode();
> > }
> > }
>
> public int hashCode() {
>
> int const = 17;
> int h = 13;
> h = h * startIndex + const;
> h = h * length + const;
> h = h * name.hashCode() + cost;
>
> return h;
> }
>
> Something like that, I believe, where 17 & 13 are different prime numbers,
> but it isn't that important. It doesn't matter if h overflows.

You should really use 0x11 and 0xD, respectively. Unless you are
inputting bowling scores, there's no reason for decimal!
 
 
Derek Clarkson





PostPosted: 2003-12-19 13:43:00 Top

java-programmer >> Improving hashCode() to match equals() Hi Marco,
I might be off the beam here but I noticed this piece of code in your
class:

>> return ((name + startIndex) + length).hashCode();

What occured to me is that (if I read it correctly) you have a name of
"Fred", a startIndex = 1 and length = 12, would that produce the same
result as "Fred", 11 and 2 ?

If so would

String final delim = ":";
return (name + delim + startIndex + delim + length).hashCode();

be a better solution because it separates the fields so that the hashCode()
function can see the difference ?


--
cio
Derek
 
 
Marco





PostPosted: 2003-12-19 19:20:00 Top

java-programmer >> Improving hashCode() to match equals() Derek Clarkson wrote:
>>>return ((name + startIndex) + length).hashCode();
>
> If so would
>
> String final delim = ":";
> return (name + delim + startIndex + delim + length).hashCode();
>
> be a better solution because it separates the fields so that the hashCode()
> function can see the difference ?

Thanks Derek. You're right. Concatenating the numeric
fields eliminated their distinctiveness somewhat...

name = "Fred";
startIndex = 1; // if 11 then same hash code
length = 12; // if 2 then same hash code
// calls "Fred112".hashCode()
((name + startIndex) + length).hashCode();

Instead of pushing my object's state through
String.hashCode(), I'm now incorporating it into a series
of cumulative multiplications involving prime numbers as
recommended by an earlier post in this thread.

Marco
----------------------------------------------------
Please remove digits from e-mail address (tr/0-9//d)

 
 
Alex Hunsley





PostPosted: 2003-12-19 20:13:00 Top

java-programmer >> Improving hashCode() to match equals() Alex Hunsley wrote:

> Thinkit wrote:
>
>> "VisionSet" <email***@***.com> wrote in message
>> news:<nalEb.5824$email***@***.com>...
>>
>>> "Marco" <email***@***.com> wrote in message
>>> news:t2lEb.557$email***@***.com...
>>>
>>>> In my NamedBitField class I define equals() to mean that 2
>>>> NamedBitFields are equal if all 3 of their fields are equal.
>>>>
>>>> Any suggestions for improving my hashCode() definition? I
>>>> can improve its performance by caching the computed hashCode
>>>> in a transient field, sure, but can I improve the _way_ it's
>>>> computed?
>>>>
>>>>
>>>> class NamedBitField {
>>>>
>>>> String name; // name of bit field
>>>> int startIndex; // where it starts
>>>> int length; // how many bits it occupies
>>>>
>>>> pubic boolean equals(Object obj) {
>>>> if (this == obj)
>>>> return true;
>>>> if ( obj == null || this.getClass() != obj.getClass() )
>>>> return false;
>>>> NamedBitField other = (NamedBitField) obj;
>>>> return
>>>> this.name.equals(other.name) &&
>>>> this.startIndex == other.startIndex &&
>>>> this.length == other.length;
>>>> }
>>>>
>>>> public int hashCode() {
>>>>
>>>> // to make this symmetric with equals() I'd prefer to
>>>> // involve all 3 fields (name, startIndex and length)
>>>> // in the computation of the hash. But how?
>>>> //
>>>> // Here I'm simply reusing String.hashCode(), but the
>>>> // hashed String at least incorporates the other
>>>> // fields
>>>> return ((name + startIndex) + length).hashCode();
>>>> }
>>>> }
>>>
>>>
>>> public int hashCode() {
>>>
>>> int const = 17;
>>> int h = 13;
>>> h = h * startIndex + const;
>>> h = h * length + const;
>>> h = h * name.hashCode() + cost;
>>>
>>> return h;
>>> }
>>>
>>> Something like that, I believe, where 17 & 13 are different prime
>>> numbers,
>>> but it isn't that important. It doesn't matter if h overflows.
>>
>>
>>
>> You should really use 0x11 and 0xD, respectively. Unless you are
>> inputting bowling scores, there's no reason for decimal!
>
>
> Why difference, at all, does it make whether he writes 13 or 0xD? It's
> all the same to the bytecode that you end up with.
> Also, there actually *is* a reason to use decimal - other coders who
> come along later will more readily recognise 13 and 17 as prime numbers,
> as opposed to 0x11 and 0xD. I don't know of many people that make a
> habit of knowing what prime numbers are in hex.
> (And, yes, you could convert it in your head back to decimal, but that's
> just extra cognitive load and obfuscation.)
>
> Ah, hold on, I've come across you before! You're the dude who thinks hex
> is magically better than decimal somehow, and is on a mission to convert
> the world!
>
> http://makeashorterlink.com/?C2D351AD6
>
> alex

Sorry, make that

http://makeashorterlink.com/?P2C412AD6

alex

 
 
Alex Hunsley





PostPosted: 2003-12-19 20:13:00 Top

java-programmer >> Improving hashCode() to match equals() Thinkit wrote:

> "VisionSet" <email***@***.com> wrote in message news:<nalEb.5824$email***@***.com>...
>
>>"Marco" <email***@***.com> wrote in message
>>news:t2lEb.557$email***@***.com...
>>
>>>In my NamedBitField class I define equals() to mean that 2
>>>NamedBitFields are equal if all 3 of their fields are equal.
>>>
>>>Any suggestions for improving my hashCode() definition? I
>>>can improve its performance by caching the computed hashCode
>>>in a transient field, sure, but can I improve the _way_ it's
>>>computed?
>>>
>>>
>>>class NamedBitField {
>>>
>>> String name; // name of bit field
>>> int startIndex; // where it starts
>>> int length; // how many bits it occupies
>>>
>>> pubic boolean equals(Object obj) {
>>> if (this == obj)
>>> return true;
>>> if ( obj == null || this.getClass() != obj.getClass() )
>>> return false;
>>> NamedBitField other = (NamedBitField) obj;
>>> return
>>> this.name.equals(other.name) &&
>>> this.startIndex == other.startIndex &&
>>> this.length == other.length;
>>> }
>>>
>>> public int hashCode() {
>>>
>>> // to make this symmetric with equals() I'd prefer to
>>> // involve all 3 fields (name, startIndex and length)
>>> // in the computation of the hash. But how?
>>> //
>>> // Here I'm simply reusing String.hashCode(), but the
>>> // hashed String at least incorporates the other
>>> // fields
>>> return ((name + startIndex) + length).hashCode();
>>> }
>>>}
>>
>>public int hashCode() {
>>
>> int const = 17;
>> int h = 13;
>> h = h * startIndex + const;
>> h = h * length + const;
>> h = h * name.hashCode() + cost;
>>
>> return h;
>>}
>>
>>Something like that, I believe, where 17 & 13 are different prime numbers,
>>but it isn't that important. It doesn't matter if h overflows.
>
>
> You should really use 0x11 and 0xD, respectively. Unless you are
> inputting bowling scores, there's no reason for decimal!

Why difference, at all, does it make whether he writes 13 or 0xD? It's
all the same to the bytecode that you end up with.
Also, there actually *is* a reason to use decimal - other coders who
come along later will more readily recognise 13 and 17 as prime numbers,
as opposed to 0x11 and 0xD. I don't know of many people that make a
habit of knowing what prime numbers are in hex.
(And, yes, you could convert it in your head back to decimal, but that's
just extra cognitive load and obfuscation.)

Ah, hold on, I've come across you before! You're the dude who thinks hex
is magically better than decimal somehow, and is on a mission to convert
the world!

http://makeashorterlink.com/?C2D351AD6

alex








 
 
karl





PostPosted: 2003-12-20 0:22:00 Top

java-programmer >> Improving hashCode() to match equals() email***@***.com (Thinkit) wrote in message news:<email***@***.com>...
> "VisionSet" <email***@***.com> wrote in message news:<nalEb.5824$email***@***.com>...

> > Something like that, I believe, where 17 & 13 are different prime numbers,
> > but it isn't that important. It doesn't matter if h overflows.
>
> You should really use 0x11 and 0xD, respectively. Unless you are
> inputting bowling scores, there's no reason for decimal!

???
What an odd comment. As a long time aficionado of good programming
style, this is the first time I've ever heard this particular rule of
thumb. What's inherently better about using hex literals over decimal
literals in code? Personally, if I saw 0xD in that code, it would not
be immediately obvious that it was chosen for being prime.

In fact, for purely mathematical usage, I would say you should always
use decimal. I only use hex when dealing with low-level data concepts
such as bit masks and hardware flags; i.e. situations where knowing
the bit pattern of the data is more important than knowing its numeric
value.
 
 
John





PostPosted: 2004-1-18 5:55:00 Top

java-programmer >> Improving hashCode() to match equals() Some guidance on implementing hashCode which may be useful :
http://www.javapractices.com/Topic28.cjp

>