What is meant by 'canonical representation'  
Petterson Mikael





Posted: 2004-11-25 16:36:00

Hi,

I was reading the sun java api 1.4.2 and found the following for
String.intern().

public String intern()

Returns a canonical representation for the string object.

What is a canonical representation? Any example?

//Mikael
 
VisionSet





Posted: 2004-11-25 21:34:00


"Petterson Mikael" <email***@***.com> wrote in message
news:co45b0$ebk$email***@***.com...
> Hi,
>
> I was reading the sun java api 1.4.2 and found the following for
> String.intern().
>
> public String intern()
>
> Returns a canonical representation for the string object.
>
> What is a canonical representation? Any example?
>


Canonical means reduced to its simplest form.
In this context it means that intern() returns a reference to one of the
internally pooled String objects: the single pooled String that has the same
contents as the String you called it on.
The process of interning means that the String is added to the internally
maintained collection of String objects, unless an equal String is already
there. String objects are immutable, so any two objects where
str1.equals(str2) == true are equivalent and are guaranteed to remain so.
Because of this it is reasonable, and can be more efficient, to let any newly
created String be garbage collected and rely on the returned value of
intern().
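To make the pooling concrete, here is a minimal sketch. The class and method names (CanonicalDemo, build) are made up for illustration; only String.intern(), equals() and == come from the API being discussed.

```java
class CanonicalDemo {
    // Builds a String at run time, so every call returns a distinct object
    // rather than a compile-time constant.
    static String build(String prefix, String suffix) {
        return new StringBuilder(prefix).append(suffix).toString();
    }

    public static void main(String[] args) {
        String a = build("can", "onical");
        String b = build("can", "onical");

        System.out.println(a.equals(b));              // true: same characters
        System.out.println(a == b);                   // false: two separate objects
        System.out.println(a.intern() == b.intern()); // true: one pooled object
    }
}
```

Both intern() calls return the same reference, and that shared object is the "canonical representation" the API doc refers to.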

E.g. if you read a String from the standard input stream, it will be a new,
distinct String object, even if an equal String already exists.

If you want the pooled copy instead, you can do

myInputString = myInputString.intern();

This guarantees that you do not keep extra copies of an identical String
around: the copy you read can be garbage collected and only the pooled one
remains.
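A rough sketch of that input case, assuming line-based input. The class InternInput and its method readInterned are hypothetical names, and a StringReader stands in for System.in so the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class InternInput {
    // Reads every line and keeps the pooled copy of each one, so repeated
    // lines share a single String object in the returned list.
    static List<String> readInterned(Readable source) {
        List<String> lines = new ArrayList<>();
        Scanner in = new Scanner(source);
        while (in.hasNextLine()) {
            lines.add(in.nextLine().intern());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumption: a StringReader replaces System.in for the demo.
        List<String> lines = readInterned(new StringReader("yes\nno\nyes\n"));
        System.out.println(lines.get(0) == lines.get(2)); // true: same pooled object
        System.out.println(lines.get(0) == lines.get(1)); // false: different text
    }
}
```

With real input you would pass new java.io.InputStreamReader(System.in) instead of the StringReader.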

Note that interning is automatic for string literals and other compile-time
constant strings, like so: String s = "xyz";
Do not do String s = new String("xyz"); since that creates a new object that
is not interned.
I believe that Strings created at run time are not interned automatically.
But you would have to read the JLS for the exact mechanism.
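The difference between literals, new String(...) and run-time concatenation can be checked with a small test. This is a sketch only; LiteralDemo and concatAtRuntime are names invented for the example.

```java
class LiteralDemo {
    // The arguments are not compile-time constants inside this method, so the
    // concatenation happens at run time and yields a new String object.
    static String concatAtRuntime(String head, String tail) {
        return head + tail;
    }

    public static void main(String[] args) {
        String literal = "xyz";
        String sameLiteral = "xyz";
        String copy = new String("xyz");
        String joined = concatAtRuntime("xy", "z");

        System.out.println(literal == sameLiteral);     // true: both refer to the pooled literal
        System.out.println(literal == copy);            // false: new String makes a fresh object
        System.out.println(literal == joined);          // false: built at run time
        System.out.println(literal == copy.intern());   // true
        System.out.println(literal == joined.intern()); // true
    }
}
```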

--
Mike W


 
Chris Smith





Posted: 2004-11-25 22:37:00

VisionSet <email***@***.com> wrote:
> Canonical means reduced to its simplest form.

It's more like "reduced to its representative form". The canonical
string is not necessarily simpler -- it's just the one that you always
get from intern(). Or, if you're a Latin buff, canonical roughly means
"from the list" -- the list being the Strings in the String pool --
which works, too.

> The process of interning means that the object is added to the internally
> maintained collection of String objects. Since String objects are immutable
> and any two objects where str1.equal(Object str2) == true are equivalent and
> are guaranteed to remain so. Because of this it is reasonable and more
> efficient to allow any created String to be GC's and rely on the returned
> value of intern().

This really depends on how much comparing of the String you'll be doing.
Your post talks a lot about the cost of creating a new String, which is
O(n) with a fairly small constant on the number of characters in the
String, and not much about the cost of interning the String, which is
probably about O(log m) on the number of interned Strings in existence
in the entire application. There is a definite tradeoff there, and in
my experience String interning is a very special-purpose technique.
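One special-purpose use where the tradeoff can pay off is matching tokens against a fixed set of keywords: each token is interned once, and after that it is compared by reference. This is a hypothetical sketch (KeywordMatcher and isKeyword are made-up names), and it makes no measured performance claim.

```java
class KeywordMatcher {
    // String literals are already pooled, so these are the canonical instances.
    static final String IF = "if";
    static final String WHILE = "while";

    // Expects a token that has already been interned. A reference comparison
    // is then enough, and no characters are examined.
    static boolean isKeyword(String internedToken) {
        return internedToken == IF || internedToken == WHILE;
    }

    public static void main(String[] args) {
        String token = new String("while").intern(); // pay the pool lookup once
        System.out.println(isKeyword(token));               // true
        System.out.println(isKeyword(new String("while"))); // false: never interned
    }
}
```

The second call also shows the hazard: == silently fails for any String that was not interned, even when equals() would match, which is part of why the technique stays special-purpose.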

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation