benchmarks? java vs .net (binarytrees)  
Razii





Posted: 2008-6-8 9:07:00

java-programmer, benchmarks? java vs .net (binarytrees) On Sat, 7 Jun 2008 17:44:18 -0700 (PDT), kwikius
<email***@***.com> wrote:

>Java, .NET starts out slow buddy.. King Twat, Mr Skeet, Mr Harrop and
>you shoved a lot of time, figuring out what we already know for last
>10 years. Your only option... you turn off GC. Fucking unacceptable
>for any multi user system. Only way to get any sort of performance out
>of GC system is to hog the system.
>
>Have a nice fuckin' day.
>
>Oh and Fuck Off Ratboy..

lol ...

I added the C++ group back. If you are going to troll, keep all the
groups on the list.

Or are you ashamed?


 
Jon Skeet [C# MVP]





Posted: 2008-6-8 15:39:00

Jon Harrop <email***@***.com> wrote:
> >> My claim was clearly perfectly correct.
> >
> > Rubbish. You claimed that Razii had turned the garbage collector off.
>
> Now you have resorted to misquoting me. I think that says it all.

You've applied different options to the program to make it appear that
your claim about Razii's options was correct. I missed out the word
"effectively" above. Which of those is more important?

Similarly, you've later managed to quote something I wrote about *your*
set of options as if I was writing it about Razii's set of options.

> > He certainly hadn't, or the program would not have run to completion,
> > limited as it was by his options to 512MB.
>
> For n=20?

Yes.

> > Just to make it absolutely clear, here is how Razii was running the
> > code:
> >
> > java -server -verbose:gc -Xms512m -Xmx512m -XX:NewRatio=1
>
> You have not specified "n" but it appears that your entire line of thinking
> revolves around n=20.

Yes, which is what he ran.

The post where you claimed Razii had effectively turned the garbage
collector off was message ID
<email***@***.com>

If you look back up the thread directly from that to the last time
Razii had specified options (i.e. the run that was under discussion) it
was message ID <email***@***.com>.

The exact command line specified was:

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 20

> > With those options, the garbage collector *does* run, and *does*
> > collect memory.
>
> Here is another trivial counter example using Razii's arguments as you
> quoted them:

I only quoted the memory part because that's the only part I *saw* you
change. However, if you look

> $ java -server -verbose:gc -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 13
> stretch tree of depth 14 check: -1
> 16384 trees of depth 4 check: -16384
> 4096 trees of depth 6 check: -4096
> 1024 trees of depth 8 check: -1024
> 256 trees of depth 10 check: -256
> 64 trees of depth 12 check: -64
> long lived tree of depth 13 check: -1
>
> As you can see, the GC never ran.
>
> > Your claim was absolutely incorrect, and your attempt
> > to confuse the matter by posting other options which *did* negate the
> > need for the garbage collector to run does not in any way change the
> > options under which Razii ran his test.
>
> I said "the GC is effectively off". You say "Rubbish... the need for the GC
> to run had been negated.". The difference is academic.

I would agree - but you're quoting me in entirely the wrong context.
Let's have a look at the statement where I talked about "the need for
the GC to run had been negated":

<quote>
Your claim was absolutely incorrect, and your attempt
to confuse the matter by posting other options which *did* negate the
need for the garbage collector to run does not in any way change the
options under which Razii ran his test.
</quote>

Oh look, it's in the context of *your* options, not Razii's.

I never claimed, nor *would* I claim, that the need for the GC to run
had been negated with Razii's options. Those are the only options which
I think should be considered in this part of the discussion, as those
are the options for which you originally claimed that the GC had been
effectively turned off.

> > Now rather than just repeatedly stating your claim, or arguing by using
> > *different* options, please address the steps above. Which of the 5
> > facts/deductions above do you disagree with? The conclusion directly
> > contradicts your claim, so you should either retract your claim or
> > refute the logic above.
>
> The main problem is with your interpretation of the word "effectively". You
> seem to think that you can add and remove this word at will without
> affecting the meaning of a sentence when, in fact, you cannot.

In this case it doesn't change things. If the garbage collector had
*effectively* been turned off, the program would not have been able to
run to completion.

> Consequently, your conclusion (5) is wrong. It should be "Therefore the GC
> had *not* been turned off". No disagreement here. But that says nothing
> about my original statement.

No way. You can't realistically claim a garbage collector has been
"effectively" turned off when it being turned *on* is critical to the
program running to completion. What exactly do you take "effectively"
to mean? I take it to mean "to the same effect". Now I'm happy for
"effect" to only mean in terms of computed results rather than
performance - so a fast program can be effectively the same as a slow
program - but it can't be in terms of completing the run. Options where
the program fails to run to completion are *not* "effectively" the same
as options where the program runs fine.

> There were two sides to my original point. Firstly, if you manually tweak
> the GC parameters by hand for one specific input on one specific machine
> then you are doing manual memory management.

No - you're changing the configuration options to let the automatic
memory management work more effectively. Note how I did exactly the
same to make the .NET GC use a different implementation - and again, it
sped things up.

> Garbage collection means *automatic* memory management. So you can
> kiss goodbye to the idea of claiming that your GC is fast (which is
> exactly what Razii was trying to do). The same goes for explicitly
> calling the GC from within your code (it is a form of manual memory
> management).

Well, it's a hint to the garbage collector - a hint which is usually
unnecessary, but *can* occasionally be beneficial. There's a long, long
way from that to fully manual memory management though.

> Secondly, Razii's technique and results for this benchmark have absolutely
> no bearing on reality whatsoever. Indeed, I cannot even reproduce his
> results using the same program with the same input on a slightly different
> machine. So let's not pretend this is of any practical relevance.

I never have.

> All you have managed to do is optimize a flawed benchmark which, as I said
> from the beginning, is completely fruitless.

No argument there - but then that's not what I was responding to, was
it?


Let me make this absolutely crystal clear, so that so long as you quote
from this sentence down in your reply, everything else above is
irrelevant:

Given a run of the code with these options:

$ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 20

You claimed that Razii effectively turned the GC off. I claim that he
certainly didn't, because with the GC having no effect the program
would not have completed.

Do you disagree with my claim that with the GC *actually* turned off
(if there were some way to do that) the program would have failed to
finish, or do you think that a (hypothetical) set of options where a
program fails to finish can be *effectively* the same as a set of
options where the program manages to run?
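
Whether the collector actually does any work under a given set of options can be checked from inside the JVM instead of being argued about. The following is only a minimal sketch under stated assumptions: it is not the shootout program, the names `GcActivityCheck`, `Node`, `buildTree` and `countNodes` are made up for illustration, and it relies solely on the standard `java.lang.management` API.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical sketch: allocate many short-lived binary trees and report
// how many collections the JVM performed while doing so.
final class GcActivityCheck {
    static final class Node {
        final Node left;
        final Node right;
        Node(Node left, Node right) { this.left = left; this.right = right; }
    }

    // Build a complete binary tree; depth 0 is a single leaf.
    static Node buildTree(int depth) {
        if (depth == 0) return new Node(null, null);
        return new Node(buildTree(depth - 1), buildTree(depth - 1));
    }

    // Walk the tree so the allocation cannot be skipped as dead code.
    static int countNodes(Node node) {
        return node == null ? 0 : 1 + countNodes(node.left) + countNodes(node.right);
    }

    // Sum the collection counts of all collectors; a count of -1 means "undefined".
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        int depth = args.length > 0 ? Integer.parseInt(args[0]) : 16;
        long before = totalCollections();
        long nodes = 0;
        for (int i = 0; i < 64; i++) {
            nodes += countNodes(buildTree(depth)); // each tree becomes garbage right away
        }
        long after = totalCollections();
        System.out.println("nodes allocated: " + nodes);
        System.out.println("collections during run: " + (after - before));
    }
}
```

Running it with the flags under discussion (for example `java -server -Xms512m -Xmx512m -XX:NewRatio=1 GcActivityCheck 20`) and then with a much larger heap shows whether the count drops to zero. The exact numbers depend on the JVM version and collector, and `-verbose:gc` reports the same events.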

--
Jon Skeet - <email***@***.com>
Web site: http://www.pobox.com/~skeet
Blog: http://www.msmvps.com/jon.skeet
C# in Depth: http://csharpindepth.com
 
 
Mark Thornton





Posted: 2008-6-8 19:35:00

kwikius wrote:
> On Jun 7, 8:38 pm, Arne Vajhøj <email***@***.com> wrote:
> <...>
>
>> Java uses virtual memory. If there is not
>> enough physical memory to cover then it still works - it just becomes
>> very slow.
>
> Java, .NET starts out slow buddy.. King Twat, Mr Skeet, Mr Harrop and
> you shoved a lot of time, figuring out what we already know for last
> 10 years. Your only option... you turn off GC. Fucking unacceptable
> for any multi user system.

> Only way to get any sort of performance out
> of GC system is to hog the system.

In real systems this is usually not true. In most applications it is
possible to get good performance from the garbage collector. In
particular, Java provides many options for tuning the garbage collector
(if you think you can do better than the defaults). Razii may be
irritating, but responding with untruths of your own is not helpful.

Mark Thornton
 
 
Razii





Posted: 2008-6-8 21:19:00

On Sun, 08 Jun 2008 12:34:34 +0100, Mark Thornton
<email***@***.com> wrote:

>In real systems this is usually not true. In most applications it is
>possible to get good performance from the garbage collector.

It's not only not true. It's downright false.

http://www.idiom.com/~zilla/Computer/javaCbenchmark.html

Consider what happens when you do a new/malloc: a) the allocator looks
for an empty slot of the right size, then returns you a pointer. b)
This pointer is pointing to some fairly random place.

With GC, a) the allocator doesn't need to look for memory, it knows
where it is, b) the memory it returns is adjacent to the last bit of
memory you requested. The wandering around part happens not all the
time but only at garbage collection. And then (depending on the GC
algorithm) things get moved of course as well.
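
The contrast described above can be sketched in a few lines. This is a toy model that hands out integer offsets, not how any real malloc or JVM is implemented; `AllocatorSketch`, `BumpAllocator` and `FreeListAllocator` are hypothetical names, and the first-fit policy is an assumption chosen for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Toy comparison of bump allocation and a first-fit free list.
final class AllocatorSketch {
    // Bump allocation: the next block starts where the previous one ended.
    static final class BumpAllocator {
        private final int capacity;
        private int top = 0;
        BumpAllocator(int capacity) { this.capacity = capacity; }

        // Returns the start offset, or -1 when the region is exhausted
        // (the point at which a real collector would have to run).
        int allocate(int size) {
            if (top + size > capacity) return -1;
            int start = top;
            top += size;
            return start;
        }
    }

    // First-fit free list: every allocation searches for a large enough hole.
    static final class FreeListAllocator {
        private final List<int[]> holes = new ArrayList<>(); // entries are {start, length}
        FreeListAllocator(int capacity) { holes.add(new int[] {0, capacity}); }

        int allocate(int size) {
            for (int i = 0; i < holes.size(); i++) {
                int[] hole = holes.get(i);
                if (hole[1] >= size) {
                    int start = hole[0];
                    hole[0] += size;
                    hole[1] -= size;
                    if (hole[1] == 0) holes.remove(i);
                    return start;
                }
            }
            return -1;
        }

        // Freed blocks go to the front of the list; no coalescing in this sketch.
        void free(int start, int size) { holes.add(0, new int[] {start, size}); }
    }

    public static void main(String[] args) {
        BumpAllocator bump = new BumpAllocator(64);
        System.out.println("bump: " + bump.allocate(8) + ", " + bump.allocate(8) + ", " + bump.allocate(8));

        FreeListAllocator list = new FreeListAllocator(64);
        int a = list.allocate(8);
        list.allocate(8);
        list.allocate(8);
        list.free(a, 8); // leaves a small hole at the front
        System.out.println("free list: " + list.allocate(16) + ", " + list.allocate(8));
    }
}
```

In the free-list case the two final allocations land at unrelated offsets (one after the live blocks, one in the reused hole), which is the "fairly random place" effect, while the bump allocator's results are always adjacent.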


The cost of missing the cache
The big benefit of GC is memory locality. Because newly allocated
memory is adjacent to the memory recently used, it is more likely to
already be in the cache.

How much of an effect is this? One rather dated (1993) example shows
that missing the cache can be a big cost: changing an array size in
a small C program from 1023 to 1024 results in a slowdown of 17 times
(not 17%). This is like switching from C to VB! This particular
program stumbled across what was probably the worst possible cache
interaction for that particular processor (MIPS); the effect isn't
that bad in general...but with processor speeds increasing faster than
memory, missing the cache is probably an even bigger cost now than it
was then.

(It's easy to find other research studies demonstrating this; here's
one from Princeton: they found that (garbage-collected) ML programs
translated from the SPEC92 benchmarks have lower cache miss rates than
the equivalent C and Fortran programs.)

This is theory, what about practice? In a well known paper [2] several
widely used programs (including perl and ghostscript) were adapted to
use several different allocators including a garbage collector
masquerading as malloc (with a dummy free()). The garbage collector
was as fast as a typical malloc/free; perl was one of several programs
that ran faster when converted to use a garbage collector. Another
interesting fact is that the cost of malloc/free is significant: both
perl and ghostscript spent roughly 25-30% of their time in these
calls.

Besides the improved cache behavior, also note that automatic memory
management allows escape analysis, which identifies local allocations
that can be placed on the stack. (Stack allocations are clearly
cheaper than heap

 
 
Rudy Velthuis





Posted: 2008-6-8 22:10:00

Razii wrote:

> Consider what happens when you do a new/malloc: a) the allocator looks
> for an empty slot of the right size, then returns you a pointer.

Modern allocators have several arrays of slots of suitable sizes, and
can therefore easily find one in the right size. The next allocation of
that size will also be immediately adjacent. Only rather large sizes
require another approach, but I assume these are pretty rare in both
kinds of environments, and I guess that programs tend to hang on to
such large objects much longer as well. Deallocation of objects is
immediate, which often means that memory consumption is lower and not
dependent on when a GC might finally run. Also, no heaps of memory are
moved around in non-GC memory management.
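
The "several arrays of slots of suitable sizes" idea can be sketched roughly as follows. This is an illustration only: the power-of-two size classes, the 1024-byte cut-off and the names `SizeClassPool`, `classIndex` and `slotSize` are assumptions, and the pool hands out integer offsets rather than real memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of segregated size classes: one stack of free slots per size.
final class SizeClassPool {
    static final int CLASSES = 8; // slot sizes 8, 16, ..., 1024 (assumed)
    private final List<Deque<Integer>> freeSlots = new ArrayList<>();
    private int nextFresh = 0;    // offset of space that has never been handed out

    SizeClassPool() {
        for (int i = 0; i < CLASSES; i++) freeSlots.add(new ArrayDeque<>());
    }

    static int slotSize(int index) { return 8 << index; }

    // Index of the smallest class whose slots can hold `size` bytes.
    static int classIndex(int size) {
        if (size < 1 || size > slotSize(CLASSES - 1)) {
            throw new IllegalArgumentException("large sizes need another approach");
        }
        int index = 0;
        while (slotSize(index) < size) index++;
        return index;
    }

    int allocate(int size) {
        int index = classIndex(size);
        Integer reused = freeSlots.get(index).poll(); // constant time: pop a freed slot
        if (reused != null) return reused;
        int start = nextFresh;
        nextFresh += slotSize(index);
        return start;
    }

    // Deallocation is immediate: the slot can be reused by the very next request.
    void free(int offset, int size) {
        freeSlots.get(classIndex(size)).push(offset);
    }

    public static void main(String[] args) {
        SizeClassPool pool = new SizeClassPool();
        int first = pool.allocate(10);
        int second = pool.allocate(16);
        pool.free(first, 10);
        int third = pool.allocate(12); // same size class as the freed slot
        System.out.println(first + " " + second + " " + third);
    }
}
```

Nothing is searched and nothing is moved, but each request is rounded up to its slot size, so some space is lost inside the slots.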

IOW, there are arguments for both approaches. The GC one has the big
advantage that one big cause of errors, all errors regarding memory
use, are more or less completely eliminated. But I doubt I would call
speed one of the main factors to choose a GC.
--
Rudy Velthuis http://rvelthuis.de

"My last cow just died, so I won't need your bull anymore."
 
 
Lew





Posted: 2008-6-8 22:42:00

Razii wrote:
> but with processor speeds increasing faster than
> memory, missing the cache is probably an even bigger cost now than it
> was then.

You might not have noticed, but processor speeds have been flat for the last
several years, or actually declined. Memory has gotten faster, and CPUs have
gotten more cache, so actually the trend is the opposite of what you stated.

--
Lew
 
 
Lew





Posted: 2008-6-8 22:45:00

Rudy Velthuis wrote:
> Modern allocators have several arrays of slots of suitable sizes, and
> can therefore easily find one in the right size. The next allocation of
> that size will also be immediately adjacent. Only rather large sizes
> require another approach, but I assume these are pretty rare in both
> kinds of environments, and I guess that programs tend to hang on to
> such large objects much longer as well. Deallocation of objects is
> immediate, which often means that memory consumption is lower and not
> dependent on when a GC might finally run. Also, no heaps of memory are
> moved around in non-GC memory management.

Deallocation of young objects in Java takes no time at all. GCs of the young
generation take very little time for typical memory-usage patterns. It could
be, for a large class of programs, that memory management takes less time in a
GCed language like Java than in a language like C++ with manual memory management.

> IOW, there are arguments for both approaches. The GC one has the big
> advantage that one big cause of errors, all errors regarding memory
> use, are more or less completely eliminated. But I doubt I would call
> speed one of the main factors to choose a GC.

It can be. The problem is that assertions about speed are nearly impossible
to make /a priori/ - there are so many factors and emergent interactions
involved that one is unlikely to guess correctly without experimentation and
measurement.

--
Lew
 
 
Arne Vajhøj





Posted: 2008-6-8 23:09:00

Lew wrote:
> Razii wrote:
>> but with processor speeds increasing faster than
>> memory, missing the cache is probably an even bigger cost now than it
>> was then.
>
> You might not have noticed, but processor speeds have been flat for the
> last several years, or actually declined.

Processor speed is increasing at the same speed as ever.

GHz rates are not. They reached the heat barrier. But GHz was
never a good indication for speed.

Growth in core speed has slowed down, because the way processors
get faster today is to add more cores.

Arne

 
 
Lew





Posted: 2008-6-9 0:05:00

Arne Vajhøj wrote:
> Lew wrote:
>> Razii wrote:
>>> but with processor speeds increasing faster than
>>> memory, missing the cache is probably an even bigger cost now than it
>>> was then.
>>
>> You might not have noticed, but processor speeds have been flat for
>> the last several years, or actually declined.
>
> Processor speed is increasing at the same speed as ever.
>
> GHz rates are not. They reached the heat barrier. But GHz was
> never a good indication for speed.
>
> Growth in core speed has slowed down, because the way processors
> get faster today is to add more cores.

I believe you're speaking of processing speed. The term "processor speed" has
always meant clock speed of a processor in every context I've encountered it
heretofore.

Adding cores to a processor doesn't inherently make it faster. The software
has to take advantage of the additional cores.
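
As a small illustration of that point, here is the same CPU-bound computation written once sequentially and once so that it may be spread across cores. It is a sketch only: `CoreUsageSketch` and its methods are hypothetical names, the prime-counting workload is an arbitrary stand-in, and any speed-up depends on the machine.

```java
import java.util.stream.LongStream;

// Hypothetical sketch: identical work, with and without an opt-in to parallelism.
final class CoreUsageSketch {
    // Deliberately CPU-bound: trial division.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Uses one core no matter how many are available.
    static long countPrimesSequential(long limit) {
        return LongStream.range(0, limit).filter(CoreUsageSketch::isPrime).count();
    }

    // Only this variant lets the runtime divide the range among several cores.
    static long countPrimesParallel(long limit) {
        return LongStream.range(0, limit).parallel().filter(CoreUsageSketch::isPrime).count();
    }

    public static void main(String[] args) {
        long limit = 2_000_000;
        System.out.println("cores reported: " + Runtime.getRuntime().availableProcessors());

        long start = System.nanoTime();
        long sequential = countPrimesSequential(limit);
        long sequentialMs = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        long parallel = countPrimesParallel(limit);
        long parallelMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("sequential: " + sequential + " primes in " + sequentialMs + " ms");
        System.out.println("parallel:   " + parallel + " primes in " + parallelMs + " ms");
    }
}
```

Both methods return the same count; the extra cores make a difference only for the version that was written to use them.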

--
Lew
 
 
Arne Vajhøj





Posted: 2008-6-9 2:19:00

Lew wrote:
> Arne Vajhøj wrote:
>> Lew wrote:
>>> Razii wrote:
>>>> but with processor speeds increasing faster than
>>>> memory, missing the cache is probably an even bigger cost now than it
>>>> was then.
>>>
>>> You might not have noticed, but processor speeds have been flat for
>>> the last several years, or actually declined.
>>
>> Processor speed is increasing at the same speed as ever.
>>
>> GHz rates are not. They reached the heat barrier. But GHz was
>> never a good indication for speed.
>>
>> Growth in core speed has slowed down, because the way processors
>> get faster today is to add more cores.
>
> I believe you're speaking of processing speed. The term "processor
> speed" has always meant clock speed of a processor in every context I've
> encountered it heretofore.

Could be. But that meaning does not fit very well with the original
context.

> Adding cores to a processor doesn't inherently make it faster. The
> software has to take advantage of the additional cores.

It makes it potentially faster.

It is up to the programmers to utilize the potential.

Arne
 
 
Lew





Posted: 2008-6-9 2:36:00

Lew wrote:
>> Adding cores to a processor doesn't inherently make it faster. The
>> software has to take advantage of the additional cores.

Arne Vajhøj wrote:
> It makes it potentially faster.

And most OSes do manage to use at least some of that potential.

> It is up to the programmers to utilize the potential.

I agree completely.

I also see a trend for more and more programs to at least allow its use. CPUs
are also getting more and faster cache memory, and mainboard memory
utilization also is getting faster, so the OP's assertion that "processor
speeds [are] increasing faster than memory" becomes a little less generally
reliable.

Regardless, you and the OP together are clearly correct that processing speed
is getting much faster, as is memory speed. Taken together, along with
implications of multi-processor algorithms on the memory model, there are
great effects on the state of the art in programming.

Nit-picking about specific minor terminologies aside, your conclusions are
inarguable.

--
Lew
 
 
Jon Harrop





Posted: 2008-6-9 4:54:00

Razii wrote:
> On Sun, 08 Jun 2008 12:34:34 +0100, Mark Thornton
> <email***@***.com> wrote:
>>In real systems this is usually not true. In most applications it is
>>possible to get good performance from the garbage collector.
>
> It's not only not true. It's downright false.
>
> http://www.idiom.com/~zilla/Computer/javaCbenchmark.html
>
> Consider what happens when you do a new/malloc: a) the allocator looks
> for an empty slot of the right size, then returns you a pointer. b)
> This pointer is pointing to some fairly random place.
> ...

This is just another strawman argument. Malloc is not the only alternative
to GC.

> With GC, a) the allocator doesn't need to look for memory, it knows
> where it is, b) the memory it returns is adjacent to the last bit of
> memory you requested. The wandering around part happens not all the
> time but only at garbage collection. And then (depending on the GC
> algorithm) things get moved of course as well.

That is exactly what the STL allocators do, for example.

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
 
 
Jon Harrop





Posted: 2008-6-9 5:03:00

Lew wrote:
> Rudy Velthuis wrote:
>> Modern allocators have several arrays of slots of suitable sizes, and
>> can therefore easily find one in the right size. The next allocation of
>> that size will also be immediately adjacent. Only rather large sizes
>> require another approach, but I assume these are pretty rare in both
>> kinds of environments, and I guess that programs tend to hang on to
>> such large objects much longer as well. Deallocation of objects is
>> immediate, which often means that memory consumption is lower and not
>> dependent on when a GC might finally run. Also, no heaps of memory are
>> moved around in non-GC memory management.
>
> Deallocation of young objects in Java takes no time at all...

You are ignoring all of the overheads of a GC, like thread synchronization,
stack walking and limitations placed upon the code generator required to
keep the GC happy.

If you compare generically and assume infinite development time then
lower-level languages will surely win in terms of raw performance. The
reason the world moved on to GC'd languages is that they allow more
complicated programs to be written more robustly and efficiently in a given
amount of development time, i.e. they are more cost effective.

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
 
 
Lew





Posted: 2008-6-9 5:19:00

Jon Harrop wrote:
> You are ignoring all of the overheads of a GC, like thread synchronization,
> stack walking and limitations placed upon the code generator required to
> keep the GC happy.

Balanced, to a degree at least, by the absence of manual memory-management
code, which would also have an overhead of its own, and the presence of
dynamic optimizers like Hotspot.

> If you compare generically and assume infinite development time then
> lower-level languages will surely win in terms of raw performance. The
> reason the world moved on to GC'd languages is that they allow more
> complicated programs to be written more robustly and efficiently in a given
> amount of development time, i.e. they are more cost effective.

Your points are well taken, but all I'm saying is that /a priori/ arguments
about the overhead of GC are not reliable. The advantages to performance that
GC brings tend to reduce the overhead of collections. Which one wins depends
so much on details of the JVM implementation, the needs of the algorithm, the
idioms followed by the app programmer, and other factors that it seems the
height of hubris to predict without measurement.

So far it seems that you must be at least mostly correct - from what I've seen
and read, most Java programs on most JVMs still seem to be somewhat slower
than most "natively compiled" programs. However, the gap has unequivocally
lessened over the years, and one can easily see it tilting the way of the
intelligently GCed platform.

--
Lew
 
 
Arne Vajhøj





Posted: 2008-6-9 11:04:00

Rudy Velthuis wrote:
> Razii wrote:
>> Consider what happens when you do a new/malloc: a) the allocator looks
>> for an empty slot of the right size, then returns you a pointer.
>
> Modern allocators have several arrays of slots of suitable sizes, and
> can therefore easily find one in the right size. The next allocation of
> that size will also be immediately adjacent. Only rather large sizes
> require another approach, but I assume these are pretty rare in both
> kinds of environments, and I guess that programs tend to hang on to
> such large objects much longer as well. Deallocation of objects is
> immediate, which often means that memory consumption is lower and not
> dependent on when a GC might finally run. Also, no heaps of memory are
> moved around in non-GC memory management.
>
> IOW, there are arguments for both approaches. The GC one has the big
> advantage that one big cause of errors, all errors regarding memory
> use, are more or less completely eliminated. But I doubt I would call
> speed one of the main factors to choose a GC.

Actually GC speed is very good.

The problem people complain about is the non-deterministic
aspect of it.

Arne
 
 
Arne Vajhøj





Posted: 2008-6-9 11:07:00

Jon Harrop wrote:
> Lew wrote:
>> Rudy Velthuis wrote:
>>> Modern allocators have several arrays of slots of suitable sizes, and
>>> can therefore easily find one in the right size. The next allocation of
>>> that size will also be immediately adjacent. Only rather large sizes
>>> require another approach, but I assume these are pretty rare in both
>>> kinds of environments, and I guess that programs tend to hang on to
>>> such large objects much longer as well. Deallocation of objects is
>>> immediate, which often means that memory consumption is lower and not
>>> dependent on when a GC might finally run. Also, no heaps of memory are
>>> moved around in non-GC memory management.
>> Deallocation of young objects in Java takes no time at all...
>
> You are ignoring all of the overheads of a GC, like thread synchronization,
> stack walking and limitations placed upon the code generator required to
> keep the GC happy.

I would expect non-GC solutions to need more thread synchronization
than GC because it will need it many more times.

> If you compare generically and assume infinite development time then
> lower-level languages will surely win in terms of raw performance. The
> reason the world moved on to GC'd languages is that they allow more
> complicated programs to be written more robustly and efficiently in a given
> amount of development time, i.e. they are more cost effective.

I agree with that part.

Arne
 
 
Arne Vajhøj





Posted: 2008-6-9 11:09:00

kwikius wrote:
> On Jun 7, 8:38 pm, Arne Vajhøj <email***@***.com> wrote:
>> Java uses virtual memory. If there is not
>> enough physical memory to cover then it still works - it just becomes
>> very slow.
>
> Java, .NET starts out slow buddy.. King Twat, Mr Skeet, Mr Harrop and
> you shoved a lot of time, figuring out what we already know for last
> 10 years. Your only option... you turn off GC. Fucking unacceptable
> for any multi user system. Only way to get any sort of performance out
> of GC system is to hog the system.

May I recommend that you read a beginner's book about operating
systems?

You seem to be in need of some basic understanding of what
virtual memory means.

Arne
 
 
Rudy Velthuis





Posted: 2008-6-9 19:30:00

Arne Vajhøj wrote:

> > IOW, there are arguments for both approaches. The GC one has the big
> > advantage that one big cause of errors, all errors regarding memory
> > use, are more or less completely eliminated. But I doubt I would
> > call speed one of the main factors to choose a GC.
>
> Actually GC speed is very good.
>
> The problem people complain about is the non-deterministic
> aspect of it.

People also complained about messaging and the non-linear aspect of it
when they moved from DOS to Windows. I guess, to many, this is a
similar issue, i.e. they sense a loss of control. <g>


--
Rudy Velthuis http://rvelthuis.de

"We don't make mistakes, we just have happy little accidents."
-- Bob Ross, "The Joy of Painting"
 
 
Jon Harrop





Posted: 2008-6-9 21:17:00

Jon Skeet [C# MVP] wrote:
> ...
> - so a fast program can be effectively the same as a slow
> program - but it can't be in terms of completing the run. Options where
> the program fails to run to completion are *not* "effectively" the same
> as options where the program runs fine.

There is an implicit benchmarking methodology behind your statements that is
flawed.

All benchmarks must impose a flow of information:

variable inputs -> constant program -> outputs

So "n" is the *only* variable input to our static program in this case. The
parameters passed to the GC are constants of the program.

You must not restrict consideration to a single constant input value "n"
because that permits partial specialization of the program and any
benchmark can then be arbitrarily optimized, ultimately by simply spitting
out the known correct output for the only input it will ever receive.

Moreover, the constants of the program (including the parameters to -Xms
and -Xmx) cannot be functions of the input. Specifically, they cannot be
optimized for a specific input at the cost of correctness on other inputs.
Such a program cannot reasonably be considered valid.
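
The methodology described above (one fixed program and one fixed configuration, with "n" as the only thing that varies) might be sketched like this. The harness and workload are hypothetical: `BenchmarkSweep`, `sweep` and `workload` are illustrative names, and the tree-building workload is a stand-in, not the shootout code.

```java
import java.util.function.IntToLongFunction;

// Hypothetical sketch: run one unchanged program over several inputs.
final class BenchmarkSweep {
    static final class Node {
        final Node left;
        final Node right;
        Node(Node left, Node right) { this.left = left; this.right = right; }
    }

    static Node build(int depth) {
        if (depth == 0) return new Node(null, null);
        return new Node(build(depth - 1), build(depth - 1));
    }

    static long size(Node node) {
        return node == null ? 0 : 1 + size(node.left) + size(node.right);
    }

    // The "constant program": allocate a tree of depth n and count its nodes.
    static long workload(int n) {
        return size(build(n));
    }

    // Apply the same program to every input; nothing is tuned for any single n.
    static long[] sweep(IntToLongFunction program, int[] inputs) {
        long[] outputs = new long[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            long start = System.nanoTime();
            outputs[i] = program.applyAsLong(inputs[i]);
            long ms = (System.nanoTime() - start) / 1_000_000;
            System.out.println("n=" + inputs[i] + " output=" + outputs[i] + " ms=" + ms);
        }
        return outputs;
    }

    public static void main(String[] args) {
        // The inputs are assumptions; what matters is that the JVM flags stay fixed for all of them.
        sweep(BenchmarkSweep::workload, new int[] {12, 14, 16, 18});
    }
}
```

Under this scheme a heap setting that only works for one value of "n" shows up immediately as a failure or slowdown on the others.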

>> There were two sides to my original point. Firstly, if you manually tweak
>> the GC parameters by hand for one specific input on one specific machine
>> then you are doing manual memory management.
>
> No - you're changing the configuration options to let the automatic
> memory management work more effectively.

Razii was fiddling by hand (manually) with bits and bytes (memory) to make
the current JVM's representation of some data structures for a given input
happen to just fit into a heap of a certain size (management).

> Note how I did exactly the same to make the .NET GC use a different
> implementation - and again, it sped things up.

Firstly, your tweak did not break the program for other (previously valid)
inputs as Razii's does. Secondly, your tweak showed a uniformly significant
performance improvement on a variety of machines but Razii's does not.

> There's a long, long way from that to fully manual memory management
> though.

Measuring some memory requirements by hand in order to preallocate a fixed
size heap is an ancient hallmark of manual memory management dating back to
Fortran programs. That is exactly what Razii did.

> Let me make this absolutely crystal clear, so that so long as you quote
> from this sentence down in your reply, everything else above is
> irrelevant:
>
> Given a run of the code with these options:
>
> $ time java -server -Xms512m -Xmx512m -XX:NewRatio=1 binarytrees 20

Last time you wrote only:

time java -server -Xms512m -Xmx512m -XX:NewRatio=1

which I agreed with.

> You claimed that Razii effectively turned the GC off. I claim that he
> certainly didn't, because with the GC having no effect the program
> would not have completed.
>
> Do you disagree with my claim that with the GC *actually* turned off
> (if there were some way to do that)

You can turn the GC off by cranking up the heap size, which is exactly what
Razii was doing.

> the program would have failed to finish,

All of these programs fail for sufficiently large "n" because they run out
of space.

> or do you think that a (hypothetical) set of options where a
> program fails to finish can be *effectively* the same as a set of
> options where the program manages to run?

They are effectively the same because we never had any assurance that the GC
was going to collect aggressively enough anyway.

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u
 
 
Jon Harrop





Posted: 2008-6-9 21:31:00

Arne Vajhøj wrote:
> Razii wrote:
>> On Sat, 07 Jun 2008 19:35:08 +0100, Jon Harrop <email***@***.com>
>> wrote:
>>>> The shootout doesn't use 3 gig max memory.
>>> How do you know that?
>>
>> They list the options they use. Besides, the computer they are using,
>> Pentium 4, has only 512 MB ram.
>
> Xmx is a reliable max on memory usage.

The Java program with the 512MB limit actually uses 800MB here.
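
One possible reason for such a gap, offered as an assumption and not confirmed anywhere in this thread, is that -Xmx caps the Java heap only, while the process also holds non-heap memory such as compiled code, class metadata and thread stacks. A minimal sketch for looking at both numbers from inside the JVM, using the standard `java.lang.management` API (`HeapLimitReport` is a hypothetical name):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical sketch: compare the heap ceiling with memory used outside the heap.
final class HeapLimitReport {
    // The most heap the JVM will try to use, which reflects -Xmx when it is set.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Memory the JVM itself reports as used outside the Java heap.
    static long nonHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getUsed();
    }

    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MiB):      " + toMiB(maxHeapBytes()));
        System.out.println("non-heap used (MiB): " + toMiB(nonHeapUsedBytes()));
    }
}
```

The figure an operating system tool shows for the process can be larger still, since it may include native allocations and reserved address space that these calls do not report.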

--
Dr Jon D Harrop, Flying Frog Consultancy
http://www.ffconsultancy.com/products/?u