My app runs too slow (lots of code follows)  
Author Message
sp00fd





PostPosted: 2003-12-11 23:17:00 Top

java-programmer, My app runs too slow (lots of code follows) Let me start out by saying that I'm admittedly not a great programmer.
I'm sure that my problems are my own fault and not really Java's. I
have a program that I had originally written in Perl. For various
reasons I decided to rewrite it in Java, but now that I'm nearly done,
I'm not sure if I want to use it because it's very slow and resource
intensive compared to my Perl version.

Quick summary of what the program does: it performs MD5 sums on a
number of files, stores the sums and compares against them on future
runs -- sort of a cheesy tripwire, checks that some processes are
running, checks the contents of a few small files like /etc/passwd,
/etc/shadow, etc. It's strictly a Unix program for me.

My old perl version generally runs the comparisons, checks file
permissions, reads a couple of files, along with reading the last few
minutes of a few logfiles (something I have yet to really implement in
my new Java version, which will make it run even longer, of course) in
less than a second. My java version does it (minus the logfile
reading) in anywhere between 10 and 15 seconds.

Here's my -Xprof output:

Flat profile of 0.74 secs (37 total ticks): process reaper

Thread-local ticks:
100.0% 37 Unknown: no last frame


Flat profile of 0.69 secs (36 total ticks): process reaper

Thread-local ticks:
100.0% 36 Unknown: no last frame


Flat profile of 0.47 secs (24 total ticks): process reaper

Thread-local ticks:
100.0% 24 Unknown: no last frame


Flat profile of 0.79 secs (41 total ticks): process reaper

Thread-local ticks:
100.0% 41 Unknown: no last frame


Flat profile of 10.98 secs (324 total ticks): main

Interpreted + native Method
1.2% 0 + 2 java.io.UnixFileSystem.getBooleanAttributes0
1.2% 0 + 2 java.lang.System.arraycopy
0.6% 1 + 0 java.io.File.getName
0.6% 0 + 1 ssm.FileStat.getOwner
0.6% 0 + 1 java.util.ResourceBundle.<clinit>
0.6% 1 + 0 java.io.BufferedInputStream.fill
0.6% 1 + 0 sun.nio.cs.US_ASCII$Decoder.decodeArrayLoop
0.6% 0 + 1 java.io.FileInputStream.open
5.8% 3 + 7 Total interpreted

Compiled + native Method
34.7% 59 + 1 java.io.ByteArrayOutputStream.write
27.2% 47 + 0 java.io.BufferedInputStream.read
18.5% 32 + 0 sun.security.provider.MD5.transform
1.2% 2 + 0 java.util.Arrays.mergeSort
1.2% 2 + 0 ssm.MD5Sum.loadByteData
0.6% 1 + 0 java.lang.StringBuffer.length
0.6% 1 + 0 ssm.UnixCrypt.D_ENCRYPT
0.6% 1 + 0 java.lang.String.toString
84.4% 145 + 1 Total compiled

Stub + native Method
0.6% 0 + 1 java.lang.System.arraycopy
0.6% 0 + 1 java.io.FileInputStream.readBytes
1.2% 0 + 2 Total stub

Thread-local ticks:
46.6% 151 Blocked (of total)
0.6% 1 Class loader
7.5% 13 Unknown: no last frame
0.6% 1 Unknown: thread_state


Flat profile of 0.00 secs (1 total ticks): Thread-1

Interpreted + native Method
100.0% 1 + 0
java.security.AccessController.getStackAccessControlContext
100.0% 1 + 0 Total interpreted


Global summary of 11.01 seconds:
100.0% 327 Received ticks
0.3% 1 Received GC ticks
0.6% 2 Compilation
0.3% 1 Class loader
46.5% 152 Unknown code


And, because I'm not really sure what to point anyone at, I'll just
attach
the bulk of my code, sorry...

main:=========================================================
import ssm.*;
import java.io.*;
import java.util.*;
import java.net.InetAddress;
import java.text.SimpleDateFormat;

class SystemsSecurityMonitor {
 
Michael Borgwardt





PostPosted: 2003-12-11 23:36:00 Top

java-programmer >> My app runs too slow (lots of code follows) Mikey wrote:
> My old perl version generally runs the comparisons, checks file
> permissions, reads a couple of files, along with reading the last few
> minutes of a few logfiles (something I have yet to really implement in
> my new Java version, which will make it run even longer, of course) in
> less than a second. My java version does it (minus the logfile
> reading) in anywhere between 10 and 15 seconds.
>
> Here's my -Xprof output:
[]
> Flat profile of 10.98 secs (324 total ticks): main
[]
> 34.7% 59 + 1 java.io.ByteArrayOutputStream.write
> 27.2% 47 + 0 java.io.BufferedInputStream.read

Well, it's pretty obvious where the cycles are being burned, isn't it.

Here's the culprit:

> while((ch = bis.read()) != -1) {
> baos.write(ch);
> }

You pump data from the input stream into the byte array one byte at a
time - horrendously inefficient. Furthermore, the byte array only exists
to be passed to the MessageDigest, a copy step which is completely
superfluous.

use the read() method that takes a buffer byte array as an argument,
pay attention to the number of bytes read (the return value) and pass
the filled buffer portion directly to the MessageDigest. Oh, and
do this with the inputStream directly, without wrapping it into a
BufferedInputStream, since you already use a buffer.

This saves you billions of method invocations and two times
copying your entire data. The resulting code may well be faster than
the perl code.

 
John C. Bollinger





PostPosted: 2003-12-12 4:55:00 Top

java-programmer >> My app runs too slow (lots of code follows) Mikey wrote:

> Let me start out by saying that I'm admittedly not a great programmer.
> I'm sure that my problems are my own fault and not really Java's. I
> have a program that I had originally written in Perl. For various
> reasons I decided to rewrite it in Java, but now that I'm nearly done,
> I'm not sure if I want to use it because it's very slow and resource
> intensive compared to my Perl version.
>
> Quick summary of what the program does: it performs MD5 sums on a
> number of files, stores the sums and compares against them on future
> runs -- sort of a cheesy tripwire, checks that some processes are
> running, checks the contents of a few small files like /etc/passwd,
> /etc/shadow, etc. It's strictly a Unix program for me.
>
> My old perl version generally runs the comparisons, checks file
> permissions, reads a couple of files, along with reading the last few
> minutes of a few logfiles (something I have yet to really implement in
> my new Java version, which will make it run even longer, of course) in
> less than a second. My java version does it (minus the logfile
> reading) in anywhere between 10 and 15 seconds.

Michael Borgwardt offered a well targetted suggestion for improving your
code's performance. You should also be aware, however, that any Java
program

(a) suffers a one-time performance penalty due to JVM startup. This can
be negligible or a few seconds, depending on many factors.

(b) speeds up as it runs, because of just-in-time compilation.

(c) has a memory footprint that includes the JVM's, typically several Mb.

This means that Java is not an ideal choice for an app that must perform
a brief task, once, as quickly as possible, nor for an app that must
have a minimal memory footprint.

Of course, Perl scripts suffer from Perl's load time and memory
footprint unless they are compiled down to machine code -- but in that
case they aren't really Perl any more.


John Bollinger
email***@***.com

 
 
sp00fd





PostPosted: 2003-12-12 6:03:00 Top

java-programmer >> My app runs too slow (lots of code follows) Michael Borgwardt <email***@***.com> wrote in message news:<bra2u1$118g1$email***@***.com>...
>
> use the read() method that takes a buffer byte array as an argument,
> pay attention to the number of bytes read (the return value) and pass
> the filled buffer portion directly to the MessageDigest. Oh, and
> do this with the inputStream directly, without wrapping it into a
> BufferedInputStream, since you already use a buffer.
>
> This saves you billions of method invocations and two times
> copying your entire data. The resulting code may well be faster than
> the perl code.


Thanks very much for the response and the help. I'm pretty new, so I
just wanted to make sure that the code below is what you had in mind.
Incidentally, this code definitely drops me down to a 6-8 second
execution vs. 10-15. Thanks a bunch! The -Xprof output seems to show
a lot of MD5.transform execution now, but I suppose that's to be
expected, right?


Flat profile of 0.72 secs (36 total ticks): process reaper

Thread-local ticks:
100.0% 36 Unknown: no last frame


Flat profile of 0.49 secs (22 total ticks): process reaper

Thread-local ticks:
100.0% 22 Unknown: no last frame


Flat profile of 0.83 secs (41 total ticks): process reaper

Thread-local ticks:
100.0% 41 Unknown: no last frame


Flat profile of 1.13 secs (56 total ticks): process reaper

Thread-local ticks:
100.0% 56 Unknown: no last frame


Flat profile of 6.69 secs (246 total ticks): main

Interpreted + native Method
5.6% 0 + 4 java.io.UnixFileSystem.getBooleanAttributes0
1.4% 1 + 0 ssm.UnixCrypt.des_set_key
1.4% 0 + 1 ssm.HashManager.setHashProperties
1.4% 1 + 0 java.lang.String.<init>
1.4% 0 + 1
java.text.AttributedCharacterIterator$Attribute.<init>
1.4% 1 + 0 java.nio.ByteBuffer.<init>
1.4% 1 + 0 java.nio.HeapCharBuffer.<init>
1.4% 0 + 1 ssm.FileStat.getGroup
15.3% 4 + 7 Total interpreted

Compiled + native Method
50.0% 36 + 0 sun.security.provider.MD5.transform
2.8% 2 + 0 java.util.regex.Pattern$Single.match
1.4% 0 + 1
java.util.jar.Manifest$FastInputStream.readLine
1.4% 1 + 0 sun.security.provider.MD5.engineUpdate
1.4% 1 + 0 vtable chunks
56.9% 40 + 1 Total compiled

Stub + native Method
1.4% 0 + 1 java.io.FileInputStream.readBytes
1.4% 0 + 1 Total stub

Thread-local ticks:
70.7% 174 Blocked (of total)
8.3% 6 Class loader
4.2% 3 Compilation
13.9% 10 Unknown: no last frame


Global summary of 6.74 seconds:
100.0% 251 Received ticks
0.4% 1 Received GC ticks
0.8% 2 Compilation
0.4% 1 Other VM operations
2.4% 6 Class loader
65.7% 165 Unknown code

md5sum:========================================================
package ssm;

import java.io.*;
import java.security.*;

public class MD5Sum {
MessageDigest md5 = null;
FileInputStream in = null;

private String hexDigit(byte x) {
StringBuffer sb = new StringBuffer();
char c;

c = (char)((x >> 4) & 0x0f);
if(c > 9) {
c = (char)((c - 10) + 'a');
}
else {
c = (char)(c + '0');
}

sb.append(c);

c = (char)(x & 0xf);
if(c > 9) {
c = (char)((c - 10) + 'a');
}
else {
c = (char)(c + '0');
}

sb.append(c);
return sb.toString();
}

public String hash_file(String filename) throws IOException {
byte[] buf = new byte[4096];
int len = 0;

try {
in = new FileInputStream(new File(filename));
md5 = MessageDigest.getInstance("MD5");
while((len = in.read(buf)) >= 0) {
md5.update(buf, 0, len);
}
in.close();
}
catch (NoSuchAlgorithmException nsae) {
nsae.printStackTrace();
}
catch (IOException ioe) {
System.err.println(ioe.toString());
}

if(len == -1) { // we must've read something
byte[] digest = md5.digest();
StringBuffer hexString = new StringBuffer();
int digestLength = digest.length;

for(int i=0; i < digest.length; i++) {
hexString.append(hexDigit(digest[i]));
}

return hexString.toString();
}
else {
return null;
}
}
}
 
 
Michael Borgwardt





PostPosted: 2003-12-12 6:47:00 Top

java-programmer >> My app runs too slow (lots of code follows) Mikey wrote:
> Thanks very much for the response and the help. I'm pretty new, so I
> just wanted to make sure that the code below is what you had in mind.

Yes, that's it.

> Incidentally, this code definitely drops me down to a 6-8 second
> execution vs. 10-15. Thanks a bunch! The -Xprof output seems to show
> a lot of MD5.transform execution now, but I suppose that's to be
> expected, right?

Yes. It may be that sun's MD5 implementation is not the fastest and
the factors John mentioned definitely also play their part.

 
 
Michael Borgwardt





PostPosted: 2003-12-12 16:36:00 Top

java-programmer >> My app runs too slow (lots of code follows) Mikey wrote:

> Thread-local ticks:
> 70.7% 174 Blocked (of total)

BTW, the above seems to indicate that the app spends a lot of time blocked on IO.
This might improve if the buffer size below is increased:

> byte[] buf = new byte[4096];

 
 
sp00fd





PostPosted: 2003-12-13 4:00:00 Top

java-programmer >> My app runs too slow (lots of code follows) Michael Borgwardt <email***@***.com> wrote in message news:<brbukq$1hh19$email***@***.com>...
> Mikey wrote:
>
> > Thread-local ticks:
> > 70.7% 174 Blocked (of total)
>
> BTW, the above seems to indicate that the app spends a lot of time blocked on IO.
> This might improve if the buffer size below is increased:
>
> > byte[] buf = new byte[4096];

I tried it at a few different sizes (8192, 16384), but I really didn't
see much affect. In fact, while not exactly scientific, it looked
like my execution times may have gone up in some cases (although it's
possible that something else on my system could've slowed thigs down).
A few samples of -Xprof output seemed to show that the blocked I/O
remained around 70%. Could another class be causing that?

On a side note, thanks so much for all of your help. You've been
great.
 
 
sp00fd





PostPosted: 2003-12-13 4:54:00 Top

java-programmer >> My app runs too slow (lots of code follows) Michael Borgwardt <email***@***.com> wrote in message news:<brbukq$1hh19$email***@***.com>...
> Mikey wrote:
>
> > Thread-local ticks:
> > 70.7% 174 Blocked (of total)
>
> BTW, the above seems to indicate that the app spends a lot of time blocked on IO.
> This might improve if the buffer size below is increased:
>
> > byte[] buf = new byte[4096];

scratch my last post. It does seem to be down around 5 seconds
consistently now. Sweet! Still though, I have the 70% blocked. How
can I tell where that's coming from?

# timex /usr/local/scripts/ssm/runssm

Flat profile of 5.11 secs (153 total ticks): main

Interpreted + native Method
2.2% 0 + 1 java.io.FileInputStream.readBytes
2.2% 0 + 1 SystemsSecurityMonitor.<clinit>
2.2% 1 + 0 java.lang.Integer.toString
2.2% 0 + 1 sun.text.resources.LocaleElements.getContents
2.2% 1 + 0 java.util.Properties.load
2.2% 0 + 1 java.io.UnixFileSystem.getBooleanAttributes0
13.3% 2 + 4 Total interpreted

Compiled + native Method
53.3% 24 + 0 sun.security.provider.MD5.transform
2.2% 1 + 0 java.util.regex.Matcher.find
2.2% 1 + 0 sun.security.provider.MD5.engineUpdate
57.8% 26 + 0 Total compiled

Thread-local ticks:
70.6% 108 Blocked (of total)
2.2% 1 Class loader
6.7% 3 Compilation
20.0% 9 Unknown: no last frame


Global summary of 5.12 seconds:
100.0% 156 Received ticks
1.3% 2 Compilation
0.6% 1 Other VM operations
0.6% 1 Class loader
5.8% 9 Unknown code

real 5.87
user 2.53
sys 0.85