OutOfMemoryException With XML DOM  
Jason Cavett





Posted: 2008-2-21 9:20:00

My project uses XML for its data files and I am using a DOM parser
(the one native to the JDK) to parse out the files. DOM is especially
useful because the project lends itself to the use of trees.

Unfortunately, there tends to be a limit as to how big the XML files
can be before the DOM parser starts chewing up memory and, if the file
is big enough, I get an OutOfMemoryError. It's not from the
project specifically - it's instead a result of the enormous amount of
space DOM takes up.

Is there a solution to this? I have read a bit about SAX, and
although it would fix the out-of-memory error, it would make it more
difficult to manage the tree structure. I could also increase the
amount of RAM available to the JRE, but I'd rather do that as a last
resort.

Does anybody have any other suggestions? Thanks.
 
Arne Vajhøj





Posted: 2008-2-21 10:07:00

Jason Cavett wrote:
> Does anybody have any other suggestions?

No.

-Xmx seems like the best way to go.

Arne
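
As a rough illustration of the -Xmx route: the flag sets the maximum heap size when the JVM is launched, for example `java -Xmx512m HeapCheck` (512m is an arbitrary example value, and HeapCheck is a hypothetical class name used only here). A minimal sketch for confirming the limit the running JVM actually received:

```java
// Hypothetical helper class, not part of the original thread.
class HeapCheck {
    // Returns the maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run with e.g. "java -Xmx512m HeapCheck" and compare the reported value.
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

The reported figure can be slightly lower than the value passed to -Xmx, depending on the JVM and garbage collector in use.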
 
Jason Cavett





Posted: 2008-2-21 23:54:00

On Feb 20, 9:07 pm, Arne Vajhøj <email***@***.com> wrote:
> > Does anybody have any other suggestions?
>
> No.
>
> -Xmx seems as the best way to go.
>
> Arne

Haha. Alright. I was sort of hoping that wasn't the solution, but if
that's what has to be done, that's what I'll do.

Thanks.
 
 
Stanimir Stamenkov





Posted: 2008-2-22 7:03:00

Wed, 20 Feb 2008 17:19:32 -0800 (PST), /Jason Cavett/:

> Does anybody have any other suggestions? Thanks.

A great deal of the DOM is usually taken up by whitespace in element
content (used only to format the source XML text). Depending on the
parser implementation, you could supply a DTD to make the parser
ignore [1] the whitespace in element content, or use custom
filtering [2] as provided by the DOM Level 3 Load and Save APIs, an
implementation of which is part of the standard Java 1.5 platform.
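
A minimal sketch of both options, assuming a made-up <library>/<book> document and a hypothetical class name WhitespaceTrimDemo. The DTD variant uses an internal DTD subset and turns validation on, since the Javadoc for setIgnoringElementContentWhitespace ties the setting to validating mode. The filter variant rejects whitespace-only text nodes through an LSParserFilter. As written, the filter also drops whitespace-only text in mixed content, which may not suit every document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSParserFilter;
import org.w3c.dom.traversal.NodeFilter;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the sample data below is invented for illustration.
class WhitespaceTrimDemo {
    static final String BODY =
            "<library>\n  <book>A</book>\n  <book>B</book>\n</library>";
    // The DTD declares element-only content for <library>, so the
    // indentation between <book> elements is whitespace in element content.
    static final String WITH_DTD =
            "<!DOCTYPE library [\n<!ELEMENT library (book*)>\n"
            + "<!ELEMENT book (#PCDATA)>\n]>\n" + BODY;

    // Option [1]: let the parser drop whitespace in element content via the DTD.
    static Document parseWithDtd(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // assumption: follow the Javadoc requirement
            factory.setIgnoringElementContentWhitespace(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler ignores warnings and recoverable errors, throws on fatal ones.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Option [2]: filter whitespace-only text nodes while the DOM is being built.
    static Document parseWithFilter(String xml) {
        try {
            DOMImplementationLS ls = (DOMImplementationLS)
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
            LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
            parser.setFilter(new LSParserFilter() {
                public short startElement(Element element) {
                    return FILTER_ACCEPT; // never prune whole elements
                }
                public short acceptNode(Node node) {
                    if (node.getNodeType() != Node.TEXT_NODE) {
                        return FILTER_ACCEPT;
                    }
                    boolean blank = node.getNodeValue().trim().length() == 0;
                    return blank ? FILTER_REJECT : FILTER_ACCEPT;
                }
                public int getWhatToShow() {
                    return NodeFilter.SHOW_TEXT; // only text nodes are passed to acceptNode
                }
            });
            LSInput input = ls.createLSInput();
            input.setCharacterStream(new StringReader(xml));
            return parser.parse(input);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the number of child nodes left under the root element for each option.
        System.out.println(parseWithDtd(WITH_DTD).getDocumentElement().getChildNodes().getLength());
        System.out.println(parseWithFilter(BODY).getDocumentElement().getChildNodes().getLength());
    }
}
```

In both cases only the two <book> elements should remain under the root, instead of the elements interleaved with indentation-only text nodes.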

The Xerces2 implementation (a modified version of which is part of
Sun's Java 1.5 distribution) is capable of ignoring whitespace in
element content when a suitable DTD is provided, even in
non-validating mode. For documents which don't have a DOCTYPE
declaration, one could supply a DTD by setting an EntityResolver2 [3]
instance (see the getExternalSubset() method) on the DocumentBuilder [4].

All of the above is also available to Java 1.4 users by adding the
latest Xerces2 jars to the classpath.

[1]
<http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setIgnoringElementContentWhitespace(boolean)>
[2]
<http://java.sun.com/j2se/1.5.0/docs/api/org/w3c/dom/ls/LSParserFilter.html>
[3]
<http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/ext/EntityResolver2.html>
[4]
<http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilder.html#setEntityResolver(org.xml.sax.EntityResolver)>

--
Stanimir