Getting HTML title using HTMLEditorKit.ParserCallback  
Bill Tschumy
Posted: 2004-4-22 21:58:00

I am parsing an HTML file using ParserDelegator and a ParserCallback. I am
trying to get the document title and the HREF links. The ParserCallback is
successfully getting the HREF, so I know it is basically working. However,
when I try to get the title, I always get back null. Here is the relevant
code of the ParserCallback subclass. Does anyone have any clue as to what I'm
doing wrong?

    public void handleStartTag(HTML.Tag tag,
                               MutableAttributeSet attrSet, int pos)
    {
        if (tag == HTML.Tag.TITLE)
        {
            urlTitle = (String)attrSet.getAttribute(HTML.Attribute.TITLE);
            System.out.println("attrSet: " + attrSet); // prints ""
            System.out.println("found title: " + urlTitle); // prints null
        }
        if (tag == HTML.Tag.A)
        {
            // This successfully gets the target URL
            String targetURLStr =
                (String)attrSet.getAttribute(HTML.Attribute.HREF);
        }
    }

--
Bill Tschumy
Otherwise -- Austin, TX
http://www.otherwise.com

 
mromarkhan
Posted: 2004-4-23 5:29:00

Peace be unto you.

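Your code asks the TITLE start tag for a title attribute, which is not
the same thing as the text between <title> and </title>. It might work
if the test case were
<title title="Fun with markers">Crayon</title>
in which case you would get "Fun with markers" back, not "Crayon".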

"Unlike the TITLE element, which provides information about an entire
document and may only appear once, the title attribute may annotate any
number of elements."
- http://www.w3.org/TR/html401/struct/global.html#h-7.4.2
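
If what you want is the text of the TITLE element itself, one approach that should work is to remember when you are inside <title> and collect whatever arrives in handleText until the matching end tag. Below is a minimal sketch of that idea, not a drop-in fix for your class. The class name TitleAndLinks and its parse, getTitle and getLinks methods are made up for illustration; only the ParserCallback and ParserDelegator calls come from Swing.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical callback: collects the <title> text and all <a href> values.
class TitleAndLinks extends HTMLEditorKit.ParserCallback {
    private boolean inTitle = false;                  // true between <title> and </title>
    private final StringBuilder title = new StringBuilder();
    private final List<String> links = new ArrayList<>();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrSet, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = true;
        } else if (tag == HTML.Tag.A) {
            // HREF really is an attribute of the tag, so it is read here.
            Object href = attrSet.getAttribute(HTML.Attribute.HREF);
            if (href != null) {
                links.add(href.toString());
            }
        }
    }

    @Override
    public void handleText(char[] data, int pos) {
        // The title is element content, so it is delivered as text.
        if (inTitle) {
            title.append(data);
        }
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int pos) {
        if (tag == HTML.Tag.TITLE) {
            inTitle = false;
        }
    }

    String getTitle() {
        return title.toString();
    }

    List<String> getLinks() {
        return links;
    }

    // Hypothetical helper: parses an HTML string and returns the filled callback.
    static TitleAndLinks parse(String html) {
        TitleAndLinks callback = new TitleAndLinks();
        try (StringReader reader = new StringReader(html)) {
            // Third argument true means "ignore any charset declared in the document".
            new ParserDelegator().parse(reader, callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return callback;
    }
}
```

For example, TitleAndLinks.parse(html).getTitle() should give you the document title, and getLinks() the HREF values in document order. The flag is used instead of reading attributes because the TITLE start tag normally carries no attributes at all, which matches the empty attrSet you are seeing.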

Have a good day.