Groovy Documentation

org.jdesktop.dom
[Java] Class SimpleHtmlDocumentBuilder

java.lang.Object
  javax.xml.parsers.DocumentBuilder
      org.jdesktop.dom.SimpleHtmlDocumentBuilder

public class SimpleHtmlDocumentBuilder
extends DocumentBuilder

An HTML DOM DocumentBuilder implementation that does not require the factory pattern for creation. Most of the time calling one of the static simpleParse methods is all that is required.

This implementation requires a normal DOM parser. It is not suitable for parsing arbitrary HTML documents, even those documents which conform to the various HTML specifications. Rather, it requires a preprocessor to first clean up the HTML so that it can be parsed into a DOM.
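The restriction above can be illustrated with the standard JAXP DocumentBuilder, which is assumed here to behave like the "normal DOM parser" this class relies on. The sketch below does not use SimpleHtmlDocumentBuilder itself; the class name StrictParseSketch and the helper parsesAsXml are hypothetical names chosen for the illustration. It shows that well-formed XHTML is accepted, while HTML with an unclosed br element is rejected.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of org.jdesktop.dom.
class StrictParseSketch {

    /** Returns true if the markup is accepted by a standard JAXP DOM parser. */
    static boolean parsesAsXml(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // The markup is not well-formed XML.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Well-formed XHTML: every element is closed.
        System.out.println(parsesAsXml("<html><body><p>Hello</p><br/></body></html>"));
        // Common HTML style: <br> is never closed, so an XML parser rejects it.
        System.out.println(parsesAsXml("<html><body><p>Hello<br></body></html>"));
    }
}
```

Input of the second kind is what a cleanup preprocessor would need to repair before it reaches the parser.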

Authors:
rbair


Field Summary
private static SimpleHtmlDocumentBuilder INSTANCE

private SAXParserFactory factory

 
Constructor Summary
SimpleHtmlDocumentBuilder()

Create a new SimpleHtmlDocumentBuilder.

 
Method Summary
org.w3c.dom.DOMImplementation getDOMImplementation()

@inheritDoc

private static SimpleHtmlDocumentBuilder getInstance()

Schema getSchema()

@inheritDoc

boolean isNamespaceAware()

@inheritDoc

boolean isValidating()

@inheritDoc

boolean isXIncludeAware()

SimpleHtmlDocument newDocument()

@inheritDoc

org.w3c.dom.html.HTMLDocument newPlainDocument()

Returns:
an unenclosed Document.

SimpleHtmlDocument parse(InputSource is)

@inheritDoc

SimpleHtmlDocument parse(InputStream is)

@inheritDoc

SimpleHtmlDocument parse(InputStream is, String systemId)

@inheritDoc

SimpleHtmlDocument parse(String uri)

@inheritDoc

SimpleHtmlDocument parse(File f)

@inheritDoc

SimpleHtmlDocument parseString(String html)

void reset()

@inheritDoc

void setEntityResolver(EntityResolver er)

@inheritDoc

void setErrorHandler(ErrorHandler eh)

@inheritDoc

static SimpleHtmlDocument simpleParse(InputSource is)

static SimpleHtmlDocument simpleParse(InputStream in)

static SimpleHtmlDocument simpleParse(URL url)

static SimpleHtmlDocument simpleParse(String xml)

 
Methods inherited from class DocumentBuilder
reset, parse, parse, parse, parse, parse, setErrorHandler, setEntityResolver, isNamespaceAware, isValidating, newDocument, getDOMImplementation, getSchema, isXIncludeAware
 
Methods inherited from class Object
wait, wait, wait, equals, toString, hashCode, getClass, notify, notifyAll
 

Field Detail

INSTANCE

private static SimpleHtmlDocumentBuilder INSTANCE


factory

private SAXParserFactory factory


 
Constructor Detail

SimpleHtmlDocumentBuilder

public SimpleHtmlDocumentBuilder()
Create a new SimpleHtmlDocumentBuilder. SimpleHtmlDocumentBuilder will delegate parsing to the default DocumentBuilder constructed via the default DocumentBuilderFactory.


 
Method Detail

getDOMImplementation

public org.w3c.dom.DOMImplementation getDOMImplementation()
inheritDoc:


getInstance

private static SimpleHtmlDocumentBuilder getInstance()


getSchema

public Schema getSchema()
inheritDoc:


isNamespaceAware

public boolean isNamespaceAware()
inheritDoc:


isValidating

public boolean isValidating()
inheritDoc:


isXIncludeAware

public boolean isXIncludeAware()


newDocument

public SimpleHtmlDocument newDocument()
inheritDoc:


newPlainDocument

org.w3c.dom.html.HTMLDocument newPlainDocument()
Returns:
an unenclosed Document. This is used only by the SimpleDocument no-arg constructor.


parse

public SimpleHtmlDocument parse(InputSource is)
inheritDoc:


parse

public SimpleHtmlDocument parse(InputStream is)
inheritDoc:


parse

public SimpleHtmlDocument parse(InputStream is, String systemId)
inheritDoc:


parse

public SimpleHtmlDocument parse(String uri)
inheritDoc:


parse

public SimpleHtmlDocument parse(File f)
inheritDoc:


parseString

public SimpleHtmlDocument parseString(String html)

Parse the content of the given String as an XML document and return a new HTML DOM SimpleHtmlDocument object. An IllegalArgumentException is thrown if the String is null.

NOTE: this implementation requires a normal DOM parser. It is not suitable for parsing arbitrary HTML documents, even those documents which conform to the various HTML specifications. Rather, it requires a preprocessor to first clean up the HTML so that it can be parsed into a DOM.

throws:
IOException If any IO errors occur.
throws:
SAXException If any parse errors occur.
throws:
IllegalArgumentException When html is null
Parameters:
html - String containing the content to be parsed. Must be valid XHTML
Returns:
SimpleHtmlDocument result of parsing the String
See Also:
DocumentHandler
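The contract described for parseString (reject null, read the String as XML, report IO and parse errors) can be sketched with the standard JAXP API. This is only an approximation under stated assumptions: it returns a plain org.w3c.dom.Document rather than a SimpleHtmlDocument, and the class name StringParseSketch and method parseXhtml are hypothetical, not part of this library.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of org.jdesktop.dom.
class StringParseSketch {

    /** Parses well-formed XHTML held in a String into a DOM Document. */
    static Document parseXhtml(String html) throws IOException, SAXException {
        if (html == null) {
            throw new IllegalArgumentException("html must not be null");
        }
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Wrap the String in a character stream so no encoding guess is needed.
            return builder.parse(new InputSource(new StringReader(html)));
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable JAXP parser", e);
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseXhtml(
                "<html><head><title>Demo</title></head><body><p>Hi</p></body></html>");
        // Look up the title element and print its text.
        System.out.println(doc.getElementsByTagName("title").item(0).getTextContent());
    }
}
```

Wrapping the String in a StringReader avoids any byte-encoding step, which is one reason a String overload is convenient next to the InputStream ones.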


reset

public void reset()
inheritDoc:


setEntityResolver

public void setEntityResolver(EntityResolver er)
inheritDoc:


setErrorHandler

public void setErrorHandler(ErrorHandler eh)
inheritDoc:


simpleParse

public static SimpleHtmlDocument simpleParse(InputSource is)

Parse the content of the given input source as an XML document and return a new HTML DOM SimpleHtmlDocument object. An IllegalArgumentException is thrown if the InputSource is null.

NOTE: this implementation requires a normal DOM parser. It is not suitable for parsing arbitrary HTML documents, even those documents which conform to the various HTML specifications. Rather, it requires a preprocessor to first clean up the HTML so that it can be parsed into a DOM.

throws:
IOException If any IO errors occur.
throws:
SAXException If any parse errors occur.
throws:
IllegalArgumentException When the InputSource (is) is null
Parameters:
is - InputSource containing the content to be parsed.
Returns:
A new DOM SimpleHtmlDocument object.
See Also:
DocumentHandler


simpleParse

public static SimpleHtmlDocument simpleParse(InputStream in)


simpleParse

public static SimpleHtmlDocument simpleParse(URL url)


simpleParse

public static SimpleHtmlDocument simpleParse(String xml)

Parse the content of the given String as an XML document and return a new HTML DOM SimpleHtmlDocument object. An IllegalArgumentException is thrown if the String is null.

NOTE: this implementation requires a normal DOM parser. It is not suitable for parsing arbitrary HTML documents, even those documents which conform to the various HTML specifications. Rather, it requires a preprocessor to first clean up the HTML so that it can be parsed into a DOM.

throws:
IOException If any IO errors occur.
throws:
SAXException If any parse errors occur.
throws:
IllegalArgumentException When xml is null
Parameters:
xml - String containing the content to be parsed.
Returns:
SimpleHtmlDocument result of parsing the String
See Also:
DocumentHandler


 
