public class TEITextExtractorHandler
extends org.xml.sax.helpers.DefaultHandler
Only the text between the <text> and </text> tags is extracted. No attempt is made to preserve the original text divisions marked by the XML tags.
| Modifier and Type | Field and Description |
|---|---|
| protected java.lang.StringBuffer | extractedText: Holds the extracted text. |
| protected static boolean | inText: Tracks whether the parser is currently inside the <text> element. |
| Constructor and Description |
|---|
| TEITextExtractorHandler(): Create text extractor handler. |
| Modifier and Type | Method and Description |
|---|---|
| void | characters(char[] ch, int start, int length): Handle character data. |
| void | endElement(java.lang.String uri, java.lang.String localName, java.lang.String qName): Handle end of an element. |
| java.lang.String | getExtractedText(): Return extracted text. |
| void | startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts): Handle start of an XML element. |
protected java.lang.StringBuffer extractedText
protected static boolean inText
public TEITextExtractorHandler()
public void startElement(java.lang.String uri,
java.lang.String localName,
java.lang.String qName,
org.xml.sax.Attributes atts)
throws org.xml.sax.SAXException
Specified by: startElement in interface org.xml.sax.ContentHandler
Overrides: startElement in class org.xml.sax.helpers.DefaultHandler
Parameters:
uri - The XML element's URI.
localName - The XML element's local name.
qName - The XML element's qname.
atts - The XML element's attributes.
Throws:
org.xml.sax.SAXException

public void endElement(java.lang.String uri,
java.lang.String localName,
java.lang.String qName)
throws org.xml.sax.SAXException
Specified by: endElement in interface org.xml.sax.ContentHandler
Overrides: endElement in class org.xml.sax.helpers.DefaultHandler
Parameters:
uri - The XML element's URI.
localName - The XML element's local name.
qName - The XML element's qname.
Throws:
org.xml.sax.SAXException

public void characters(char[] ch,
int start,
int length)
throws org.xml.sax.SAXException
Specified by: characters in interface org.xml.sax.ContentHandler
Overrides: characters in class org.xml.sax.helpers.DefaultHandler
Parameters:
ch - Array of characters.
start - The starting position in the array.
length - The number of characters.
Throws:
org.xml.sax.SAXException - If there is an error.

public java.lang.String getExtractedText()
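
The behavior documented above can be illustrated with a small, self-contained sketch. This is not the source of TEITextExtractorHandler: the class names TeiTextSketch and SketchHandler and the helper extract are hypothetical, and the sketch assumes the handler matches the element by its qualified name "text" with a parser that is not namespace-aware. It also keeps inText as an instance field, whereas the documented class declares it static.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TeiTextSketch {

    /** Hypothetical stand-in for TEITextExtractorHandler (assumed behavior only). */
    static class SketchHandler extends DefaultHandler {
        // Accumulates character data seen inside <text>.
        private final StringBuffer extractedText = new StringBuffer();
        // Instance flag in this sketch; the documented field is static.
        private boolean inText = false;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("text".equals(qName)) {
                inText = true;   // entering <text>
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("text".equals(qName)) {
                inText = false;  // leaving </text>
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver one text run in several calls, so always append.
            if (inText) {
                extractedText.append(ch, start, length);
            }
        }

        public String getExtractedText() {
            return extractedText.toString();
        }
    }

    /** Parses an XML string and returns the text found inside <text>. */
    public static String extract(String xml) throws Exception {
        SketchHandler handler = new SketchHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getExtractedText();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<TEI><teiHeader><title>Ignored</title></teiHeader>"
                   + "<text><body><p>Hello</p><p>world</p></body></text></TEI>";
        System.out.println(extract(xml)); // prints "Helloworld"
    }
}
```

The header title is skipped because it lies outside <text>, and the two paragraphs run together because the tags that separated them are discarded, which is the loss of text divisions noted in the class description.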