java.lang.Object
  org.xml.sax.helpers.DefaultHandler
    edu.northwestern.at.utils.xml.TEITextExtractorHandler
public class TEITextExtractorHandler
extends org.xml.sax.helpers.DefaultHandler
SAX event handler to extract text from a TEI XML file.
Only the text between the <text> and </text> tags is extracted. The structural divisions of the original text marked by the XML tags are not preserved.
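The behavior described above can be illustrated with a minimal sketch. It is not the source of TEITextExtractorHandler. The class names TeiTextSketch and SketchHandler and the helper extract are hypothetical, and the sketch assumes the <text> element is matched by its qName under a parser that is not namespace-aware.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class TeiTextSketch {

    /** Hypothetical stand-in for TEITextExtractorHandler (assumption, not the real class). */
    static class SketchHandler extends DefaultHandler {
        private final StringBuilder extractedText = new StringBuilder();
        private boolean inText = false;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // Assumption: the element is identified by its qualified name.
            if ("text".equals(qName)) {
                inText = true;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("text".equals(qName)) {
                inText = false;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver one text node in several chunks, so append each chunk.
            if (inText) {
                extractedText.append(ch, start, length);
            }
        }

        String getExtractedText() {
            return extractedText.toString();
        }
    }

    /** Parses an XML string and returns the character data found inside <text>. */
    static String extract(String xml) {
        try {
            SketchHandler handler = new SketchHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getExtractedText();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String tei = "<TEI><teiHeader><title>Header title</title></teiHeader>"
                   + "<text><body><p>First paragraph.</p><p>Second paragraph.</p></body></text></TEI>";
        System.out.println(extract(tei));
    }
}
```

In this sketch the header title is skipped and the two paragraphs are concatenated with no separator, which reflects the note that text divisions are not preserved. The real class would be used the same way: pass an instance to a SAX parser, then call getExtractedText().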
| Field Summary | |
|---|---|
| protected java.lang.StringBuffer | extractedText - Holds the extracted text. |
| protected static boolean | inText - Tracks whether the parser is inside the <text> element. |
| Constructor Summary | |
|---|---|
| TEITextExtractorHandler() | Create text extractor handler. |
| Method Summary | |
|---|---|
| void | characters(char[] ch, int start, int length) - Handle character data. |
| void | endElement(java.lang.String uri, java.lang.String localName, java.lang.String qName) - Handle end of an element. |
| java.lang.String | getExtractedText() - Return extracted text. |
| void | startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts) - Handle start of an XML element. |
| Methods inherited from class org.xml.sax.helpers.DefaultHandler |
|---|
| endDocument, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping, unparsedEntityDecl, warning |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected java.lang.StringBuffer extractedText
protected static boolean inText
| Constructor Detail |
|---|
public TEITextExtractorHandler()
| Method Detail |
|---|
public void startElement(java.lang.String uri,
                         java.lang.String localName,
                         java.lang.String qName,
                         org.xml.sax.Attributes atts)
                  throws org.xml.sax.SAXException

Specified by: startElement in interface org.xml.sax.ContentHandler
Overrides: startElement in class org.xml.sax.helpers.DefaultHandler
Parameters:
  uri - The XML element's URI.
  localName - The XML element's local name.
  qName - The XML element's qname.
  atts - The XML element's attributes.
Throws:
  org.xml.sax.SAXException
public void endElement(java.lang.String uri,
                       java.lang.String localName,
                       java.lang.String qName)
                throws org.xml.sax.SAXException

Specified by: endElement in interface org.xml.sax.ContentHandler
Overrides: endElement in class org.xml.sax.helpers.DefaultHandler
Parameters:
  uri - The XML element's URI.
  localName - The XML element's local name.
  qName - The XML element's qname.
Throws:
  org.xml.sax.SAXException
public void characters(char[] ch,
                       int start,
                       int length)
                throws org.xml.sax.SAXException

Specified by: characters in interface org.xml.sax.ContentHandler
Overrides: characters in class org.xml.sax.helpers.DefaultHandler
Parameters:
  ch - Array of characters.
  start - The starting position in the array.
  length - The number of characters.
Throws:
  org.xml.sax.SAXException - If there is an error.

public java.lang.String getExtractedText()