edu.northwestern.at.morphadorner.tools
Class ExtendedAdornedWord

java.lang.Object
  extended by edu.northwestern.at.utils.corpuslinguistics.adornedword.BaseAdornedWord
      extended by edu.northwestern.at.morphadorner.tools.ExtendedAdornedWord
All Implemented Interfaces:
AdornedWord, HasID, XCloneable, java.io.Externalizable, java.io.Serializable, java.lang.Cloneable, java.lang.Comparable

public class ExtendedAdornedWord
extends BaseAdornedWord
implements AdornedWord, HasID, java.io.Externalizable

Information about a single XML word element.

An ExtendedAdornedWord object extends the BaseAdornedWord object with the following additional information about a single word spelling externalized as an XML "<w>" element.

  1. The permanent word ID.
  2. The end of sentence flag (1 if word ends a sentence, 0 otherwise)
  3. The partial original token, if this is part of a word.
  4. The part flag. "N" for a word which is not split; "I" for the first part of a split word; "M" for the middle parts of a split word; and "F" for the final part of a split word.
  5. The word ordinal within the text (starts at 1)
  6. FrontMiddleBack.FRONT, FrontMiddleBack.MIDDLE or FrontMiddleBack.BACK indicating if the word appears in front matter, middle matter (e.g., main text body), or back matter, respectively.
  7. MainSide.MAIN or MainSide.SIDE indicating if the word appears in the main text or side text, respectively.
  8. Nearest ancestor tag which is not a soft tag.
  9. Parent tag.
  10. A flag which indicates if the word appears in spoken text. True if the word is spoken, false otherwise.
  11. A link to the previous full word, in reading sequence.
  12. A link to the next full word, in reading sequence.
  13. A link to the previous word part, in reading sequence, for a split word.
  14. A link to the next word part, in reading sequence, for a split word.
  15. An XML path to the word.
  16. The sentence number to which this word belongs. Sentences start at 1.
  17. The number of the word in the sentence. Starts at 1.
  18. Boolean flag which indicates if word corresponds to gap.
  19. Boolean flag which indicates if word is descendant of a jump tag.
  20. Page number (starting at 0) on which the word appears. Page numbers come from counting <pb> elements. Page numbers ignore the "n=" attributes, if any, present on the <pb> element.
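
The page numbering rule in item 20 can be sketched with a small SAX handler. This is an illustration only, not MorphAdorner code: the class PageCounterSketch and its members are hypothetical, and the element names "pb" (page break) and "w" (word) are assumed TEI-style names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: assigns a zero-based page number to each word
// by counting page break elements and ignoring their "n=" attributes.
class PageCounterSketch extends DefaultHandler {
    private int pageNumber = 0;                       // pages start at 0
    final List<Integer> wordPages = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        if ("pb".equals(qName)) {
            pageNumber++;                             // "n=" is never consulted
        } else if ("w".equals(qName)) {
            wordPages.add(pageNumber);                // page of this word
        }
    }

    /** Parse an XML string and return the page number of each word in order. */
    static List<Integer> pagesFor(String xml) {
        try {
            PageCounterSketch handler = new PageCounterSketch();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.wordPages;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<text><w>a</w><pb n=\"12\"/><w>b</w>"
                   + "<pb n=\"13\"/><w>c</w></text>";
        System.out.println(pagesFor(xml));            // prints [0, 1, 2]
    }
}
```

Words before the first page break land on page 0, which matches the "starting at 0" wording above.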

The following fields are inherited from the BaseAdornedWord object.

  1. The complete original token.
  2. The corrected original spelling.
  3. The standard spelling.
  4. The lemma.
  5. The part of speech.
  6. The token type.

See Also:
Serialized Form

Nested Class Summary
static class ExtendedAdornedWord.FrontMiddleBack
          Front, middle, or back of text.
static class ExtendedAdornedWord.MainSide
          Main or side text (paratext).
 
Field Summary
protected  boolean eos
          End of sentence flag.
protected  ExtendedAdornedWord.FrontMiddleBack frontMiddleBack
          Front/middle/back text marker.
protected  byte[] id
          Word ID.
protected  boolean inJumpTag
          Jump tag flag.
protected  boolean isGap
          True if word corresponds to a <gap> element, false otherwise.
protected  boolean isSpoken
          Spoken word flag.
protected  boolean isVerse
          Verse flag.
protected  ExtendedAdornedWord.MainSide mainSide
          Main/side text marker.
protected  ExtendedAdornedWord nextWord
          Next word.
protected  ExtendedAdornedWord nextWordPart
          Next word part for this word.
protected  int ord
          Word ordinal.
protected  int pageNumber
          Page number on which word appears.
protected  java.lang.String part
          Word part flag.
protected  byte[] path
          XML word path.
protected  ExtendedAdornedWord previousWord
          Previous word.
protected  ExtendedAdornedWord previousWordPart
          Previous word part for this word.
protected  int sentenceNumber
          Sentence number.
protected static long serialVersionUID
          Serial version UID.
protected  int wordIndex
          Word index in list of words.
protected  int wordNumber
          Word number within sentence.
protected  java.lang.String wordText
          Original, possibly partial, word text.
 
Fields inherited from class edu.northwestern.at.utils.corpuslinguistics.adornedword.BaseAdornedWord
lemmata, partsOfSpeech, spelling, standardSpelling, token, tokenType
 
Constructor Summary
ExtendedAdornedWord()
          Create empty ExtendedAdornedWord object.
ExtendedAdornedWord(AdornedWord adornedWord, java.lang.String id, java.lang.String part, int ord, int pageNumber, boolean eos, int wordNumber, int sentenceNumber, ExtendedAdornedWord.FrontMiddleBack frontMiddleBack, ExtendedAdornedWord.MainSide mainSide, java.lang.String tagPath, boolean isSpoken, boolean isVerse, ExtendedAdornedWord previousWord, ExtendedAdornedWord previousWordPart)
          Create ExtendedAdornedWord object.
ExtendedAdornedWord(java.lang.String wordText, org.xml.sax.Attributes atts, ExtendedAdornedWord.FrontMiddleBack frontMiddleBack, ExtendedAdornedWord.MainSide mainSide, java.lang.String tagPath, int pageNumber, boolean isSpoken, boolean isVerse, boolean inJumpTag, ExtendedAdornedWord previousWord, ExtendedAdornedWord previousWordPart)
          Create ExtendedAdornedWord object.
 
Method Summary
 void appendWordText(char[] ch, int start, int length)
          Append characters to word text.
 boolean getEOS()
          Get end of sentence flag.
 ExtendedAdornedWord.FrontMiddleBack getFrontMiddleBack()
          Get front/middle/back.
 boolean getGap()
          Get gap flag.
 java.lang.String getID()
          Get word ID.
 boolean getInJumpTag()
          Get in jump tag flag.
 ExtendedAdornedWord.MainSide getMainSide()
          Get main or side.
 ExtendedAdornedWord getNextWord()
          Get next word.
 ExtendedAdornedWord getNextWordPart()
          Get next word part.
 int getOrd()
          Get word ordinal.
 int getPageNumber()
          Get page number on which word appears.
 java.lang.String getPart()
          Get part flag.
 java.lang.String getPath()
          Get path.
 ExtendedAdornedWord getPreviousWord()
          Get previous word.
 ExtendedAdornedWord getPreviousWordPart()
          Get previous word part.
 int getSentenceNumber()
          Get sentence number for word.
 boolean getSpoken()
          Get spoken flag.
 boolean getVerse()
          Get verse flag.
 int getWordIndex()
          Get word index.
 int getWordNumber()
          Get word number in sentence.
 java.lang.String getWordText()
          Get word text.
 boolean isFirstPart()
          Check if word is first (or only) part of a split word.
 boolean isLastPart()
          Check if word is last (or only) part of a split word.
 boolean isMiddlePart()
          Check if word is middle (or only) part of a split word.
 boolean isSplitWord()
          Check if word is a split word.
 void readExternal(java.io.ObjectInput in)
          Reads the word from an object input stream (deserializes the object).
 void setEOS(boolean eos)
          Set end of sentence flag.
 void setGap(boolean isGap)
          Set gap flag.
 void setID(java.lang.String id)
          Set word ID.
 void setInJumpTag(boolean inJumpTag)
          Set in jump tag flag.
 void setNextWord(ExtendedAdornedWord nextWord)
          Set next word.
 void setNextWordPart(ExtendedAdornedWord nextWordPart)
          Set next word part.
 void setOrd(int ord)
          Set word ordinal.
 void setPageNumber(int pageNumber)
          Set page number on which word appears.
 void setPreviousWord(ExtendedAdornedWord previousWord)
          Set previous word.
 void setPreviousWordPart(ExtendedAdornedWord previousWordPart)
          Set previous word part.
 void setSentenceNumber(int sentenceNumber)
          Set sentence number for word.
 void setSpoken(boolean isSpoken)
          Set spoken flag.
 void setVerse(boolean isVerse)
          Set verse flag.
 void setWordIndex(int wordIndex)
          Set word index.
 void setWordNumber(int wordNumber)
          Set word number in sentence.
 java.lang.String toString()
          Return spelling for toString.
 void writeExternal(java.io.ObjectOutput out)
          Writes the word to an object output stream (serializes the object).
 
Methods inherited from class edu.northwestern.at.utils.corpuslinguistics.adornedword.BaseAdornedWord
clone, compareTo, equals, getLemmata, getPartsOfSpeech, getSpelling, getStandardSpelling, getToken, getTokenType, hashCode, setLemmata, setPartsOfSpeech, setSpelling, setStandardSpelling, setToken, setTokenType
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface edu.northwestern.at.utils.corpuslinguistics.adornedword.AdornedWord
getLemmata, getPartsOfSpeech, getSpelling, getStandardSpelling, getToken, getTokenType, setLemmata, setPartsOfSpeech, setSpelling, setStandardSpelling, setToken, setTokenType
 

Field Detail

id

protected byte[] id
Word ID.


wordText

protected java.lang.String wordText
Original, possibly partial, word text.


part

protected java.lang.String part
Word part flag.

N for an unsplit word. I for the first part of a split word. M for the middle part(s) of a split word. F for the final part of a split word.
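
The four flag values can be read as a small classification. The sketch below is not the MorphAdorner source; PartFlagSketch and its methods are hypothetical names. Treating "N" as both first and last follows the "first (or only)" and "last (or only)" wording used for isFirstPart and isLastPart on this page.

```java
// Hypothetical stand-in for the part flag checks; not the library implementation.
final class PartFlagSketch {
    private PartFlagSketch() {}

    /** True for any piece of a split word ("I", "M" or "F"). */
    static boolean isSplit(String part) {
        return "I".equals(part) || "M".equals(part) || "F".equals(part);
    }

    /** True for the first part of a split word, or for an unsplit word. */
    static boolean isFirst(String part) {
        return "I".equals(part) || "N".equals(part);
    }

    /** True for the final part of a split word, or for an unsplit word. */
    static boolean isLast(String part) {
        return "F".equals(part) || "N".equals(part);
    }

    public static void main(String[] args) {
        // Print the classification of each flag value.
        for (String part : new String[] {"N", "I", "M", "F"}) {
            System.out.println(part + ": split=" + isSplit(part)
                + " first=" + isFirst(part) + " last=" + isLast(part));
        }
    }
}
```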


eos

protected boolean eos
End of sentence flag.


ord

protected int ord
Word ordinal.


mainSide

protected ExtendedAdornedWord.MainSide mainSide
Main/side text marker.


isSpoken

protected boolean isSpoken
Spoken word flag. True if word is spoken.


isVerse

protected boolean isVerse
Verse flag. True if word appears in verse as opposed to prose.


inJumpTag

protected boolean inJumpTag
Jump tag flag. True if word appears as descendant of jump tag.


frontMiddleBack

protected ExtendedAdornedWord.FrontMiddleBack frontMiddleBack
Front/middle/back text marker.


previousWord

protected ExtendedAdornedWord previousWord
Previous word.


nextWord

protected ExtendedAdornedWord nextWord
Next word.


previousWordPart

protected ExtendedAdornedWord previousWordPart
Previous word part for this word.


nextWordPart

protected ExtendedAdornedWord nextWordPart
Next word part for this word.


path

protected byte[] path
XML word path.


sentenceNumber

protected int sentenceNumber
Sentence number.


wordNumber

protected int wordNumber
Word number within sentence.


isGap

protected boolean isGap
True if word corresponds to a <gap> element, false otherwise.


wordIndex

protected int wordIndex
Word index in list of words.


pageNumber

protected int pageNumber
Page number on which word appears.


serialVersionUID

protected static final long serialVersionUID
Serial version UID.

See Also:
Constant Field Values
Constructor Detail

ExtendedAdornedWord

public ExtendedAdornedWord()
Create empty ExtendedAdornedWord object.


ExtendedAdornedWord

public ExtendedAdornedWord(java.lang.String wordText,
                           org.xml.sax.Attributes atts,
                           ExtendedAdornedWord.FrontMiddleBack frontMiddleBack,
                           ExtendedAdornedWord.MainSide mainSide,
                           java.lang.String tagPath,
                           int pageNumber,
                           boolean isSpoken,
                           boolean isVerse,
                           boolean inJumpTag,
                           ExtendedAdornedWord previousWord,
                           ExtendedAdornedWord previousWordPart)
Create ExtendedAdornedWord object.

Parameters:
wordText - tag text.
atts - XML attributes for a single "<w>" element.
frontMiddleBack - Front/middle/back classification.
mainSide - Main/side classification.
tagPath - XML path to this tag.
pageNumber - Page number.
isSpoken - Is spoken word flag.
isVerse - Is verse flag.
inJumpTag - In jump tag flag.
previousWord - Previous adorned word.
previousWordPart - Previous part for this word.

ExtendedAdornedWord

public ExtendedAdornedWord(AdornedWord adornedWord,
                           java.lang.String id,
                           java.lang.String part,
                           int ord,
                           int pageNumber,
                           boolean eos,
                           int wordNumber,
                           int sentenceNumber,
                           ExtendedAdornedWord.FrontMiddleBack frontMiddleBack,
                           ExtendedAdornedWord.MainSide mainSide,
                           java.lang.String tagPath,
                           boolean isSpoken,
                           boolean isVerse,
                           ExtendedAdornedWord previousWord,
                           ExtendedAdornedWord previousWordPart)
Create ExtendedAdornedWord object.

Parameters:
adornedWord - Populated adorned word.
id - Word ID.
part - String word part flag.
ord - Word ordinal.
pageNumber - Page number.
eos - EOS flag.
wordNumber - Word number in sentence.
sentenceNumber - Sentence number.
frontMiddleBack - Front/middle/back classification.
mainSide - Main/side classification.
tagPath - XML path to this tag.
isSpoken - Is spoken word flag.
isVerse - Is verse flag.
previousWord - Previous adorned word.
previousWordPart - Previous part for this word.
Method Detail

getID

public java.lang.String getID()
Get word ID.

Specified by:
getID in interface HasID
Returns:
The word ID.

setID

public void setID(java.lang.String id)
Set word ID.

Parameters:
id - The word ID.

getPart

public java.lang.String getPart()
Get part flag.

Returns:
The part flag.

getPath

public java.lang.String getPath()
Get path.

Returns:
The path.
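
The id and path fields are declared as byte[], while getID and getPath return java.lang.String. One way such a pairing could work is to hold encoded bytes and decode them on access. The sketch below assumes UTF-8 and uses a hypothetical CompactStringField class and a made-up path value; this page does not state which encoding, if any, ExtendedAdornedWord uses.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: a string value stored as bytes and decoded on demand.
// The UTF-8 choice is an assumption, not documented behavior.
final class CompactStringField {
    private byte[] bytes = new byte[0];

    /** Store the value as UTF-8 bytes; null is stored as an empty value. */
    void set(String value) {
        bytes = (value == null)
            ? new byte[0]
            : value.getBytes(StandardCharsets.UTF_8);
    }

    /** Decode the stored bytes back into a String. */
    String get() {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        CompactStringField path = new CompactStringField();
        path.set("/TEI/text/body/p[1]/w[3]");   // made-up example path
        System.out.println(path.get());
    }
}
```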

isFirstPart

public boolean isFirstPart()
Check if word is first (or only) part of a split word.

Returns:
True if word is first part of a split word or only part of a non-split word.

isMiddlePart

public boolean isMiddlePart()
Check if word is middle (or only) part of a split word.

Returns:
True if word is middle part of a split word or only part of a non-split word.

isLastPart

public boolean isLastPart()
Check if word is last (or only) part of a split word.

Returns:
True if word is last part of a split word or only part of a non-split word.

isSplitWord

public boolean isSplitWord()
Check if word is a split word.

Returns:
True if word is a split word.

getOrd

public int getOrd()
Get word ordinal.

Returns:
The word ordinal.

setOrd

public void setOrd(int ord)
Set word ordinal.

Parameters:
ord - The word ordinal.

getSentenceNumber

public int getSentenceNumber()
Get sentence number for word.

Returns:
The sentence number in which this word appears.

setSentenceNumber

public void setSentenceNumber(int sentenceNumber)
Set sentence number for word.

Parameters:
sentenceNumber - The sentence number in which this word appears.

getWordNumber

public int getWordNumber()
Get word number in sentence.

Returns:
The word number in the sentence.

setWordNumber

public void setWordNumber(int wordNumber)
Set word number in sentence.

Parameters:
wordNumber - The word number in the sentence.

getPageNumber

public int getPageNumber()
Get page number on which word appears.

Returns:
The page number.

setPageNumber

public void setPageNumber(int pageNumber)
Set page number on which word appears.

Parameters:
pageNumber - The page number.

getEOS

public boolean getEOS()
Get end of sentence flag.

Returns:
The end of sentence flag.

setEOS

public void setEOS(boolean eos)
Set end of sentence flag.

Parameters:
eos - The end of sentence flag.

getMainSide

public ExtendedAdornedWord.MainSide getMainSide()
Get main or side.

Returns:
The main or side value.

getFrontMiddleBack

public ExtendedAdornedWord.FrontMiddleBack getFrontMiddleBack()
Get front/middle/back.

Returns:
The front, middle, or back value.

getSpoken

public boolean getSpoken()
Get spoken flag.

Returns:
True if word is spoken.

setSpoken

public void setSpoken(boolean isSpoken)
Set spoken flag.

Parameters:
isSpoken - True if word is spoken.

getVerse

public boolean getVerse()
Get verse flag.

Returns:
True if word is in verse.

setVerse

public void setVerse(boolean isVerse)
Set verse flag.

Parameters:
isVerse - True if word is in verse.

getInJumpTag

public boolean getInJumpTag()
Get in jump tag flag.

Returns:
True if word is in jump tag.

setInJumpTag

public void setInJumpTag(boolean inJumpTag)
Set in jump tag flag.

Parameters:
inJumpTag - True if word is in jump tag.

getGap

public boolean getGap()
Get gap flag.

Returns:
True if word represents gap.

setGap

public void setGap(boolean isGap)
Set gap flag.

Parameters:
isGap - True if word represents gap.

getWordText

public java.lang.String getWordText()
Get word text.

Returns:
Word text.

getNextWord

public ExtendedAdornedWord getNextWord()
Get next word.

Returns:
The next word.

setNextWord

public void setNextWord(ExtendedAdornedWord nextWord)
Set next word.

Parameters:
nextWord - The next word.

getPreviousWord

public ExtendedAdornedWord getPreviousWord()
Get previous word.

Returns:
The previous word.

setPreviousWord

public void setPreviousWord(ExtendedAdornedWord previousWord)
Set previous word.

Parameters:
previousWord - The previous word.
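
The next and previous word links, combined with the end of sentence flag, allow a reader to walk the text in reading order. The following is a minimal sketch using a hypothetical stand-in Word class rather than ExtendedAdornedWord itself; it collects the words from a starting word through the end of its sentence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of walking next-word links up to an end-of-sentence flag.
final class SentenceWalkSketch {
    /** Stand-in for a word with text, an EOS flag and a next-word link. */
    static final class Word {
        final String text;
        final boolean eos;
        Word next;                       // null at the end of the text

        Word(String text, boolean eos) {
            this.text = text;
            this.eos = eos;
        }
    }

    /** Create a word and link it after the given previous word (may be null). */
    static Word link(Word previous, String text, boolean eos) {
        Word word = new Word(text, eos);
        if (previous != null) {
            previous.next = word;
        }
        return word;
    }

    /** Collect word texts from start up to and including the EOS word. */
    static List<String> restOfSentence(Word start) {
        List<String> texts = new ArrayList<>();
        for (Word w = start; w != null; w = w.next) {
            texts.add(w.text);
            if (w.eos) {
                break;
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        Word the = link(null, "The", false);
        Word cat = link(the, "cat", false);
        Word sat = link(cat, "sat", false);
        Word stop = link(sat, ".", true);
        link(stop, "Next", false);
        System.out.println(restOfSentence(cat));   // prints [cat, sat, .]
    }
}
```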

getNextWordPart

public ExtendedAdornedWord getNextWordPart()
Get next word part.

Returns:
The next word part. Null if this is the last part or a non-split word.

setNextWordPart

public void setNextWordPart(ExtendedAdornedWord nextWordPart)
Set next word part.

Parameters:
nextWordPart - The next word part.

getPreviousWordPart

public ExtendedAdornedWord getPreviousWordPart()
Get previous word part.

Returns:
The previous word part for a split word. Null if this is the first part or a non-split word.

setPreviousWordPart

public void setPreviousWordPart(ExtendedAdornedWord previousWordPart)
Set previous word part.

Parameters:
previousWordPart - The previous word part.
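
The word part links make it possible to reassemble a split word from any of its parts. The sketch below uses a hypothetical WordPartNode class in place of ExtendedAdornedWord. Having the constructor set the previous part's next link is an assumption made to keep the example short; the real class exposes setNextWordPart for that purpose.

```java
// Hypothetical stand-in carrying only text, a part flag and part links.
final class WordPartNode {
    final String text;
    final String part;           // "N", "I", "M" or "F"
    WordPartNode previousPart;   // null for a first part or an unsplit word
    WordPartNode nextPart;       // null for a last part or an unsplit word

    WordPartNode(String text, String part, WordPartNode previousPart) {
        this.text = text;
        this.part = part;
        this.previousPart = previousPart;
        if (previousPart != null) {
            previousPart.nextPart = this;   // assumed convenience linking
        }
    }

    /** Rewind to the first part, then concatenate all parts in reading order. */
    static String fullText(WordPartNode word) {
        WordPartNode first = word;
        while (first.previousPart != null) {
            first = first.previousPart;
        }
        StringBuilder sb = new StringBuilder();
        for (WordPartNode p = first; p != null; p = p.nextPart) {
            sb.append(p.text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        WordPartNode a = new WordPartNode("un", "I", null);
        WordPartNode b = new WordPartNode("der", "M", a);
        new WordPartNode("stand", "F", b);
        System.out.println(fullText(b));    // prints understand
    }
}
```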

getWordIndex

public int getWordIndex()
Get word index.

Returns:
The word index.

setWordIndex

public void setWordIndex(int wordIndex)
Set word index.

Parameters:
wordIndex - The word index.

appendWordText

public void appendWordText(char[] ch,
                           int start,
                           int length)
Append characters to word text.

Parameters:
ch - Array of characters.
start - The starting position in the array.
length - The number of characters.
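
The (ch, start, length) parameters have the same shape as the SAX characters callback, which may deliver the text of one element in several chunks. A minimal sketch of accumulating such chunks follows; WordTextBuffer is a hypothetical stand-in, and the use of a StringBuilder is an assumption about one reasonable way to do it.

```java
// Hypothetical sketch: accumulate word text from (ch, start, length) chunks.
final class WordTextBuffer {
    private final StringBuilder text = new StringBuilder();

    /** Append length characters of ch, beginning at index start. */
    void appendWordText(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    /** Return everything appended so far. */
    String getWordText() {
        return text.toString();
    }

    public static void main(String[] args) {
        char[] buffer = "<w>loving</w>".toCharArray();
        WordTextBuffer word = new WordTextBuffer();
        word.appendWordText(buffer, 3, 3);   // appends "lov"
        word.appendWordText(buffer, 6, 3);   // appends "ing"
        System.out.println(word.getWordText());   // prints loving
    }
}
```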

toString

public java.lang.String toString()
Return spelling for toString.

Overrides:
toString in class BaseAdornedWord
Returns:
The spelling.

writeExternal

public void writeExternal(java.io.ObjectOutput out)
                   throws java.io.IOException
Writes the word to an object output stream (serializes the object).

Specified by:
writeExternal in interface java.io.Externalizable
Parameters:
out - Object output stream.
Throws:
java.io.IOException

readExternal

public void readExternal(java.io.ObjectInput in)
                  throws java.io.IOException,
                         java.lang.ClassNotFoundException
Reads the word from an object input stream (deserializes the object).

Specified by:
readExternal in interface java.io.Externalizable
Parameters:
in - Object input stream.
Throws:
java.io.IOException
java.lang.ClassNotFoundException
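
The writeExternal and readExternal pair follows the standard java.io.Externalizable contract: the class needs a public no-argument constructor, and fields must be read back in the order they were written. The sketch below shows a round trip with a hypothetical WordRecord class holding three of the fields named on this page. The field selection, the write order and the sample ID are assumptions and do not describe the actual serialized form of ExtendedAdornedWord.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical sketch of an Externalizable word record.
class WordRecord implements Externalizable {
    private static final long serialVersionUID = 1L;

    String id = "";
    int ord;
    boolean eos;

    /** Public no-argument constructor required by Externalizable. */
    public WordRecord() {}

    WordRecord(String id, int ord, boolean eos) {
        this.id = id;
        this.ord = ord;
        this.eos = eos;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeUTF(id);
        out.writeInt(ord);
        out.writeBoolean(eos);
    }

    @Override
    public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        // Read in exactly the order used by writeExternal.
        id = in.readUTF();
        ord = in.readInt();
        eos = in.readBoolean();
    }

    /** Serialize to memory and deserialize again. */
    static WordRecord roundTrip(WordRecord word) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(word);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (WordRecord) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        WordRecord copy = roundTrip(new WordRecord("w-0001", 1, true));
        System.out.println(copy.id + " " + copy.ord + " " + copy.eos);
    }
}
```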