public class AdornAString
extends java.lang.Object
Usage:
java -Xmx256m edu.northwestern.at.morphadorner.example.AdornAString "Text to adorn."
where "Text to adorn." specifies one or more sentences of text to adorn with part of speech tags, lemmata, and standard spellings. The default tokenizer, sentence splitter, lexicons, part of speech tagger, lemmatizer, and spelling standardizer are used.
Example:
java -Xmx256m edu.northwestern.at.morphadorner.example.AdornAString "Mary had a little lamb. Its fleece was white as snow."
Modifier and Type | Field and Description |
---|---|
static java.lang.String | lemmaSeparator: Lemma separator character. |
Constructor and Description |
---|
AdornAString() |
Modifier and Type | Method and Description |
---|---|
static void | adornText(java.lang.String[] args): Adorn text specified as a program parameter. |
static void | main(java.lang.String[] args): Main program. |
static void | setLemma(AdornedWord adornedWord, Lexicon lexicon, Lemmatizer lemmatizer, PartOfSpeechTags partOfSpeechTags, WordTokenizer spellingTokenizer): Get lemma for a word. |
static void | setStandardSpelling(AdornedWord adornedWord, SpellingStandardizer standardizer, PartOfSpeechTags partOfSpeechTags): Get standard spelling for a word. |
public static java.lang.String lemmaSeparator
public static void main(java.lang.String[] args)
Main program.
Parameters:
args - Program parameters.
public static void adornText(java.lang.String[] args) throws java.lang.Exception
Adorn text specified as a program parameter.
Parameters:
args - The program parameters. args[ 0 ] contains the text to adorn. The text may contain one or more sentences with punctuation.
Throws:
java.lang.Exception
public static void setStandardSpelling(AdornedWord adornedWord, SpellingStandardizer standardizer, PartOfSpeechTags partOfSpeechTags)
Get standard spelling for a word.
Parameters:
adornedWord - The adorned word.
standardizer - The spelling standardizer.
partOfSpeechTags - The part of speech tags.
On output, sets the standard spelling field of the adorned word.
public static void setLemma(AdornedWord adornedWord, Lexicon lexicon, Lemmatizer lemmatizer, PartOfSpeechTags partOfSpeechTags, WordTokenizer spellingTokenizer)
Get lemma for a word.
Parameters:
adornedWord - The adorned word.
lexicon - The word lexicon.
lemmatizer - The lemmatizer.
partOfSpeechTags - The part of speech tags.
spellingTokenizer - Tokenizer for spelling.
On output, sets the lemma field of the adorned word. We look in the word lexicon first for the lemma. If the lexicon does not contain the lemma, we use the lemmatizer.
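The lexicon-first lookup described for setLemma can be illustrated with a minimal, self-contained sketch. It does not use the MorphAdorner classes: the class name LemmaLookupSketch, the method findLemma, the Map standing in for the word lexicon, and the toy suffix-stripping function standing in for the lemmatizer are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

public class LemmaLookupSketch {
    /**
     * Look in the lexicon first; if it has no entry for the spelling,
     * fall back to the lemmatizer. Both collaborators are hypothetical
     * stand-ins, not MorphAdorner's Lexicon or Lemmatizer types.
     */
    public static String findLemma(
            String spelling,
            Map<String, String> lexicon,
            UnaryOperator<String> lemmatizer) {
        String key = spelling.toLowerCase(Locale.ROOT);
        String lemma = lexicon.get(key);
        if (lemma != null) {
            return lemma;               // lexicon hit: use its lemma
        }
        return lemmatizer.apply(key);   // lexicon miss: ask the lemmatizer
    }

    public static void main(String[] args) {
        // Toy lexicon holding irregular forms (assumed data).
        Map<String, String> lexicon = new HashMap<>();
        lexicon.put("had", "have");
        lexicon.put("was", "be");

        // Toy lemmatizer: strip a single trailing "s" (assumed rule).
        UnaryOperator<String> lemmatizer = w ->
            (w.length() > 1 && w.endsWith("s"))
                ? w.substring(0, w.length() - 1)
                : w;

        System.out.println(findLemma("had", lexicon, lemmatizer));   // have
        System.out.println(findLemma("lambs", lexicon, lemmatizer)); // lamb
    }
}
```

Consulting the lexicon first lets known irregular forms such as "had" resolve directly, while the rule-based fallback covers words the lexicon has never seen.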