public class SimpleTagger extends AbstractPartOfSpeechTagger implements PartOfSpeechTagger, CanTagOneWord
The simple part of speech tagger assigns a "noun" type part of speech to all words, except those that appear to be numbers. Numbers are assigned a "number" part of speech. Words starting with a capital letter can be assigned a separate "proper name" part of speech.
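The rule described above can be sketched in a few lines. This is a standalone illustration, not the library's implementation: the class name `SimpleTaggerSketch`, the tag strings `"noun"`, `"name"`, and `"number"`, and the regular expression used to decide what "appears to be a number" are all assumptions made for the example.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-in that mirrors the described tagging rule. */
final class SimpleTaggerSketch {
    // Assumption: a number is an optionally signed run of digits,
    // possibly containing "," or "." separators (e.g. 1,024 or 3.14).
    private static final Pattern NUMBER =
        Pattern.compile("[+-]?\\d+([.,]\\d+)*");

    private final String nounPOS;
    private final String namePOS;
    private final String numberPOS;

    SimpleTaggerSketch(String nounPOS, String namePOS, String numberPOS) {
        this.nounPOS = nounPOS;
        this.namePOS = namePOS;
        this.numberPOS = numberPOS;
    }

    /** Returns the number tag, the proper name tag, or the noun tag. */
    String tagWord(String word) {
        if (NUMBER.matcher(word).matches()) {
            return numberPOS;
        }
        if (!word.isEmpty() && Character.isUpperCase(word.charAt(0))) {
            return namePOS;
        }
        return nounPOS;
    }

    public static void main(String[] args) {
        SimpleTaggerSketch tagger =
            new SimpleTaggerSketch("noun", "name", "number");
        for (String word : new String[] {"tagger", "London", "1,024"}) {
            System.out.println(word + " -> " + tagger.tagWord(word));
        }
    }
}
```

Checking for a number before checking capitalization is a design choice of the sketch; the order only matters for tokens that could satisfy both tests.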
This simple tagger is useful as a backup for a more sophisticated tagger when unknown words are encountered.
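The backup role can be pictured as a lookup that falls through to a simple rule when a word is unknown. The sketch below is only an illustration of that pattern: `FallbackTaggingSketch`, the in-memory lexicon map, and the tag strings are hypothetical and do not reflect how the library wires its taggers together.

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of using a simple rule as a fallback. */
final class FallbackTaggingSketch {
    /** Returns the lexicon tag for a known word, else the fallback's tag. */
    static String tag(String word,
                      Map<String, String> lexicon,
                      Function<String, String> fallback) {
        String known = lexicon.get(word);
        return known != null ? known : fallback.apply(word);
    }

    public static void main(String[] args) {
        // Assumption: the "sophisticated" tagger is reduced to a tiny lexicon.
        Map<String, String> lexicon = Map.of("the", "determiner", "runs", "verb");

        // Assumption: the fallback tags digit-only words as numbers,
        // and everything else as nouns.
        Function<String, String> simpleRule = w ->
            !w.isEmpty() && w.chars().allMatch(Character::isDigit)
                ? "number"
                : "noun";

        for (String word : new String[] {"the", "zebra", "7"}) {
            System.out.println(word + " -> " + tag(word, lexicon, simpleRule));
        }
    }
}
```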
Modifier and Type | Field and Description |
---|---|
protected static java.lang.String | namePOS - Proper name part of speech tag. |
protected static java.lang.String | nounPOS - Noun part of speech tag. |
protected static java.lang.String | numberPOS - Number part of speech tag. |
contextRules, contextualSmoother, dynamicLexicon, lexicalRules, lexicalSmoother, lexicon, logger, partOfSpeechGuesser, postTokenizer, retagger, ruleCorrections, transitionMatrix
Constructor and Description |
---|
SimpleTagger() - Create a simple tagger. |
SimpleTagger(java.lang.String nounPOS, java.lang.String namePOS, java.lang.String numberPOS) - Create a simple tagger. |
Modifier and Type | Method and Description |
---|---|
<T extends AdornedWord> java.util.List<T> | tagAdornedWordList(java.util.List<T> sentence) - Tag a sentence. |
java.lang.String | tagWord(AdornedWord word) - Tag a single adorned word. |
java.lang.String | tagWord(java.lang.String word) - Tag a single word. |
java.lang.String | toString() - Return tagger description. |
clearRuleCorrections, createPartOfSpeechGuesser, getContextualSmoother, getDynamicLexicon, getLexicalSmoother, getLexicon, getLexicon, getLogger, getMostCommonTag, getPartOfSpeechGuesser, getPostTokenizer, getRetagger, getRuleCorrections, getTagCount, getTagsForWord, getTransitionMatrix, incrementRuleCorrections, retagWords, setContextRules, setContextualSmoother, setLexicalRules, setLexicalSmoother, setLexicon, setLogger, setPartOfSpeechGuesser, setPostTokenizer, setRetagger, setTransitionMatrix, tagAdornedWordSentence, tagAdornedWordSentences, tagSentence, tagSentences, usesContextRules, usesLexicalRules, usesTransitionProbabilities
close
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
clearRuleCorrections, getContextualSmoother, getLexicalSmoother, getLexicon, getLexicon, getPartOfSpeechGuesser, getPostTokenizer, getRetagger, getRuleCorrections, getTagCount, getTagsForWord, getTransitionMatrix, incrementRuleCorrections, retagWords, setContextRules, setContextualSmoother, setLexicalRules, setLexicalSmoother, setLexicon, setPartOfSpeechGuesser, setPostTokenizer, setRetagger, setTransitionMatrix, tagAdornedWordSentence, tagAdornedWordSentences, tagSentence, tagSentences, usesContextRules, usesLexicalRules, usesTransitionProbabilities
close
protected static java.lang.String nounPOS
protected static java.lang.String namePOS
protected static java.lang.String numberPOS
public SimpleTagger()
public SimpleTagger(java.lang.String nounPOS, java.lang.String namePOS, java.lang.String numberPOS)
Parameters:
nounPOS - Part of speech for a noun.
namePOS - Part of speech for a proper name.
numberPOS - Part of speech tag for a number.
public <T extends AdornedWord> java.util.List<T> tagAdornedWordList(java.util.List<T> sentence)
tagAdornedWordList in interface PartOfSpeechTagger
tagAdornedWordList in class AbstractPartOfSpeechTagger
Parameters:
sentence - The sentence as a list of AdornedWord objects.
public java.lang.String tagWord(java.lang.String word)
tagWord in interface CanTagOneWord
Parameters:
word - The word.
public java.lang.String tagWord(AdornedWord word)
tagWord in interface CanTagOneWord
Parameters:
word - The adorned word.
public java.lang.String toString()
toString in class java.lang.Object