public interface PartOfSpeechTagger
| Modifier and Type | Method and Description |
|---|---|
| `void` | `clearRuleCorrections()` Clear count of successful rule applications. |
| `ContextualSmoother` | `getContextualSmoother()` Get the contextual smoother. |
| `LexicalSmoother` | `getLexicalSmoother()` Get the lexical smoother. |
| `Lexicon` | `getLexicon()` Get the lexicon. |
| `Lexicon` | `getLexicon(java.lang.String word)` Get the lexicon for a specific word. |
| `PartOfSpeechGuesser` | `getPartOfSpeechGuesser()` Get part of speech guesser. |
| `PostTokenizer` | `getPostTokenizer()` Get the post tokenizer. |
| `PartOfSpeechRetagger` | `getRetagger()` Get part of speech retagger. |
| `int` | `getRuleCorrections()` Get count of successful rule applications. |
| `int` | `getTagCount(java.lang.String word, java.lang.String tag)` Get count of times a word appears with a given tag. |
| `java.util.List<java.lang.String>` | `getTagsForWord(java.lang.String word)` Get potential part of speech tags for a word. |
| `TransitionMatrix` | `getTransitionMatrix()` Get tag transition probabilities matrix. |
| `void` | `incrementRuleCorrections()` Increment count of successful rule applications. |
| `<T extends AdornedWord> java.util.List<T>` | `retagWords(java.util.List<T> taggedSentence)` Retag words in a tagged sentence. |
| `void` | `setContextRules(java.lang.String[] contextRules)` Set context rules for tagging. |
| `void` | `setContextualSmoother(ContextualSmoother contextualSmoother)` Set the contextual smoother. |
| `void` | `setLexicalRules(java.lang.String[] lexicalRules)` Set lexical rules for tagging. |
| `void` | `setLexicalSmoother(LexicalSmoother lexicalSmoother)` Set the lexical smoother. |
| `void` | `setLexicon(Lexicon lexicon)` Set the lexicon. |
| `void` | `setPartOfSpeechGuesser(PartOfSpeechGuesser guesser)` Set part of speech guesser. |
| `void` | `setPostTokenizer(PostTokenizer postTokenizer)` Set the post tokenizer. |
| `void` | `setRetagger(PartOfSpeechRetagger retagger)` Set part of speech retagger. |
| `void` | `setTransitionMatrix(TransitionMatrix transitionMatrix)` Set tag transition probabilities matrix. |
| `<T extends AdornedWord> java.util.List<T>` | `tagAdornedWordList(java.util.List<T> sentence)` Tag a list of adorned words. |
| `<T extends AdornedWord> java.util.List<T>` | `tagAdornedWordSentence(java.util.List<T> sentence, java.util.Set<java.lang.String> regIDSet)` Tag a sentence. |
| `<T extends AdornedWord> java.util.List<java.util.List<T>>` | `tagAdornedWordSentences(java.util.List<java.util.List<T>> sentences, java.util.Set<java.lang.String> regIDSet)` Tag a list of sentences. |
| `java.util.List<AdornedWord>` | `tagSentence(java.util.List<java.lang.String> sentence)` Tag a sentence. |
| `java.util.List<java.util.List<AdornedWord>>` | `tagSentences(java.util.List<java.util.List<java.lang.String>> sentences)` Tag a list of sentences. |
| `boolean` | `usesContextRules()` See if tagger uses context rules. |
| `boolean` | `usesLexicalRules()` See if tagger uses lexical rules. |
| `boolean` | `usesTransitionProbabilities()` See if tagger uses a probability transition matrix. |
boolean usesContextRules()
boolean usesLexicalRules()
boolean usesTransitionProbabilities()
void setContextRules(java.lang.String[] contextRules) throws InvalidRuleException
contextRules
- String array of context rules.
InvalidRuleException
- if a rule is bad.
For taggers which do not use context rules, this is a no-op.
void setLexicalRules(java.lang.String[] lexicalRules) throws InvalidRuleException
lexicalRules
- String array of lexical rules.
InvalidRuleException
- if a rule is bad.
For taggers which do not use lexical rules, this is a no-op.
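Because both rule setters are no-ops for taggers that do not use the corresponding rule type, a caller can check `usesContextRules()` and `usesLexicalRules()` first and skip preparing rules that would be ignored. The following is a minimal sketch of that pattern, not code from the library: `RuleAwareTagger`, `LexicalOnlyTagger`, and the nested `InvalidRuleException` are hypothetical stand-ins declared only so the example compiles on its own, and the rule strings are placeholders rather than real rule syntax.

```java
import java.util.ArrayList;
import java.util.List;

public class RuleSetupSketch {
    /** Hypothetical stand-in for the library's InvalidRuleException. */
    static class InvalidRuleException extends Exception {
        InvalidRuleException(String message) {
            super(message);
        }
    }

    /** Hypothetical cut-down view of PartOfSpeechTagger: rule methods only. */
    interface RuleAwareTagger {
        boolean usesContextRules();
        boolean usesLexicalRules();
        void setContextRules(String[] contextRules) throws InvalidRuleException;
        void setLexicalRules(String[] lexicalRules) throws InvalidRuleException;
    }

    /** Toy tagger that uses lexical rules only and records what it stored. */
    static class LexicalOnlyTagger implements RuleAwareTagger {
        final List<String> applied = new ArrayList<>();

        public boolean usesContextRules() { return false; }
        public boolean usesLexicalRules() { return true; }

        public void setContextRules(String[] contextRules) {
            // No-op: this tagger does not use context rules.
        }

        public void setLexicalRules(String[] lexicalRules) throws InvalidRuleException {
            // Toy validation: reject null or blank rules.
            for (String rule : lexicalRules) {
                if (rule == null || rule.trim().isEmpty()) {
                    throw new InvalidRuleException("empty lexical rule");
                }
            }
            applied.add("lexical:" + lexicalRules.length);
        }
    }

    /** Installs only the rule sets the tagger reports using; returns how many were installed. */
    static int configure(RuleAwareTagger tagger, String[] contextRules, String[] lexicalRules)
            throws InvalidRuleException {
        int installed = 0;
        if (tagger.usesContextRules()) {
            tagger.setContextRules(contextRules);
            installed++;
        }
        if (tagger.usesLexicalRules()) {
            tagger.setLexicalRules(lexicalRules);
            installed++;
        }
        return installed;
    }

    public static void main(String[] args) throws InvalidRuleException {
        LexicalOnlyTagger tagger = new LexicalOnlyTagger();
        int installed = configure(
            tagger,
            new String[] {"placeholder-context-rule"},
            new String[] {"placeholder-lexical-rule"});
        System.out.println(installed + " rule set(s) installed: " + tagger.applied);
    }
}
```

Calling the setters unconditionally would also be safe given the no-op behavior described above; the check mainly avoids loading or parsing rule files that the tagger would discard.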
Lexicon getLexicon()
Lexicon getLexicon(java.lang.String word)
word
- The word whose associated lexicon we want.
void setLexicon(Lexicon lexicon)
lexicon
- Lexicon used for tagging.
TransitionMatrix getTransitionMatrix()
void setTransitionMatrix(TransitionMatrix transitionMatrix)
transitionMatrix
- Tag probabilities transition matrix.
For taggers which do not use transition matrices, this is a no-op.
PartOfSpeechGuesser getPartOfSpeechGuesser()
void setPartOfSpeechGuesser(PartOfSpeechGuesser guesser)
guesser
- The part of speech guesser.
PartOfSpeechRetagger getRetagger()
void setRetagger(PartOfSpeechRetagger retagger)
retagger
- The part of speech retagger.
PostTokenizer getPostTokenizer()
void setPostTokenizer(PostTokenizer postTokenizer)
postTokenizer
- The post tokenizer.
ContextualSmoother getContextualSmoother()
void setContextualSmoother(ContextualSmoother contextualSmoother)
contextualSmoother
- The contextual smoother.
LexicalSmoother getLexicalSmoother()
void setLexicalSmoother(LexicalSmoother lexicalSmoother)
lexicalSmoother
- The lexical smoother.
java.util.List<java.lang.String> getTagsForWord(java.lang.String word)
word
- The word whose part of speech tags we want.
int getTagCount(java.lang.String word, java.lang.String tag)
word
- The word.
tag
- The part of speech tag.
void clearRuleCorrections()
void incrementRuleCorrections()
int getRuleCorrections()
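The three rule-correction methods together behave like a simple counter: clear it before a run, increment it each time a rule changes a tag, and read it afterward. The sketch below illustrates that cycle under stated assumptions: `RuleCorrectionCounter` is a hypothetical holder written for this example, the tag strings are placeholders, and comparing "before" and "after" tag arrays is only one illustrative way to decide when a correction happened.

```java
public class RuleCorrectionCounterSketch {
    /** Hypothetical holder mirroring the three counter methods of the interface. */
    static class RuleCorrectionCounter {
        private int ruleCorrections = 0;

        public void clearRuleCorrections() {
            ruleCorrections = 0;
        }

        public void incrementRuleCorrections() {
            ruleCorrections++;
        }

        public int getRuleCorrections() {
            return ruleCorrections;
        }
    }

    /**
     * Resets the counter, then increments it once for every position
     * where the tag after rule application differs from the tag before.
     */
    static int countCorrections(RuleCorrectionCounter counter, String[] before, String[] after) {
        counter.clearRuleCorrections();
        for (int i = 0; i < before.length; i++) {
            if (!before[i].equals(after[i])) {
                counter.incrementRuleCorrections();
            }
        }
        return counter.getRuleCorrections();
    }

    public static void main(String[] args) {
        RuleCorrectionCounter counter = new RuleCorrectionCounter();
        String[] before = {"noun", "noun", "verb"};  // placeholder tags
        String[] after = {"noun", "verb", "verb"};   // one tag changed by a rule
        System.out.println("corrections: " + countCorrections(counter, before, after));
    }
}
```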
java.util.List<java.util.List<AdornedWord>> tagSentences(java.util.List<java.util.List<java.lang.String>> sentences)
sentences
- The list of sentences.
The sentences are a List of Lists of words to be tagged. Each sentence is represented as a list of words. The output is a list containing, for each sentence, a list of AdornedWord objects.
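To make the nested input and output shapes concrete, here is a minimal sketch of a batch method that delegates to a single-sentence method. It is an illustration only: `ToyAdornedWord` is a hypothetical stand-in for `AdornedWord`, the lookup table and its tags are placeholders, and a real tagger would use its lexicon, guesser, and rules rather than a fixed map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TagSentencesSketch {
    /** Hypothetical stand-in for AdornedWord: a token plus its tag. */
    static class ToyAdornedWord {
        final String token;
        final String tag;

        ToyAdornedWord(String token, String tag) {
            this.token = token;
            this.tag = tag;
        }

        @Override
        public String toString() {
            return token + "/" + tag;
        }
    }

    /** Toy lookup table; the tags are placeholders, not a real tag set. */
    static final Map<String, String> TOY_LEXICON = new HashMap<>();
    static {
        TOY_LEXICON.put("dogs", "noun");
        TOY_LEXICON.put("bark", "verb");
    }

    /** Tags one sentence given as a list of string tokens. */
    static List<ToyAdornedWord> tagSentence(List<String> sentence) {
        List<ToyAdornedWord> result = new ArrayList<>();
        for (String token : sentence) {
            String tag = TOY_LEXICON.getOrDefault(token.toLowerCase(), "unknown");
            result.add(new ToyAdornedWord(token, tag));
        }
        return result;
    }

    /** Tags each sentence in turn, producing one inner list per input sentence. */
    static List<List<ToyAdornedWord>> tagSentences(List<List<String>> sentences) {
        List<List<ToyAdornedWord>> result = new ArrayList<>();
        for (List<String> sentence : sentences) {
            result.add(tagSentence(sentence));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> sentences = Arrays.asList(
            Arrays.asList("Dogs", "bark"),
            Arrays.asList("Cats", "purr"));
        System.out.println(tagSentences(sentences));
    }
}
```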
java.util.List<AdornedWord> tagSentence(java.util.List<java.lang.String> sentence)
sentence
- The sentence as a List of string tokens.
Returns the tagged words as a List of AdornedWord.
<T extends AdornedWord> java.util.List<T> tagAdornedWordSentence(java.util.List<T> sentence, java.util.Set<java.lang.String> regIDSet)
sentence
- The sentence as a list of adorned words.
regIDSet
- Set of word IDs requiring special handling. May be null.
Returns the AdornedWord objects for the words in the sentence, tagged with parts of speech.
The input sentence is a List of adorned words to be tagged. The output is the same list with parts of speech added or modified.
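Two details of this contract are easy to miss: the method returns the same list it was given, with tags updated in place, and `regIDSet` may be null. A minimal sketch of a generic method honoring both points follows. Everything in it is an assumption made for illustration: `ToyWord` is a hypothetical stand-in for `AdornedWord`, the capitalization-based tagging is a placeholder, and treating "special handling" as "leave the existing tag untouched" is invented here because the source does not say what the special handling is.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class InPlaceTaggingSketch {
    /** Hypothetical stand-in for AdornedWord with only the fields this sketch needs. */
    static class ToyWord {
        final String id;
        final String token;
        String tag;

        ToyWord(String id, String token, String tag) {
            this.id = id;
            this.token = token;
            this.tag = tag;
        }

        @Override
        public String toString() {
            return id + ":" + token + "/" + tag;
        }
    }

    /**
     * Updates tags in place and returns the same list instance.
     * A null regIDSet is treated as an empty set.
     */
    static <T extends ToyWord> List<T> tagInPlace(List<T> sentence, Set<String> regIDSet) {
        Set<String> ids = (regIDSet == null) ? Collections.<String>emptySet() : regIDSet;
        for (T word : sentence) {
            if (ids.contains(word.id)) {
                // Assumed "special handling" for this sketch: keep the existing tag.
                continue;
            }
            // Placeholder tagging rule based on capitalization only.
            boolean capitalized = !word.token.isEmpty()
                && Character.isUpperCase(word.token.charAt(0));
            word.tag = capitalized ? "capitalized" : "lowercase";
        }
        return sentence;
    }

    public static void main(String[] args) {
        List<ToyWord> sentence = new ArrayList<>();
        sentence.add(new ToyWord("w1", "Dogs", "untagged"));
        sentence.add(new ToyWord("w2", "bark", "untagged"));

        Set<String> regIDSet = new HashSet<>();
        regIDSet.add("w2");

        List<ToyWord> tagged = tagInPlace(sentence, regIDSet);
        System.out.println("same list returned: " + (tagged == sentence));
        System.out.println(tagged);
    }
}
```

The bounded type parameter `<T extends ...>` lets callers pass a list of any subtype and get back a list of that same subtype, which matches the shape of the signature documented above.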
<T extends AdornedWord> java.util.List<java.util.List<T>> tagAdornedWordSentences(java.util.List<java.util.List<T>> sentences, java.util.Set<java.lang.String> regIDSet)
sentences
- The list of sentences.
regIDSet
- Set of word IDs requiring special handling. May be null.
The sentences are a List of Lists of adorned words to be tagged. Each sentence is represented as a list of adorned words. The output is a list containing, for each sentence, a list of AdornedWord objects.
<T extends AdornedWord> java.util.List<T> tagAdornedWordList(java.util.List<T> sentence)
sentence
- The sentence as a list of AdornedWord objects.
<T extends AdornedWord> java.util.List<T> retagWords(java.util.List<T> taggedSentence)
taggedSentence
- The tagged sentence as a list of AdornedWord objects.