TrigramTagger (MorphAdorner)

java.lang.Object
- edu.northwestern.at.utils.IsCloseableObject
  - edu.northwestern.at.morphadorner.corpuslinguistics.postagger.AbstractPartOfSpeechTagger
    - edu.northwestern.at.morphadorner.corpuslinguistics.postagger.trigram.TrigramTagger

All Implemented Interfaces:

UsesLexicon, PartOfSpeechTagger, IsCloseable, UsesLogger

Direct Known Subclasses:

DefaultPartOfSpeechTagger, PennTreebankPartOfSpeechTagger, TrigramHybridTagger
```
public class TrigramTagger
extends AbstractPartOfSpeechTagger
implements PartOfSpeechTagger
```
Trigram Part of Speech tagger.
The trigram part of speech tagger assigns to the words in a sentence the most probable sequence of tags, as determined by a trigram hidden Markov model in which each tag is conditioned on the possible tags of the two preceding words. The Viterbi algorithm is used to reduce the amount of computation required to find the optimal tag assignments.
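
The decoding idea described above can be illustrated with a small, self-contained sketch. It is not MorphAdorner's implementation: the class `TrigramViterbiSketch`, its methods, the `<s>` padding tag, the probability floor used in place of real smoothing, and the toy probabilities are all assumptions made for illustration. Each Viterbi state is a pair of adjacent tags, so extending a path by one word only needs the best score for every (previous tag, current tag) pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy trigram HMM decoder; every name here is hypothetical, not MorphAdorner API. */
public class TrigramViterbiSketch {
    static final String START = "<s>";  // padding tag before the sentence (assumption)
    static final double FLOOR = 1e-6;   // crude stand-in for real smoothing (assumption)

    private final Map<String, Double> transition = new HashMap<>(); // P(tag | twoBack, oneBack)
    private final Map<String, Double> emission = new HashMap<>();   // P(word | tag)

    public void addTransition(String twoBack, String oneBack, String tag, double p) {
        transition.put(twoBack + " " + oneBack + " " + tag, p);
    }

    public void addEmission(String tag, String word, double p) {
        emission.put(tag + " " + word, p);
    }

    private double logT(String twoBack, String oneBack, String tag) {
        return Math.log(transition.getOrDefault(twoBack + " " + oneBack + " " + tag, FLOOR));
    }

    private double logE(String tag, String word) {
        return Math.log(emission.getOrDefault(tag + " " + word, FLOOR));
    }

    /** Returns the highest-scoring tag sequence for the words (tags must not contain spaces). */
    public List<String> tag(List<String> words, List<String> tagSet) {
        int n = words.size();
        if (n == 0 || tagSet.isEmpty()) return new ArrayList<>();
        List<String> startOnly = Arrays.asList(START);
        // score.get(i) maps "u v" (tags at i-1 and i) to the best log score of a path ending there.
        List<Map<String, Double>> score = new ArrayList<>();
        // back.get(i) maps "u v" to the tag at i-2 on that best path.
        List<Map<String, String>> back = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Map<String, Double> s = new HashMap<>();
            Map<String, String> b = new HashMap<>();
            List<String> prevTags = (i == 0) ? startOnly : tagSet;
            List<String> prevPrevTags = (i <= 1) ? startOnly : tagSet;
            for (String u : prevTags) {
                for (String v : tagSet) {
                    double bestScore = Double.NEGATIVE_INFINITY;
                    String bestW = null;
                    for (String w : prevPrevTags) {
                        double prior = (i == 0) ? 0.0 : score.get(i - 1).get(w + " " + u);
                        double cand = prior + logT(w, u, v) + logE(v, words.get(i));
                        if (cand > bestScore) { bestScore = cand; bestW = w; }
                    }
                    s.put(u + " " + v, bestScore);
                    b.put(u + " " + v, bestW);
                }
            }
            score.add(s);
            back.add(b);
        }
        // Pick the best final tag pair, then follow the back-pointers to the start.
        String bestPair = null;
        double best = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : score.get(n - 1).entrySet()) {
            if (e.getValue() > best) { best = e.getValue(); bestPair = e.getKey(); }
        }
        String[] pair = bestPair.split(" ");
        String u = pair[0];
        String v = pair[1];
        String[] result = new String[n];
        for (int i = n - 1; i >= 0; i--) {
            result[i] = v;
            String w = back.get(i).get(u + " " + v);
            v = u;
            u = w;
        }
        return Arrays.asList(result);
    }

    /** Builds a tiny hand-made model; the numbers are invented for the example. */
    public static TrigramViterbiSketch toyModel() {
        TrigramViterbiSketch m = new TrigramViterbiSketch();
        m.addTransition(START, START, "DT", 0.8);
        m.addTransition(START, "DT", "NN", 0.9);
        m.addTransition("DT", "NN", "VB", 0.7);
        m.addTransition("DT", "NN", "NN", 0.2);
        m.addEmission("DT", "the", 0.9);
        m.addEmission("NN", "dog", 0.8);
        m.addEmission("NN", "barks", 0.3);
        m.addEmission("VB", "barks", 0.6);
        return m;
    }

    public static void main(String[] args) {
        List<String> tags = toyModel().tag(
            Arrays.asList("the", "dog", "barks"), Arrays.asList("DT", "NN", "VB"));
        System.out.println(tags); // prints [DT, NN, VB]
    }
}
```

With T tags and n words, this sketch evaluates on the order of n·T³ candidate extensions instead of enumerating all Tⁿ tag sequences, which is the saving the Viterbi algorithm provides.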

Field Summary

Fields
Modifier and Type	Field and Description
`protected int`	`beamSearchRejections` Total number of states rejected by beam search criterion.
`protected Map3D<java.lang.String,java.lang.String,java.lang.String,Probability>`	`contextualProbabilities` Contextual probabilities for a word in a sentence.
`protected boolean`	`debug` True for debug output.
`protected int`	`linesTagged` Count of lines tagged.
`protected Viterbi`	`viterbi` Viterbi trellis for tags and probability scores.
`protected int`	`wordsTagged` Count of words tagged.

Fields inherited from class edu.northwestern.at.morphadorner.corpuslinguistics.postagger.AbstractPartOfSpeechTagger
contextRules, contextualSmoother, dynamicLexicon, lexicalRules, lexicalSmoother, lexicon, logger, partOfSpeechGuesser, postTokenizer, retagger, ruleCorrections, transitionMatrix

Constructor Summary

Constructors
Constructor and Description

`TrigramTagger()` Create a trigram tagger.

Method Summary

Methods
Modifier and Type	Method and Description
`protected java.util.List<java.lang.String>`	`processWord(int wordIndex, java.lang.String word, java.util.List<java.lang.String> previousPreviousTags, java.util.List<java.lang.String> previousTags, java.util.List<java.lang.String> tags)` Process a single word.
`protected void`	`reportEndOfTaggingStats()` Report end of tagging statistics.
`void`	`setLogger(Logger logger)` Set the logger.
`<T extends AdornedWord> java.util.List<T>`	`tagAdornedWordList(java.util.List<T> taggedSentence)` Tag a sentence comprised of a list of adorned words.
`<T extends AdornedWord> java.util.List<java.util.List<T>>`	`tagAdornedWordSentences(java.util.List<java.util.List<T>> sentences, java.util.Set<java.lang.String> regIDSet)` Tag a list of sentences containing adorned words.
`java.util.List<java.util.List<AdornedWord>>`	`tagSentences(java.util.List<java.util.List<java.lang.String>> sentences)` Tag a list of sentences.
`java.lang.String`	`toString()` Return tagger description.
`boolean`	`usesTransitionProbabilities()` See if tagger uses a probability transition matrix.

Methods inherited from class edu.northwestern.at.morphadorner.corpuslinguistics.postagger.AbstractPartOfSpeechTagger
clearRuleCorrections, createPartOfSpeechGuesser, getContextualSmoother, getDynamicLexicon, getLexicalSmoother, getLexicon, getLexicon, getLogger, getMostCommonTag, getPartOfSpeechGuesser, getPostTokenizer, getRetagger, getRuleCorrections, getTagCount, getTagsForWord, getTransitionMatrix, incrementRuleCorrections, retagWords, setContextRules, setContextualSmoother, setLexicalRules, setLexicalSmoother, setLexicon, setPartOfSpeechGuesser, setPostTokenizer, setRetagger, setTransitionMatrix, tagAdornedWordSentence, tagSentence, usesContextRules, usesLexicalRules

Methods inherited from class edu.northwestern.at.utils.IsCloseableObject
close

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface edu.northwestern.at.morphadorner.corpuslinguistics.postagger.PartOfSpeechTagger
clearRuleCorrections, getContextualSmoother, getLexicalSmoother, getLexicon, getLexicon, getPartOfSpeechGuesser, getPostTokenizer, getRetagger, getRuleCorrections, getTagCount, getTagsForWord, getTransitionMatrix, incrementRuleCorrections, retagWords, setContextRules, setContextualSmoother, setLexicalRules, setLexicalSmoother, setLexicon, setPartOfSpeechGuesser, setPostTokenizer, setRetagger, setTransitionMatrix, tagAdornedWordSentence, tagSentence, usesContextRules, usesLexicalRules

Methods inherited from interface edu.northwestern.at.utils.IsCloseable
close

- Field Detail
  - debug
```
protected boolean debug
```
    True for debug output.
  - contextualProbabilities
```
protected Map3D<java.lang.String,java.lang.String,java.lang.String,Probability> contextualProbabilities
```
    Contextual probabilities for a word in a sentence.
  - beamSearchRejections
```
protected int beamSearchRejections
```
    Total number of states rejected by beam search criterion.
  - viterbi
```
protected Viterbi viterbi
```
    Viterbi trellis for tags and probability scores.
  - linesTagged
```
protected int linesTagged
```
    Count of lines tagged.
  - wordsTagged
```
protected int wordsTagged
```
    Count of words tagged.
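
The `beamSearchRejections` field counts states discarded by a beam search criterion, but this page does not say what that criterion is. As a rough illustration only, the sketch below assumes one common choice: drop any state whose log score falls more than a fixed width below the best score at the current word. The class `BeamPruneSketch`, its `beamWidth` parameter, and its `rejections` counter are hypothetical and are not part of MorphAdorner.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical beam pruning helper; not MorphAdorner's actual criterion. */
public class BeamPruneSketch {
    /** Running count of pruned states, similar in spirit to beamSearchRejections. */
    public int rejections = 0;

    /**
     * Keeps only the states whose log score is within beamWidth of the best one.
     * beamWidth is an assumed tuning parameter, expressed in log-probability units.
     */
    public Map<String, Double> prune(Map<String, Double> logScores, double beamWidth) {
        double best = Double.NEGATIVE_INFINITY;
        for (double s : logScores.values()) {
            best = Math.max(best, s);
        }
        Map<String, Double> kept = new HashMap<>();
        for (Map.Entry<String, Double> e : logScores.entrySet()) {
            if (e.getValue() >= best - beamWidth) {
                kept.put(e.getKey(), e.getValue());
            } else {
                rejections++; // state falls outside the beam
            }
        }
        return kept;
    }
}
```

A narrower beam rejects more states and runs faster, at the risk of discarding the state that lies on the truly optimal path.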
- Constructor Detail
  - TrigramTagger
```
public TrigramTagger()
```
    Create a trigram tagger.
- Method Detail
  - usesTransitionProbabilities
```
public boolean usesTransitionProbabilities()
```
    See if tagger uses a probability transition matrix.
    
    Specified by:
    
    usesTransitionProbabilities in interface PartOfSpeechTagger
    
    Overrides:
    
    usesTransitionProbabilities in class AbstractPartOfSpeechTagger
    
    Returns:
    True, since the trigram tagger uses a probability transition matrix.
  - reportEndOfTaggingStats
```
protected void reportEndOfTaggingStats()
```
    Report end of tagging statistics.
  - tagSentences
```
public java.util.List<java.util.List<AdornedWord>> tagSentences(java.util.List<java.util.List<java.lang.String>> sentences)
```
    Tag a list of sentences.
    
    Specified by:
    
    tagSentences in interface PartOfSpeechTagger
    
    Overrides:
    
    tagSentences in class AbstractPartOfSpeechTagger
    
    Parameters:
    sentences - The list of sentences. Each sentence is a list of the words to be tagged, so the input is a List of Lists of words.
    
    Returns:
    The sentences with words adorned with parts of speech. Each sentence in the output is a list of AdornedWords.
  - tagAdornedWordSentences
```
public <T extends AdornedWord> java.util.List<java.util.List<T>> tagAdornedWordSentences(java.util.List<java.util.List<T>> sentences,
                                                                                java.util.Set<java.lang.String> regIDSet)
```
    Tag a list of sentences containing adorned words.
    
    Specified by:
    
    tagAdornedWordSentences in interface PartOfSpeechTagger
    
    Overrides:
    
    tagAdornedWordSentences in class AbstractPartOfSpeechTagger
    
    Parameters:
    sentences - The list of sentences. Each sentence is a list of the adorned words to be tagged, so the input is a List of Lists of adorned words.
    regIDSet - Word IDs of words requiring special handling.
    
    Returns:
    The sentences with words adorned with parts of speech. Each sentence in the output is a list of AdornedWords.
  - tagAdornedWordList
```
public <T extends AdornedWord> java.util.List<T> tagAdornedWordList(java.util.List<T> taggedSentence)
```
    Tag a sentence comprised of a list of adorned words.
    
    Specified by:
    
    tagAdornedWordList in interface PartOfSpeechTagger
    
    Specified by:
    
    tagAdornedWordList in class AbstractPartOfSpeechTagger
    
    Parameters:
    taggedSentence - The sentence as a list of AdornedWords.
    
    Returns:
    The list of AdornedWords in the sentence, tagged with parts of speech.
    The output is the same list of words as the input, with parts of speech added.
  - processWord
```
protected java.util.List<java.lang.String> processWord(int wordIndex,
                                           java.lang.String word,
                                           java.util.List<java.lang.String> previousPreviousTags,
                                           java.util.List<java.lang.String> previousTags,
                                           java.util.List<java.lang.String> tags)
```
    Process a single word.
    
    Parameters:
    wordIndex - Index of word in sentence (starts at 0).
    word - Word being processed.
    previousPreviousTags - The tags of the word two positions before the current word.
    previousTags - The previous word's tags.
    tags - The current word's tags.
    
    Returns:
    Updated tag list.
  - setLogger
```
public void setLogger(Logger logger)
```
    Set the logger.
    
    Specified by:
    
    setLogger in interface UsesLogger
    
    Overrides:
    
    setLogger in class AbstractPartOfSpeechTagger
    
    Parameters:
    logger - The logger.
  - toString
```
public java.lang.String toString()
```
    Return tagger description.
    
    Overrides:
    
    toString in class java.lang.Object
    
    Returns:
    Tagger description.
