BigramTagger (MorphAdorner)

java.lang.Object
- edu.northwestern.at.utils.IsCloseableObject
  - edu.northwestern.at.morphadorner.corpuslinguistics.postagger.AbstractPartOfSpeechTagger
    - edu.northwestern.at.morphadorner.corpuslinguistics.postagger.bigram.BigramTagger

All Implemented Interfaces:

UsesLexicon, PartOfSpeechTagger, IsCloseable, UsesLogger

Direct Known Subclasses:

BigramHybridTagger
```
public class BigramTagger
extends AbstractPartOfSpeechTagger
implements PartOfSpeechTagger
```
Bigram Part of Speech tagger.
The bigram part of speech tagger assigns tags to the words in a sentence by choosing the most probable tag sequence under a bigram hidden Markov model, in which the probability of each tag is conditioned on the possible tags of the previous word. The Viterbi algorithm is used to reduce the amount of computation required to find the optimal tag assignment.
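
The following is a minimal, self-contained sketch of Viterbi decoding over a bigram hidden Markov model, included only to illustrate the idea described above. It does not use MorphAdorner's `Viterbi`, lexicon, smoother, or beam search code. The class `BigramViterbiSketch`, its methods, the `<s>` start marker, the `UNSEEN` floor, and the toy probability tables are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of bigram HMM decoding; not part of MorphAdorner. */
class BigramViterbiSketch {

    /** Log probability assumed for events missing from the toy tables. */
    static final double UNSEEN = Math.log(1e-6);

    /** Returns log(table[row][col]), or UNSEEN when the entry is absent. */
    static double logProb(Map<String, Map<String, Double>> table, String row, String col) {
        Map<String, Double> entries = table.get(row);
        Double p = (entries == null) ? null : entries.get(col);
        return (p == null) ? UNSEEN : Math.log(p);
    }

    /**
     * Finds the most probable tag sequence for the given words.
     *
     * @param transition previous tag -> tag -> P(tag | previous tag); "<s>" marks sentence start
     * @param emission   tag -> word -> P(word | tag)
     */
    static List<String> decode(List<String> words, List<String> tagSet,
                               Map<String, Map<String, Double>> transition,
                               Map<String, Map<String, Double>> emission) {
        int n = words.size();
        int k = tagSet.size();
        if (n == 0) {
            return new ArrayList<>();
        }
        double[][] score = new double[n][k]; // best log probability of a path ending in tag t at word i
        int[][] back = new int[n][k];        // index of the previous tag on that best path

        for (int t = 0; t < k; t++) {
            score[0][t] = logProb(transition, "<s>", tagSet.get(t))
                        + logProb(emission, tagSet.get(t), words.get(0));
        }
        for (int i = 1; i < n; i++) {
            for (int t = 0; t < k; t++) {
                double best = Double.NEGATIVE_INFINITY;
                int bestPrev = 0;
                // Only the best path into each previous tag is extended, so the
                // work is O(n * k * k) instead of enumerating all k^n sequences.
                for (int p = 0; p < k; p++) {
                    double s = score[i - 1][p] + logProb(transition, tagSet.get(p), tagSet.get(t));
                    if (s > best) {
                        best = s;
                        bestPrev = p;
                    }
                }
                score[i][t] = best + logProb(emission, tagSet.get(t), words.get(i));
                back[i][t] = bestPrev;
            }
        }
        // Pick the best final tag, then follow the back pointers to the start.
        int last = 0;
        for (int t = 1; t < k; t++) {
            if (score[n - 1][t] > score[n - 1][last]) {
                last = t;
            }
        }
        String[] tags = new String[n];
        for (int i = n - 1; i >= 0; i--) {
            tags[i] = tagSet.get(last);
            last = back[i][last];
        }
        return new ArrayList<>(Arrays.asList(tags));
    }

    /** Runs the decoder on made-up probabilities for a three-word sentence. */
    static List<String> demo() {
        Map<String, Map<String, Double>> transition = Map.of(
            "<s>", Map.of("DET", 0.8, "NOUN", 0.2),
            "DET", Map.of("NOUN", 0.9, "VERB", 0.1),
            "NOUN", Map.of("VERB", 0.7, "NOUN", 0.3),
            "VERB", Map.of("NOUN", 0.5, "DET", 0.5));
        Map<String, Map<String, Double>> emission = Map.of(
            "DET", Map.of("the", 1.0),
            "NOUN", Map.of("dog", 0.6, "barks", 0.4),
            "VERB", Map.of("barks", 0.9, "dog", 0.1));
        return decode(List.of("the", "dog", "barks"),
                      List.of("DET", "NOUN", "VERB"), transition, emission);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [DET, NOUN, VERB]
    }
}
```

In this toy model "barks" may be either a noun or a verb; the transition from `NOUN` favors `VERB`, so the decoder resolves the ambiguity from context. Working in log space avoids numeric underflow on long sentences.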

Field Summary

Fields
Modifier and Type	Field and Description
`protected int`	`beamSearchRejections` Total number of states rejected by beam search criterion.
`protected Map2D<java.lang.String,java.lang.String,Probability>`	`contextualProbabilities` Contextual probabilities for a word in a sentence.
`protected boolean`	`debug` True for debug output.
`protected Viterbi`	`viterbi` Viterbi trellis for tags and probability scores.

Fields inherited from class edu.northwestern.at.morphadorner.corpuslinguistics.postagger.AbstractPartOfSpeechTagger
contextRules, contextualSmoother, dynamicLexicon, lexicalRules, lexicalSmoother, lexicon, logger, partOfSpeechGuesser, postTokenizer, retagger, ruleCorrections, transitionMatrix

Constructor Summary

Constructors
Constructor and Description
`BigramTagger()` Create a bigram tagger.

Method Summary

Methods
Modifier and Type	Method and Description
`protected java.util.List<java.lang.String>`	`processWord(int wordIndex, java.lang.String word, java.util.List<java.lang.String> previousTags, java.util.List<java.lang.String> tags)` Process a single word.
`void`	`setLogger(Logger logger)` Set the logger.
`<T extends AdornedWord> java.util.List<T>`	`tagAdornedWordList(java.util.List<T> taggedSentence)` Tag a sentence.
`java.util.List<java.util.List<AdornedWord>>`	`tagSentences(java.util.List<java.util.List<java.lang.String>> sentences)` Tag a list of sentences.
`java.lang.String`	`toString()` Return tagger description.
`boolean`	`usesTransitionProbabilities()` See if tagger uses a probability transition matrix.

Methods inherited from class edu.northwestern.at.morphadorner.corpuslinguistics.postagger.AbstractPartOfSpeechTagger
clearRuleCorrections, createPartOfSpeechGuesser, getContextualSmoother, getDynamicLexicon, getLexicalSmoother, getLexicon, getLexicon, getLogger, getMostCommonTag, getPartOfSpeechGuesser, getPostTokenizer, getRetagger, getRuleCorrections, getTagCount, getTagsForWord, getTransitionMatrix, incrementRuleCorrections, retagWords, setContextRules, setContextualSmoother, setLexicalRules, setLexicalSmoother, setLexicon, setPartOfSpeechGuesser, setPostTokenizer, setRetagger, setTransitionMatrix, tagAdornedWordSentence, tagAdornedWordSentences, tagSentence, usesContextRules, usesLexicalRules

Methods inherited from class edu.northwestern.at.utils.IsCloseableObject
close

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface edu.northwestern.at.morphadorner.corpuslinguistics.postagger.PartOfSpeechTagger
clearRuleCorrections, getContextualSmoother, getLexicalSmoother, getLexicon, getLexicon, getPartOfSpeechGuesser, getPostTokenizer, getRetagger, getRuleCorrections, getTagCount, getTagsForWord, getTransitionMatrix, incrementRuleCorrections, retagWords, setContextRules, setContextualSmoother, setLexicalRules, setLexicalSmoother, setLexicon, setPartOfSpeechGuesser, setPostTokenizer, setRetagger, setTransitionMatrix, tagAdornedWordSentence, tagAdornedWordSentences, tagSentence, usesContextRules, usesLexicalRules

Methods inherited from interface edu.northwestern.at.utils.IsCloseable
close

- Field Detail
  - debug
```
protected boolean debug
```
    True for debug output.
  - contextualProbabilities
```
protected Map2D<java.lang.String,java.lang.String,Probability> contextualProbabilities
```
    Contextual probabilities for a word in a sentence.
  - beamSearchRejections
```
protected int beamSearchRejections
```
    Total number of states rejected by beam search criterion.
  - viterbi
```
protected Viterbi viterbi
```
    Viterbi trellis for tags and probability scores.
- Constructor Detail
  - BigramTagger
```
public BigramTagger()
```
    Create a bigram tagger.
- Method Detail
  - usesTransitionProbabilities
```
public boolean usesTransitionProbabilities()
```
    See if tagger uses a probability transition matrix.
    
    Specified by:
    
    usesTransitionProbabilities in interface PartOfSpeechTagger
    
    Overrides:
    
    usesTransitionProbabilities in class AbstractPartOfSpeechTagger
    
    Returns:
    True, since the bigram tagger uses a probability transition matrix.
  - tagSentences
```
public java.util.List<java.util.List<AdornedWord>> tagSentences(java.util.List<java.util.List<java.lang.String>> sentences)
```
    Tag a list of sentences.
    
    Specified by:
    
    tagSentences in interface PartOfSpeechTagger
    
    Overrides:
    
    tagSentences in class AbstractPartOfSpeechTagger
    
    Parameters:
    sentences - The list of sentences.
    
    Returns:
    The sentences with words adorned with parts of speech.
    The input is a List of sentences, each represented as a List of the words to be tagged. The output is a List of sentences, each represented as a List of AdornedWords.
  - tagAdornedWordList
```
public <T extends AdornedWord> java.util.List<T> tagAdornedWordList(java.util.List<T> taggedSentence)
```
    Tag a sentence.
    
    Specified by:
    
    tagAdornedWordList in interface PartOfSpeechTagger
    
    Specified by:
    
    tagAdornedWordList in class AbstractPartOfSpeechTagger
    
    Parameters:
    taggedSentence - The sentence as a list of AdornedWords.
    
    Returns:
    A list of AdornedWords for the sentence, tagged with parts of speech.
    The input sentence is a List of AdornedWords to be tagged. The output is a List of AdornedWords with parts of speech added.
  - processWord
```
protected java.util.List<java.lang.String> processWord(int wordIndex,
                                           java.lang.String word,
                                           java.util.List<java.lang.String> previousTags,
                                           java.util.List<java.lang.String> tags)
```
    Process a single word.
    
    Parameters:
    wordIndex - Index of word in sentence (starts at 0).
    word - Word being processed.
    previousTags - The previous word's tags.
    tags - The current word's tags.
    
    Returns:
    Updated tag list.
  - setLogger
```
public void setLogger(Logger logger)
```
    Set the logger.
    
    Specified by:
    
    setLogger in interface UsesLogger
    
    Overrides:
    
    setLogger in class AbstractPartOfSpeechTagger
    
    Parameters:
    logger - The logger.
  - toString
```
public java.lang.String toString()
```
    Return tagger description.
    
    Overrides:
    
    toString in class java.lang.Object
    
    Returns:
    Tagger description.
