edu.northwestern.at.utils.corpuslinguistics.postagger.simple
Class SimpleTagger

java.lang.Object
  extended by edu.northwestern.at.utils.IsCloseableObject
      extended by edu.northwestern.at.utils.corpuslinguistics.postagger.AbstractPartOfSpeechTagger
          extended by edu.northwestern.at.utils.corpuslinguistics.postagger.simple.SimpleTagger
All Implemented Interfaces:
UsesLexicon, CanTagOneWord, PartOfSpeechTagger, IsCloseable, UsesLogger

public class SimpleTagger
extends AbstractPartOfSpeechTagger
implements PartOfSpeechTagger, CanTagOneWord

Simple Part of Speech tagger.

The simple part of speech tagger assigns a "noun" part of speech to every word except those that appear to be numbers, which are assigned a "number" part of speech. Words starting with a capital letter can be assigned a separate "proper name" part of speech.
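The rule described above can be sketched as a small standalone class. This is an illustration only, not the library's source: the class name, the tag strings, and the regular expression used to decide what "appears to be a number" are all assumptions made for this example.

```java
// Standalone sketch of the tagging rule described above; not SimpleTagger's source.
final class SimpleTaggingRuleSketch {
    // Hypothetical tag strings, chosen only for this illustration.
    static final String NOUN_POS = "noun";
    static final String NAME_POS = "proper-name";
    static final String NUMBER_POS = "number";

    // Assumption: a word "appears to be a number" if it consists of an
    // optional sign, digits with optional commas, and an optional decimal part.
    static boolean looksLikeNumber(String word) {
        return word.matches("[+-]?\\d[\\d,]*(\\.\\d+)?");
    }

    // Numbers are checked first, then capitalization; everything else is a noun.
    static String tag(String word) {
        if (word == null || word.isEmpty()) {
            return NOUN_POS;
        }
        if (looksLikeNumber(word)) {
            return NUMBER_POS;
        }
        if (Character.isUpperCase(word.charAt(0))) {
            return NAME_POS;
        }
        return NOUN_POS;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"castle", "London", "1603", "3.14"}) {
            System.out.println(w + " -> " + tag(w));
        }
    }
}
```

Checking for numbers before capitalization is a design choice of this sketch; the order matters only if a token could satisfy both tests.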

This simple tagger is useful as a backup for a more sophisticated tagger when unknown words are encountered.
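The backup role can be pictured with the following minimal sketch, in which a lexicon lookup stands in for the more sophisticated tagger and a simple rule handles unknown words. Everything here is hypothetical: the class, the tiny lexicon, and the tag strings are assumptions for illustration and do not reflect how the library wires its taggers together.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the fallback pattern: primary lookup first, simple rule for unknowns.
final class BackupTaggingSketch {
    // Hypothetical lexicon standing in for a more sophisticated tagger.
    private final Map<String, String> lexicon = new HashMap<>();

    BackupTaggingSketch() {
        lexicon.put("the", "determiner");
        lexicon.put("runs", "verb");
    }

    // Simplified fallback rule: number, then capitalized word, else noun.
    static String simpleTag(String word) {
        if (word.matches("\\d+")) {
            return "number";
        }
        if (!word.isEmpty() && Character.isUpperCase(word.charAt(0))) {
            return "proper-name";
        }
        return "noun";
    }

    // Use the lexicon when the word is known; otherwise fall back.
    String tag(String word) {
        String known = lexicon.get(word);
        return known != null ? known : simpleTag(word);
    }

    public static void main(String[] args) {
        BackupTaggingSketch tagger = new BackupTaggingSketch();
        for (String w : new String[] {"the", "Verona", "42", "zyzzyva"}) {
            System.out.println(w + " -> " + tagger.tag(w));
        }
    }
}
```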


Field Summary
protected static java.lang.String namePOS
          Proper name part of speech tag.
protected static java.lang.String nounPOS
          Noun part of speech tag.
protected static java.lang.String numberPOS
          Number part of speech tag.
 
Fields inherited from class edu.northwestern.at.utils.corpuslinguistics.postagger.AbstractPartOfSpeechTagger
contextRules, contextualSmoother, dynamicLexicon, lexicalRules, lexicalSmoother, lexicon, logger, partOfSpeechGuesser, postTokenizer, retagger, ruleCorrections, transitionMatrix
 
Constructor Summary
SimpleTagger()
          Create a simple tagger.
SimpleTagger(java.lang.String nounPOS, java.lang.String namePOS, java.lang.String numberPOS)
          Create a simple tagger using the specified part of speech tags.
 
Method Summary
<T extends AdornedWord>
java.util.List<T>
tagAdornedWordList(java.util.List<T> sentence)
          Tag a sentence.
 java.lang.String tagWord(AdornedWord word)
          Tag a single adorned word.
 java.lang.String tagWord(java.lang.String word)
          Tag a single word.
 java.lang.String toString()
          Return tagger description.
 
Methods inherited from class edu.northwestern.at.utils.corpuslinguistics.postagger.AbstractPartOfSpeechTagger
clearRuleCorrections, createPartOfSpeechGuesser, getDynamicLexicon, getLexicon, getLexicon, getLogger, getMostCommonTag, getPartOfSpeechGuesser, getRetagger, getRuleCorrections, getTagCount, getTagsForWord, getTransitionMatrix, incrementRuleCorrections, retagWords, setContextRules, setLexicalRules, setLexicon, setLogger, setPartOfSpeechGuesser, setRetagger, setTransitionMatrix, tagAdornedWordSentence, tagAdornedWordSentences, tagSentence, tagSentences, usesContextRules, usesLexicalRules, usesTransitionProbabilities
 
Methods inherited from class edu.northwestern.at.utils.IsCloseableObject
close
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface edu.northwestern.at.utils.corpuslinguistics.postagger.PartOfSpeechTagger
clearRuleCorrections, getLexicon, getLexicon, getPartOfSpeechGuesser, getRetagger, getRuleCorrections, getTagCount, getTagsForWord, getTransitionMatrix, incrementRuleCorrections, retagWords, setContextRules, setLexicalRules, setLexicon, setPartOfSpeechGuesser, setRetagger, setTransitionMatrix, tagAdornedWordSentence, tagAdornedWordSentences, tagSentence, tagSentences, usesContextRules, usesLexicalRules, usesTransitionProbabilities
 
Methods inherited from interface edu.northwestern.at.utils.IsCloseable
close
 

Field Detail

nounPOS

protected static java.lang.String nounPOS
Noun part of speech tag.


namePOS

protected static java.lang.String namePOS
Proper name part of speech tag.


numberPOS

protected static java.lang.String numberPOS
Number part of speech tag.

Constructor Detail

SimpleTagger

public SimpleTagger()
Create a simple tagger.


SimpleTagger

public SimpleTagger(java.lang.String nounPOS,
                    java.lang.String namePOS,
                    java.lang.String numberPOS)
Create a simple tagger using the specified part of speech tags.

Parameters:
nounPOS - Part of speech tag for a noun.
namePOS - Part of speech tag for a proper name.
numberPOS - Part of speech tag for a number.

Method Detail

tagAdornedWordList

public <T extends AdornedWord> java.util.List<T> tagAdornedWordList(java.util.List<T> sentence)
Tag a sentence.

Specified by:
tagAdornedWordList in interface PartOfSpeechTagger
Specified by:
tagAdornedWordList in class AbstractPartOfSpeechTagger
Parameters:
sentence - The sentence as a List of AdornedWord objects.
Returns:
The input List with its words tagged with parts of speech.
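
A generic method of this shape, which tags each element and hands back the same list, might look like the sketch below. The Word class is a hypothetical stand-in for AdornedWord (the real interface is not reproduced here), the tag strings are invented, and tagging in place is an assumption drawn from the "input List" wording above.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of tagging a sentence held as a generic list of word objects.
final class ListTaggingSketch {
    // Hypothetical stand-in for AdornedWord: a spelling plus a mutable tag.
    static class Word {
        final String spelling;
        String pos;

        Word(String spelling) {
            this.spelling = spelling;
        }
    }

    // Simplified rule with invented tag strings.
    static String tag(String word) {
        if (word.matches("\\d+")) {
            return "number";
        }
        if (!word.isEmpty() && Character.isUpperCase(word.charAt(0))) {
            return "proper-name";
        }
        return "noun";
    }

    // Tags every word in place and returns the same list instance.
    static <T extends Word> List<T> tagList(List<T> sentence) {
        for (T word : sentence) {
            word.pos = tag(word.spelling);
        }
        return sentence;
    }

    public static void main(String[] args) {
        List<Word> sentence = new ArrayList<>();
        sentence.add(new Word("Hamlet"));
        sentence.add(new Word("ghost"));
        sentence.add(new Word("1601"));
        for (Word w : tagList(sentence)) {
            System.out.println(w.spelling + " -> " + w.pos);
        }
    }
}
```

The bounded type parameter lets a caller pass a list of any subtype of the word type and get back a list of that same subtype, which mirrors the signature shown above.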

tagWord

public java.lang.String tagWord(java.lang.String word)
Tag a single word.

Specified by:
tagWord in interface CanTagOneWord
Parameters:
word - The word.
Returns:
The part of speech for the word.

tagWord

public java.lang.String tagWord(AdornedWord word)
Tag a single adorned word.

Specified by:
tagWord in interface CanTagOneWord
Parameters:
word - The adorned word.
Returns:
The part of speech assigned to the adorned word.

toString

public java.lang.String toString()
Return tagger description.

Overrides:
toString in class java.lang.Object
Returns:
Tagger description.