Northwestern University Information Technology
MorphAdorner V2.0
NGramTaggerTrainer generates a tag transition file for the MorphAdorner bigram and trigram taggers from part of speech training data and a word lexicon.
Usage:
ngramtaggertrainer trainingdata.tab wordlexicon.lex transitionmatrix.mat
The training data file (trainingdata.tab) is a tab-separated UTF-8 file containing the part of speech training data generated from the training texts. Only the first two columns of the training data are used.
The word lexicon (wordlexicon.lex) is a word lexicon in MorphAdorner format.
The output tag transition file (transitionmatrix.mat) is a UTF-8 file containing the data needed by the MorphAdorner bigram and trigram taggers.
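As a rough illustration of the kind of data a bigram or trigram tagger needs, the sketch below counts tag-to-tag transitions from tab-separated training lines. It is not MorphAdorner's implementation and does not write MorphAdorner's transition file format. It assumes that the first column holds the word and the second its part of speech tag, and that a line with fewer than two columns marks a sentence break. The function name count_tag_transitions, the sample tags, and the sample data are all hypothetical.

```python
import csv
import io
from collections import Counter


def count_tag_transitions(lines):
    """Count tag bigrams and trigrams from tab-separated training lines.

    Assumptions (not taken from the MorphAdorner documentation):
    column 1 is the word, column 2 is its part of speech tag, and a
    line with fewer than two columns ends the current sentence.
    Columns after the second are ignored.
    """
    bigrams, trigrams = Counter(), Counter()
    history = []  # up to the two most recent tags in the current sentence
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 2 or not row[1].strip():
            history = []  # sentence break: do not count across it
            continue
        tag = row[1].strip()
        if len(history) >= 1:
            bigrams[(history[-1], tag)] += 1
        if len(history) >= 2:
            trigrams[(history[-2], history[-1], tag)] += 1
        history = (history + [tag])[-2:]
    return bigrams, trigrams


# Hypothetical training data: word, tag, and a third column that is ignored.
sample = (
    "The\tdt\tthe\n"
    "dog\tn1\tdog\n"
    "barks\tvvz\tbark\n"
    "\n"
    "The\tdt\tthe\n"
    "cat\tn1\tcat\n"
)
bigrams, trigrams = count_tag_transitions(io.StringIO(sample))
for (previous_tag, tag), count in sorted(bigrams.items()):
    print(previous_tag, tag, count)
```

A real trainer would typically turn such counts into smoothed probabilities and consult the word lexicon as well; both steps are omitted here.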
Academic Technologies and Research Services,
NU Library 2East, 1970 Campus Drive, Evanston, IL 60208.