Northwestern University Information Technology | MorphAdorner V2.0
CreateSuffixLexicon creates a suffix lexicon from a word lexicon.
Usage:
createsuffixlexicon inputwordlexicon.lex suffixlexicon.lex maxsuffixlength maxsuffixcount allowedpostagsfilename
where
inputwordlexicon.lex specifies the name of the input word lexicon file, in MorphAdorner format.
suffixlexicon.lex specifies the name of the output file to receive the suffix lexicon.
maxsuffixlength specifies the maximum length of a suffix generated for the suffix lexicon. The default is 6 characters.
maxsuffixcount specifies the maximum number of times a spelling may appear and still have its suffixes added to the suffix lexicon. The default is to include all words regardless of count.
For some applications you may want to restrict the suffix lexicon to contain suffixes only for infrequently occurring words. Values of 10 (only include spellings which appear 10 or fewer times in the training data) or 1 (only include spellings which appear once in the training data) are common choices.
allowedpostagsfilename specifies the name of a file containing a list of part of speech tags to use when constructing the suffix lexicon. Omit the tags for closed word classes, to which new words should not be added. The MorphAdorner release provides the file nuposallowedpostags.txt in the release data directory, which defines a default set of NUPos tags to use when creating a suffix lexicon.
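The effect of the three optional parameters can be illustrated with a minimal Python sketch. This is not MorphAdorner's implementation: the function name build_suffix_lexicon, the in-memory dictionary layout (spelling to tag to count), the way counts are summed per suffix, and the tag names in the sample data are all assumptions made for illustration, and the MorphAdorner lexicon file format is not reproduced here.

```python
from collections import defaultdict


def build_suffix_lexicon(word_lexicon, max_suffix_length=6,
                         max_suffix_count=None, allowed_tags=None):
    """Build a suffix lexicon from a word lexicon (illustrative sketch only).

    word_lexicon maps each spelling to a dict of {tag: count}.
    max_suffix_count=None means spellings are included regardless of count.
    allowed_tags=None means every tag is allowed.
    """
    suffix_lexicon = defaultdict(lambda: defaultdict(int))
    for spelling, tag_counts in word_lexicon.items():
        # Assumption: a spelling's count is the sum of its per-tag counts.
        total = sum(tag_counts.values())
        if max_suffix_count is not None and total > max_suffix_count:
            continue  # spelling is too frequent; skip its suffixes
        longest = min(max_suffix_length, len(spelling))
        for length in range(1, longest + 1):
            suffix = spelling[-length:]
            for tag, count in tag_counts.items():
                if allowed_tags is None or tag in allowed_tags:
                    suffix_lexicon[suffix][tag] += count
    return {suffix: dict(tags) for suffix, tags in suffix_lexicon.items()}


# Hypothetical sample data; tag names are illustrative.
word_lexicon = {
    "walking": {"vvg": 3},
    "talking": {"vvg": 1, "n1": 1},
    "the": {"dt": 50},
}
lexicon = build_suffix_lexicon(
    word_lexicon,
    max_suffix_length=3,
    max_suffix_count=10,
    allowed_tags={"vvg", "n1"},
)
print(lexicon["ing"])
```

In this sample the spelling "the" contributes nothing, both because its count exceeds the limit of 10 and because its tag is not in the allowed set, which mirrors the advice to leave closed word classes out of the allowed tags file.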
The suffix lexicon is used by the part of speech taggers to guess the potential parts of speech for unknown words which do not appear in the word lexicon. For each successively shorter ending substring of the unknown word, the guesser looks up that substring in the suffix lexicon. When the substring exists in the suffix lexicon, the guesser assigns its associated parts of speech to the unknown word.
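The lookup described above can be sketched in Python as follows. This is a simplified illustration, not the code used by the MorphAdorner taggers: the function name guess_tags, the dictionary layout, the sample tags, and the choice to stop at the first (longest) matching suffix are assumptions.

```python
def guess_tags(word, suffix_lexicon, max_suffix_length=6):
    """Guess candidate tags for an unknown word (illustrative sketch only).

    Tries successively shorter ending substrings of the word and returns
    the first suffix found in the suffix lexicon together with its tags.
    Returns (None, {}) when no ending substring is in the lexicon.
    """
    longest = min(max_suffix_length, len(word))
    for length in range(longest, 0, -1):
        suffix = word[-length:]
        tags = suffix_lexicon.get(suffix)
        if tags:
            return suffix, tags
    return None, {}


# Hypothetical suffix lexicon; tag names and counts are illustrative.
suffix_lexicon = {
    "ing": {"vvg": 4, "n1": 1},
    "s": {"n2": 7},
}

matched_suffix, candidate_tags = guess_tags("blorfing", suffix_lexicon)
print(matched_suffix, candidate_tags)
```

For the made-up word "blorfing" the sketch tries "orfing", "rfing", and "fing" before matching "ing", so the word inherits the tags recorded for that suffix.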
Academic Technologies and Research Services,
NU Library 2East, 1970 Campus Drive Evanston, IL 60208. |