Morphological Adorner

MorphAdorner adorns words in text with morphological tags.

Packages 
Package Description
com.cybozu.labs.langdetect
Ngram-based language detection methods for text.
com.cybozu.labs.langdetect.util
Utilities used by the language detection methods.
com.megginson.sax
SAX-based XML output filters.
com.rmtheis.langdetect.profile
Contains the interface for language profiles for the Cybozu Language Detector as well as the profiles themselves.
edu.northwestern.at.morphadorner
MorphAdorner adorns texts with word-based morphological information such as parts of speech and lemmata.
edu.northwestern.at.morphadorner.corpuslinguistics.abbreviations
Abbreviations.
edu.northwestern.at.morphadorner.corpuslinguistics.adornedword
Adorned Word.
edu.northwestern.at.morphadorner.corpuslinguistics.apostokens
Tokens which start or end with apostrophes.
edu.northwestern.at.morphadorner.corpuslinguistics.contractionexpander
Contraction Expander.
edu.northwestern.at.morphadorner.corpuslinguistics.hyphenator
Hyphenator.
edu.northwestern.at.morphadorner.corpuslinguistics.inflector
Inflector.
edu.northwestern.at.morphadorner.corpuslinguistics.inflector.conjugator
Conjugator.
edu.northwestern.at.morphadorner.corpuslinguistics.inflector.pluralizer
Pluralizer.
edu.northwestern.at.morphadorner.corpuslinguistics.inflector.wordrule
WordRule.
edu.northwestern.at.morphadorner.corpuslinguistics.inputter
Text inputter for MorphAdorner.
edu.northwestern.at.morphadorner.corpuslinguistics.languagerecognizer
Language recognizer.
edu.northwestern.at.morphadorner.corpuslinguistics.lemmatizer
Lemmatization.
edu.northwestern.at.morphadorner.corpuslinguistics.lexicon
Lexicon of spelling, lemmata, and parts of speech.
edu.northwestern.at.morphadorner.corpuslinguistics.multiwordunits
Classes for extracting and manipulating multiword units.
edu.northwestern.at.morphadorner.corpuslinguistics.namerecognizer
Finds named entities in text.
edu.northwestern.at.morphadorner.corpuslinguistics.namestandardizer
Name standardizer.
edu.northwestern.at.morphadorner.corpuslinguistics.ngram
Classes for creating and manipulating word ngrams.
edu.northwestern.at.morphadorner.corpuslinguistics.outputter
Output generation for adorned text.
edu.northwestern.at.morphadorner.corpuslinguistics.partsofspeech
Classes and methods for manipulating part of speech tags.
edu.northwestern.at.morphadorner.corpuslinguistics.partsofspeech.mapper
Classes and methods for mapping one part of speech tag set to another.
edu.northwestern.at.morphadorner.corpuslinguistics.phonetics
Classes for generating phonetic values for strings.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger
Methods and interfaces for part of speech tagging and lemmatization.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.affix
Affix part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.allunknown
All unknown part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.bigram
Bigram part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.bigramhybrid
Hybrid bigram part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.guesser
Guesses parts of speech for unknown words.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.hepple
Implements Mark Hepple's part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.hepple.rules
Implements tagging rules for Mark Hepple's part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.iretagger
Retagger to correct "I" tagging issues.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.noopretagger
Retagger which leaves initial tagging undisturbed.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.propernounretagger
Retagger to correct proper noun tagging issues.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.regexp
Regular expression-based part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.simple
Simple part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.simplerulebased
Simple rule-based part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.smoothing.contextual
Methods and interfaces for lexical and contextual smoothing for part of speech taggers.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.smoothing.lexical
Methods and interfaces for lexical and contextual smoothing for part of speech taggers.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.suffix
Suffix part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.tcpretagger
Retagger to correct TCP text issues.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.transitionmatrix
Transition matrix.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.trigram
Trigram part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.trigramhybrid
Hybrid trigram part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.unigram
Unigram part of speech tagger.
edu.northwestern.at.morphadorner.corpuslinguistics.sentencemelder
Melds a list of words and punctuation into formatted sentences.
edu.northwestern.at.morphadorner.corpuslinguistics.sentencesplitter
Splits text into sentences.
edu.northwestern.at.morphadorner.corpuslinguistics.spellingmapper
BritishToUS is a simple filter which maps British spellings to American (US) spellings.
edu.northwestern.at.morphadorner.corpuslinguistics.spellingstandardizer
Spelling standardization.
edu.northwestern.at.morphadorner.corpuslinguistics.statistics
Methods and interfaces for statistical methods useful in corpus linguistics.
edu.northwestern.at.morphadorner.corpuslinguistics.stemmer
Stemming.
edu.northwestern.at.morphadorner.corpuslinguistics.stopwords
Stop words.
edu.northwestern.at.morphadorner.corpuslinguistics.stringsimilarity
Methods for computing the similarity of strings.
edu.northwestern.at.morphadorner.corpuslinguistics.syllablecounter
Syllable counter.
edu.northwestern.at.morphadorner.corpuslinguistics.textsegmenter
Text Segmentation.
edu.northwestern.at.morphadorner.corpuslinguistics.textsegmenter.c99
C99 text segmentation.
edu.northwestern.at.morphadorner.corpuslinguistics.textsegmenter.struct
Utilities for linear text segmentation.
edu.northwestern.at.morphadorner.corpuslinguistics.textsegmenter.texttiling
TextTiling text segmentation.
edu.northwestern.at.morphadorner.corpuslinguistics.textsummarizer
Text Summarization.
edu.northwestern.at.morphadorner.corpuslinguistics.thesaurus
Thesaurus.
edu.northwestern.at.morphadorner.corpuslinguistics.tokenizer
Text tokenization.
edu.northwestern.at.morphadorner.corpuslinguistics.wordcounts
Word Counts.
edu.northwestern.at.morphadorner.examples
Example programs using MorphAdorner facilities.
edu.northwestern.at.morphadorner.gate
GATE interfaces for MorphAdorner components.
edu.northwestern.at.morphadorner.tei
Utility classes for TEI XML processing.
edu.northwestern.at.morphadorner.tools
Contains a variety of utility tools for creating and manipulating data files for use with MorphAdorner.
edu.northwestern.at.morphadorner.tools.addcharacteroffsets
Create derived MorphAdorner files with character offsets to word tokens.
edu.northwestern.at.morphadorner.tools.addpseudopages
Adds pseudopage milestones to an adorned file.
edu.northwestern.at.morphadorner.tools.adornedtosimpleteip5
AdornedToSimpleTEIP5 converts a base-level MorphAdorner file to a more TEI P5-like format.
edu.northwestern.at.morphadorner.tools.adornedtosketch
AdornedToSketch converts one or more adorned files to the verticalized input required by the Sketch or NoSketch corpus search engines.
edu.northwestern.at.morphadorner.tools.adornedtotcf
AdornedToTCF04 converts one or more adorned files to the Text Corpus Format (TCF) v0.4 used by the CLARIN-D project.
edu.northwestern.at.morphadorner.tools.annolex
Utilities for merging Annolex generated corrections with adorned XML files.
edu.northwestern.at.morphadorner.tools.applyxslt
Applies XSLT transformation to one or more files.
edu.northwestern.at.morphadorner.tools.compareadornedfiles
Classes and utilities for comparing token streams in adorned files and logging the differences to XML format files.
edu.northwestern.at.morphadorner.tools.comparestringcounts
Compare string counts in two files using Dunning's log-likelihood.
edu.northwestern.at.morphadorner.tools.countadornedwords
Counts adorned words by processing XMLToTab output.
edu.northwestern.at.morphadorner.tools.countaffixes
Counts affixes (suffixes and prefixes) of adorned words by processing MorphAdorner XML output.
edu.northwestern.at.morphadorner.tools.createlexicon
Generates a MorphAdorner lexicon from training data.
edu.northwestern.at.morphadorner.tools.createsuffixlexicon
Generates a MorphAdorner suffix lexicon from a word lexicon.
edu.northwestern.at.morphadorner.tools.findteitextlanguage
Determines the language(s) in which a TEI text is written.
edu.northwestern.at.morphadorner.tools.fixquotes
Fix quote marks in text and XML files.
edu.northwestern.at.morphadorner.tools.lgparser
Link grammar parser driver.
edu.northwestern.at.morphadorner.tools.mergebrilllexicon
Merges a Brill-style lexicon with a MorphAdorner lexicon.
edu.northwestern.at.morphadorner.tools.mergeenhancedbrilllexicon
Merges an enhanced Brill-style lexicon with a MorphAdorner lexicon.
edu.northwestern.at.morphadorner.tools.mergespellingdata
Merges multiple spelling map word lists into a single file.
edu.northwestern.at.morphadorner.tools.mergetextfiles
Merges multiple text files into a single file.
edu.northwestern.at.morphadorner.tools.mergewordlists
Merges multiple word list files into a single file.
edu.northwestern.at.morphadorner.tools.namedentities
AdornWithNamedEntities adorns texts with named entities such as person, location, time, date, and organization.
edu.northwestern.at.morphadorner.tools.punktabbreviationdetector
PunktAbbreviationDetector uses the Punkt algorithm of Kiss and Strunk to decide whether a token containing one or more periods is an abbreviation.
edu.northwestern.at.morphadorner.tools.relemmatize
Update lemmata and standard spellings in MorphAdorned XML files.
edu.northwestern.at.morphadorner.tools.sampletextfile
Utilities to extract random or exact size samples from a text file.
edu.northwestern.at.morphadorner.tools.stripwordattributes
Create derived MorphAdorner file with word elements stripped of attributes.
edu.northwestern.at.morphadorner.tools.tagdiff
Compares training data to adorner output.
edu.northwestern.at.morphadorner.tools.taggertrainer
Training programs for part of speech taggers.
edu.northwestern.at.morphadorner.tools.taggertrainer.ngram
Generates transition matrices from training data for hidden Markov model part of speech taggers.
edu.northwestern.at.morphadorner.tools.tcp
The tcp package contains utilities aimed at processing Text Creation Partnership texts.
edu.northwestern.at.morphadorner.tools.unadorn
Unadorn removes word level adornments from adorned files.
edu.northwestern.at.morphadorner.tools.validatexmlfiles
Validate XML files.
edu.northwestern.at.morphadorner.tools.xmltotab
Utilities to convert MorphAdorned XML files to tab-separated tabular form.
edu.northwestern.at.morphadorner.xgtagger
Supervises adornment of XML texts.
edu.northwestern.at.utils
Reusable utilities, primarily non-visual.
edu.northwestern.at.utils.cache
Cache utilities.
edu.northwestern.at.utils.csv
Reading and writing delimiter separated files.
edu.northwestern.at.utils.db.mysql
Classes for databases using MySQL.
edu.northwestern.at.utils.html
Utilities for processing HTML text.
edu.northwestern.at.utils.logger
Logging utilities.
edu.northwestern.at.utils.math
Reusable utilities for mathematics and arithmetic.
edu.northwestern.at.utils.math.distributions
Methods for computing point probabilities and percentage points of common statistical distributions.
edu.northwestern.at.utils.math.randomnumbers
Implements the Mersenne Twister random number generator as well as methods for generating random numbers from a variety of statistical distributions.
edu.northwestern.at.utils.math.rootfinders
Methods and interfaces for finding roots (zeroes) of functions.
edu.northwestern.at.utils.math.statistics
Reusable utilities for statistics.
edu.northwestern.at.utils.net.mime
MIME utilities.
edu.northwestern.at.utils.preprocessor
A Java comment-based source preprocessor.
edu.northwestern.at.utils.servlets
Reusable utilities for servlets.
edu.northwestern.at.utils.spellcheck
Provides classes and methods for accessing spelling dictionaries and performing spell checking.
edu.northwestern.at.utils.spellcheck.tools
Programs to create spelling dictionaries for use with the spellcheck classes.
edu.northwestern.at.utils.xml
Reusable XML utilities.
edu.northwestern.at.utils.xml.jdom
Reusable JDOM XML utilities.
jargs.gnu
Jargs GNU Command Line Parser.
net.sf.jlinkgrammar
JLinkGrammar is a Java port of the Carnegie Mellon University link grammar parser, a syntactic parser for English.
org.jdom2.contrib.schema
Schema.