com.cybozu.labs.langdetect |
Ngram-based language detection methods for text.
|
com.cybozu.labs.langdetect.util |
Utilities used by the language detection methods.
|
com.megginson.sax |
SAX-based XML output filters.
|
com.rmtheis.langdetect.profile |
Contains the interface for language profiles for the
Cybozu Language Detector as well as the profiles
themselves.
|
edu.northwestern.at.morphadorner |
MorphAdorner adorns texts with word-based morphological information
such as parts of speech and lemmata.
|
edu.northwestern.at.morphadorner.corpuslinguistics.abbreviations |
Abbreviations.
|
edu.northwestern.at.morphadorner.corpuslinguistics.adornedword |
Adorned Word.
|
edu.northwestern.at.morphadorner.corpuslinguistics.apostokens |
Tokens which start or end with apostrophes.
|
edu.northwestern.at.morphadorner.corpuslinguistics.contractionexpander |
Contraction Expander.
|
edu.northwestern.at.morphadorner.corpuslinguistics.hyphenator |
Hyphenator.
|
edu.northwestern.at.morphadorner.corpuslinguistics.inflector |
Inflector.
|
edu.northwestern.at.morphadorner.corpuslinguistics.inflector.conjugator |
Conjugator.
|
edu.northwestern.at.morphadorner.corpuslinguistics.inflector.pluralizer |
Pluralizer.
|
edu.northwestern.at.morphadorner.corpuslinguistics.inflector.wordrule |
WordRule.
|
edu.northwestern.at.morphadorner.corpuslinguistics.inputter |
Text inputter for morphadorner.
|
edu.northwestern.at.morphadorner.corpuslinguistics.languagerecognizer |
Language recognizer.
|
edu.northwestern.at.morphadorner.corpuslinguistics.lemmatizer |
Lemmatization.
|
edu.northwestern.at.morphadorner.corpuslinguistics.lexicon |
Lexicon of spelling, lemmata, and parts of speech.
|
edu.northwestern.at.morphadorner.corpuslinguistics.multiwordunits |
Classes for extracting and manipulating multiword units.
|
edu.northwestern.at.morphadorner.corpuslinguistics.namerecognizer |
Finds named entities in text.
|
edu.northwestern.at.morphadorner.corpuslinguistics.namestandardizer |
Name standardizer.
|
edu.northwestern.at.morphadorner.corpuslinguistics.ngram |
Classes for creating and manipulating word ngrams.
|
edu.northwestern.at.morphadorner.corpuslinguistics.outputter |
Output generation for adorned text.
|
edu.northwestern.at.morphadorner.corpuslinguistics.partsofspeech |
Classes and methods for manipulating part of speech tags.
|
edu.northwestern.at.morphadorner.corpuslinguistics.partsofspeech.mapper |
Classes and methods for mapping one part of speech tag set to another.
|
edu.northwestern.at.morphadorner.corpuslinguistics.phonetics |
Classes for generating phonetic values for strings.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger |
Methods and interfaces for part of speech tagging and lemmatization.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.affix |
Affix part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.allunknown |
All unknown part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.bigram |
Bigram part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.bigramhybrid |
Hybrid bigram part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.guesser |
Guesses parts of speech for unknown words.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.hepple |
Implements Mark Hepple's part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.hepple.rules |
Implements tagging rules for Mark Hepple's part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.iretagger |
Retagger to correct "I" tagging issues.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.noopretagger |
Retagger which leaves initial tagging undisturbed.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.propernounretagger |
Retagger to correct proper noun tagging issues.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.regexp |
Regular expression-based part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.simple |
Simple part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.simplerulebased |
Simple rule-based part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.smoothing.contextual |
Methods and interfaces for lexical and contextual smoothing for part of
speech taggers.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.smoothing.lexical |
Methods and interfaces for lexical and contextual smoothing for part of
speech taggers.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.suffix |
Suffix part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.tcpretagger |
Retagger to correct TCP text issues.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.transitionmatrix |
Transition matrix.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.trigram |
Trigram part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.trigramhybrid |
Hybrid trigram part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.postagger.unigram |
Unigram part of speech tagger.
|
edu.northwestern.at.morphadorner.corpuslinguistics.sentencemelder |
Melds a list of words and punctuation into formatted sentences.
|
edu.northwestern.at.morphadorner.corpuslinguistics.sentencesplitter |
Splits text into sentences.
|
edu.northwestern.at.morphadorner.corpuslinguistics.spellingmapper |
BritishToUS is a simple filter which maps British spellings to American
(US) spellings.
|
edu.northwestern.at.morphadorner.corpuslinguistics.spellingstandardizer |
Spelling standardization.
|
edu.northwestern.at.morphadorner.corpuslinguistics.statistics |
Methods and interfaces for statistical methods useful in corpus linguistics.
|
edu.northwestern.at.morphadorner.corpuslinguistics.stemmer |
Stemming.
|
edu.northwestern.at.morphadorner.corpuslinguistics.stopwords |
Stop words.
|
edu.northwestern.at.morphadorner.corpuslinguistics.stringsimilarity |
Methods for computing the similarity of strings.
|
edu.northwestern.at.morphadorner.corpuslinguistics.syllablecounter |
Syllable counter.
|
edu.northwestern.at.morphadorner.corpuslinguistics.textsegmenter |
Text Segmentation.
|
edu.northwestern.at.morphadorner.corpuslinguistics.textsegmenter.c99 |
C99 text segmentation.
|
edu.northwestern.at.morphadorner.corpuslinguistics.textsegmenter.struct |
Utilities for linear text segmentation.
|
edu.northwestern.at.morphadorner.corpuslinguistics.textsegmenter.texttiling |
Text Tiling text segmentation.
|
edu.northwestern.at.morphadorner.corpuslinguistics.textsummarizer |
Text Summarization.
|
edu.northwestern.at.morphadorner.corpuslinguistics.thesaurus |
Thesaurus.
|
edu.northwestern.at.morphadorner.corpuslinguistics.tokenizer |
Text tokenization.
|
edu.northwestern.at.morphadorner.corpuslinguistics.wordcounts |
Word Counts.
|
edu.northwestern.at.morphadorner.examples |
Example programs using MorphAdorner facilities.
|
edu.northwestern.at.morphadorner.gate |
GATE interfaces for MorphAdorner components.
|
edu.northwestern.at.morphadorner.tei |
Utility classes for TEI XML processing.
|
edu.northwestern.at.morphadorner.tools |
Contains a variety of utility tools for creating and manipulating data files for use with MorphAdorner.
|
edu.northwestern.at.morphadorner.tools.addcharacteroffsets |
Create derived MorphAdorner files with character offsets to word tokens.
|
edu.northwestern.at.morphadorner.tools.addpseudopages |
Adds pseudopage milestones to an adorned file.
|
edu.northwestern.at.morphadorner.tools.adornedtosimpleteip5 |
AdornedToSimpleTEIP5 converts a base-level MorphAdorner file to a
more TEI P5-like format.
|
edu.northwestern.at.morphadorner.tools.adornedtosketch |
AdornedToSketch converts one or more adorned files to the verticalized
input required by the Sketch or NoSketch corpus search engines.
|
edu.northwestern.at.morphadorner.tools.adornedtotcf |
AdornedToTCF04 converts one or more adorned files to the
Text Corpus Format (TCF) v0.4 used by the CLARIN-D project.
|
edu.northwestern.at.morphadorner.tools.annolex |
Utilities for merging Annolex generated corrections with adorned XML files.
|
edu.northwestern.at.morphadorner.tools.applyxslt |
Applies an XSLT transformation to one or more files.
|
edu.northwestern.at.morphadorner.tools.compareadornedfiles |
Classes and utilities for comparing token streams
in adorned files and logging the differences to
XML format files.
|
edu.northwestern.at.morphadorner.tools.comparestringcounts |
Compare string counts in two files using Dunning's log-likelihood.
|
edu.northwestern.at.morphadorner.tools.countadornedwords |
Counts adorned words by processing XMLToTab output.
|
edu.northwestern.at.morphadorner.tools.countaffixes |
Counts affixes (suffixes and prefixes) of adorned words by processing MorphAdorner XML output.
|
edu.northwestern.at.morphadorner.tools.createlexicon |
Generates a MorphAdorner lexicon from training data.
|
edu.northwestern.at.morphadorner.tools.createsuffixlexicon |
Generates a MorphAdorner suffix lexicon from a word lexicon.
|
edu.northwestern.at.morphadorner.tools.findteitextlanguage |
Determines the language(s) in which a TEI text is written.
|
edu.northwestern.at.morphadorner.tools.fixquotes |
Fix quote marks in text and XML files.
|
edu.northwestern.at.morphadorner.tools.lgparser |
Link grammar parser driver.
|
edu.northwestern.at.morphadorner.tools.mergebrilllexicon |
Merges a Brill-style lexicon with a MorphAdorner lexicon.
|
edu.northwestern.at.morphadorner.tools.mergeenhancedbrilllexicon |
Merges an enhanced Brill-style lexicon with a MorphAdorner lexicon.
|
edu.northwestern.at.morphadorner.tools.mergespellingdata |
Merges multiple spelling map word lists into a single file.
|
edu.northwestern.at.morphadorner.tools.mergetextfiles |
Merges multiple text files into a single file.
|
edu.northwestern.at.morphadorner.tools.mergewordlists |
Merges multiple word list files into a single file.
|
edu.northwestern.at.morphadorner.tools.namedentities |
AdornWithNamedEntities adorns texts with named entities such as person,
location, time, date, and organization.
|
edu.northwestern.at.morphadorner.tools.punktabbreviationdetector |
PunktAbbreviationDetector uses the Punkt algorithm of
Kiss and Strunk to decide whether a token containing
one or more periods is an abbreviation.
|
edu.northwestern.at.morphadorner.tools.relemmatize |
Update lemmata and standard spellings in MorphAdorned XML files.
|
edu.northwestern.at.morphadorner.tools.sampletextfile |
Utilities to extract random or exact size samples from a text file.
|
edu.northwestern.at.morphadorner.tools.stripwordattributes |
Create derived MorphAdorner file with word elements stripped of attributes.
|
edu.northwestern.at.morphadorner.tools.tagdiff |
Compares training data to adorner output.
|
edu.northwestern.at.morphadorner.tools.taggertrainer |
Training programs for part of speech taggers.
|
edu.northwestern.at.morphadorner.tools.taggertrainer.ngram |
Generates transition matrices from training data for hidden Markov model
part of speech taggers.
|
edu.northwestern.at.morphadorner.tools.tcp |
The tcp package contains utilities aimed at processing Text Creation Partnership texts.
|
edu.northwestern.at.morphadorner.tools.unadorn |
Unadorn removes word level adornments from adorned files.
|
edu.northwestern.at.morphadorner.tools.validatexmlfiles |
Validate XML files.
|
edu.northwestern.at.morphadorner.tools.xmltotab |
Utilities to convert MorphAdorned XML files to tab-separated tabular form.
|
edu.northwestern.at.morphadorner.xgtagger |
Supervises adornment of XML texts.
|
edu.northwestern.at.utils |
Reusable utilities, primarily non-visual.
|
edu.northwestern.at.utils.cache |
Cache utilities.
|
edu.northwestern.at.utils.csv |
Reading and writing delimiter separated files.
|
edu.northwestern.at.utils.db.mysql |
Classes for databases using MySQL.
|
edu.northwestern.at.utils.html |
Utilities for processing HTML text.
|
edu.northwestern.at.utils.logger |
Logging utilities.
|
edu.northwestern.at.utils.math |
Reusable utilities for mathematics and arithmetic.
|
edu.northwestern.at.utils.math.distributions |
Methods for computing point probabilities and percentage points of
common statistical distributions.
|
edu.northwestern.at.utils.math.randomnumbers |
Implements the Mersenne Twister random number generator
as well as methods for generating random numbers from a variety of
statistical distributions.
|
edu.northwestern.at.utils.math.rootfinders |
Methods and interfaces for finding roots (zeroes) of functions.
|
edu.northwestern.at.utils.math.statistics |
Reusable utilities for statistics.
|
edu.northwestern.at.utils.net.mime |
MIME utilities.
|
edu.northwestern.at.utils.preprocessor |
A Java comment-based source preprocessor.
|
edu.northwestern.at.utils.servlets |
Reusable utilities for servlets.
|
edu.northwestern.at.utils.spellcheck |
Provides classes and methods for accessing spelling dictionaries and
performing spell checking.
|
edu.northwestern.at.utils.spellcheck.tools |
Programs to create spelling dictionaries for use with the spellcheck
classes.
|
edu.northwestern.at.utils.xml |
Reusable XML utilities.
|
edu.northwestern.at.utils.xml.jdom |
Reusable JDOM XML utilities.
|
jargs.gnu |
Jargs GNU Command Line Parser.
|
net.sf.jlinkgrammar |
JLinkGrammar is a Java port of the Carnegie Mellon University
link grammar parser, a syntactic parser for English.
|
org.jdom2.contrib.schema |
Schema.
|