Northwestern University Information Technology
AnnoLex is a collaborative data curation tool for use with Text Creation Partnership texts. AnnoLex supports the identification and correction of incompletely or incorrectly transcribed words. It can also be used to correct algorithmically applied lemmatization and part-of-speech tagging by hand. AnnoLex was developed by Craig Berry and Martin Mueller.
MergeAnnolexCorrectionsIntoAdornedXML merges corrections made in AnnoLex back into the source adorned TEI XML files.
mergeannolexcorrectionsintoadornedxml correctionsdirectory outputdirectory inputfiles
The corrections file is a tab-separated UTF-8 file containing the following columns.
The corrected spelling, lemmata, and parts of speech may all be empty when the operation is 3 (delete).
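As a rough illustration of how a tab-separated UTF-8 file with possibly empty fields might be read, here is a minimal Python sketch. It deliberately makes no assumption about the column names or their order. The function name `read_correction_rows` and the sample values are hypothetical and are not part of MorphAdorner.

```python
import csv
import io


def read_correction_rows(handle):
    """Yield each non-blank line of a tab-separated file as a list of fields."""
    # QUOTE_NONE keeps quotation marks inside tokens as literal characters.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if fields:  # csv.reader returns [] for a completely blank line
            yield fields


# Made-up sample: one row whose two middle fields are empty, then a blank line.
# The values "a" and "b" are placeholders, not real column contents.
sample = io.StringIO("a\t\t\tb\n\n")
rows = list(read_correction_rows(sample))
```

Empty fields come back as empty strings, so a row for a deleted word keeps the same number of fields as any other row. For a file on disk, a handle opened with `open(path, encoding="utf-8", newline="")` could be passed in the same way.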
The value of the "ord" (word ordinal) attribute for each word is adjusted to account for inserted and deleted words. The values of the "reg" (standard spelling) and "tok" (original token) attributes are generated as needed for updated and inserted words.
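The ordinal adjustment can be pictured with a small Python sketch. This is not the MorphAdorner implementation: the `Word` class, the `apply_edits` function, the way edits are passed in, and the choice to start ordinals at 1 are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    ord: int        # word ordinal (assumed here to start at 1)
    spelling: str


def apply_edits(words, deleted, inserted_after):
    """Return a new word list with deletions and insertions applied.

    deleted: set of original ord values to drop (hypothetical input shape).
    inserted_after: dict mapping an original ord value to the spellings
        to insert directly after that word (hypothetical input shape).
    """
    kept = []
    for word in words:
        if word.ord not in deleted:
            kept.append(word.spelling)
        # Insertions are anchored to the original ordinal, even if that
        # word itself is deleted.
        kept.extend(inserted_after.get(word.ord, []))
    # Renumber consecutively so ordinals reflect the edited sequence.
    return [Word(i, spelling) for i, spelling in enumerate(kept, start=1)]


words = [Word(1, "the"), Word(2, "quikc"), Word(3, "brown"), Word(4, "fox")]
merged = apply_edits(words, deleted={2}, inserted_after={3: ["red"]})
```

In this made-up example the word with ordinal 2 is removed and a word is inserted after ordinal 3, so "brown" moves from 3 to 2, the inserted "red" receives 3, and "fox" stays at 4.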
Whitespace markers "
Academic Technologies and Research Services,
NU Library 2East, 1970 Campus Drive, Evanston, IL 60208.