Northwestern University Information Technology
MorphAdorner V2.0
This is MorphAdorner v2.0, initially released in September 2013. The online documentation is complete but needs some further edits. A draft of the printable documentation is now available in PDF, EPUB, and MOBI formats.
MorphAdorner is a Java command-line program that acts as a pipeline manager for processes performing morphological adornment of words in a text. We use the term "adornment" in preference to terms such as "annotation" or "tagging," which carry too many alternative and confusing meanings. Adornment harkens back to the medieval sense of manuscript adornment or illumination: attaching pictures and marginal comments to texts, as a scribal monk would.
Currently MorphAdorner provides methods for adorning text with standard spellings, parts of speech and lemmata. MorphAdorner also provides facilities for tokenizing text, recognizing sentence boundaries, and extracting names and places. You can find out more about each of these facilities, and see online demonstrations of each, by consulting the documentation section of this web site.
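To make the idea of adornment concrete, here is a minimal sketch in Python of what attaching a standard spelling, part of speech, and lemma to each token might look like. It is an illustration only and does not use MorphAdorner's Java API: the `TOY_LEXICON` entries, the tag names, and the `tokenize` and `adorn` functions are all assumptions made up for this example, and a real adorner relies on much larger lexicons and on statistical models rather than a lookup table.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon: original spelling -> (standard spelling, part of speech, lemma).
# The entries and tag names are illustrative assumptions, not MorphAdorner data.
TOY_LEXICON = {
    "he": ("he", "pronoun", "he"),
    "loued": ("loved", "verb", "love"),
    "the": ("the", "determiner", "the"),
    "bookes": ("books", "noun", "book"),
}


@dataclass
class AdornedWord:
    token: str     # spelling as it appears in the text
    standard: str  # standardized spelling
    pos: str       # part of speech
    lemma: str     # dictionary headword


def tokenize(text):
    """Split text into runs of letters and single punctuation marks."""
    return re.findall(r"[A-Za-z]+|[^\sA-Za-z]", text)


def adorn(text):
    """Attach a standard spelling, part of speech, and lemma to each token."""
    adorned = []
    for token in tokenize(text):
        # Tokens missing from the toy lexicon keep their spelling and get an "unknown" tag.
        standard, pos, lemma = TOY_LEXICON.get(
            token.lower(), (token, "unknown", token.lower())
        )
        adorned.append(AdornedWord(token, standard, pos, lemma))
    return adorned


if __name__ == "__main__":
    for word in adorn("He loued the bookes."):
        print(word)
```

Each stage here (tokenizing, then looking up adornments) is a separate function, loosely mirroring the pipeline arrangement described above.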
MorphAdorner underwent continuous development in tandem with three projects: WordHoard, Monk, and Virtual Orthographic Standardization and Part of Speech Tagging (VOSPOS), as well as smaller-scale faculty research projects at Northwestern University. All three projects are now complete. Although MorphAdorner was used in these projects, it is a separate project in its own right.
MorphAdorner saw heavy use in the Monk project, which sought to adorn a large number of English-language texts from the early Modern English period to the start of the twentieth century. By the project's end in April 2009, about 151.5 million words had been adorned.
In October 2012 we began a new MorphAdorner v2.0 project to improve MorphAdorner's processing of several Text Creation Partnership (TCP) corpora beyond what was attempted during the Monk project. These corpora included the Early English Books Online (EEBO) corpus, the Eighteenth Century Collections Online (ECCO) corpus, and the Evans Early American Imprint Collection. You can read more about MorphAdorner's processing of TCP texts.
We improved MorphAdorner's integration with Abbot. Abbot converts dissimilar collections of XML texts into a common interoperable form. Abbot was designed and implemented by Brian L. Pytlik Zillig, Stephen Ramsay, Martin Mueller, and Frank Smutniak.
Our goal in the Abbot and EEBO MorphAdorner collaboration is to turn the TCP texts into the foundation for a "Book of English," defined as:
Please see the modification history for a general overview of the changes from MorphAdorner v1 to v2.
Announcements and News
Announcements and news about changes to MorphAdorner
Documentation for using MorphAdorner
Downloading and installing the MorphAdorner client and server software
Glossary of MorphAdorner terms
Natural language processing references
Licenses for MorphAdorner and associated software
Online examples of MorphAdorner Server facilities
Slides from talks about MorphAdorner
Technical information for programmers using MorphAdorner
Academic Technologies and Research Services,
NU Library 2East, 1970 Campus Drive, Evanston, IL 60208.