Northwestern University Information Technology
The text of many older works may not be clearly readable because of faded print, ink blotches, foxing, or other degradations of the printed source. Transcribers mark these unreadable sections in digital text copies using special characters or tag sequences. In TEI, the <gap> tag serves to mark sections of a text which cannot be transcribed because of problems in reading the original source.
It may be useful to try to repair individual damaged words by examining which letters appear in the same positions as the unreadable letters across a set of related texts. In essence this is the same as trying to find the missing letters of a word in a crossword puzzle. In some cases there is only a single plausible reconstruction for a damaged word. More often there are several possible reconstructions.
MorphAdorner implements a "gap filler" algorithm which looks at all the words in a given lexicon that do not contain gaps and tries to find potential matches for a word containing individual letter gaps. MorphAdorner uses a trie structure to hold all the words without gaps, which supports fast searches for words that contain unknown letters.
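The trie search described above can be illustrated with a minimal Python sketch. This is not MorphAdorner's actual code: the class names `TrieNode` and `GapFiller`, the method `candidates`, and the use of `?` to stand for a single unreadable letter are all assumptions made for this example. The sketch stores every gap-free word in a trie and, when matching a damaged word, follows every child branch at a gap position and only the matching branch elsewhere.

```python
from typing import Dict, List

GAP = "?"  # assumed placeholder for one unreadable letter (not MorphAdorner's notation)


class TrieNode:
    """One node of the trie; children are keyed by a single letter."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False


class GapFiller:
    """Hypothetical gap filler: holds gap-free words in a trie and matches patterns."""

    def __init__(self, lexicon: List[str]) -> None:
        self.root = TrieNode()
        for word in lexicon:
            if GAP not in word:  # only words without gaps go into the trie
                self._insert(word)

    def _insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def candidates(self, damaged: str) -> List[str]:
        """Return all same-length lexicon words that agree with the known letters."""
        results: List[str] = []
        self._search(self.root, damaged, 0, [], results)
        return sorted(results)

    def _search(self, node: TrieNode, pattern: str, pos: int,
                prefix: List[str], results: List[str]) -> None:
        if pos == len(pattern):
            if node.is_word:
                results.append("".join(prefix))
            return
        ch = pattern[pos]
        if ch == GAP:
            # Unknown letter: try every branch below this node.
            branches = list(node.children.items())
        else:
            # Known letter: follow the single matching branch, if it exists.
            child = node.children.get(ch)
            branches = [(ch, child)] if child is not None else []
        for letter, child in branches:
            prefix.append(letter)
            self._search(child, pattern, pos + 1, prefix, results)
            prefix.pop()


filler = GapFiller(["love", "live", "lose", "dove", "loves"])
print(filler.candidates("l?ve"))
print(filler.candidates("?ove"))
```

With this small made-up lexicon, a pattern such as `l?ve` has more than one reconstruction, which mirrors the common case noted above. A real gap filler would also need some way to rank the candidates, for example by word frequency in related texts; that step is left out of this sketch.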
You can try MorphAdorner's gap filler online.
Academic Technologies and Research Services,
NU Library 2East, 1970 Campus Drive, Evanston, IL 60208.