  Adorning Named Entities
 
 

AdornWithNamedEntities adorns XML texts with named entities such as person, location, time, date, and organization. It is an experimental procedure based on ANNIE, the Gate named entity extractor, with a few modifications to improve its utility for literary purposes.

Usage:

adornwithnamedentities outputdirectory input1.xml input2.xml ...

where

  • outputdirectory -- output directory to receive the XML files adorned with named entities.
  • input*.xml -- one or more input TEI XML files.
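
The argument order above (output directory first, then the input files) can be sketched as a small Python helper that assembles the command line. This is only an illustration: the function name build_adorn_command and the sample file names are hypothetical, and the sketch assumes the adornwithnamedentities script is reachable under that name on the current system.

```python
def build_adorn_command(output_dir, input_files, script="adornwithnamedentities"):
    """Return the argument list: script, output directory, then input files.

    Hypothetical helper; the script name is assumed to be on the PATH.
    """
    inputs = [str(name) for name in input_files]
    if not inputs:
        raise ValueError("at least one input TEI XML file is required")
    return [script, str(output_dir), *inputs]


# Sample file names are invented for illustration.
command = build_adorn_command("adorned", ["emma.xml", "persuasion.xml"])
print(" ".join(command))
# prints: adornwithnamedentities adorned emma.xml persuasion.xml
```

The resulting list could be passed to subprocess.run(command, check=True) to launch the adorner, provided the script is installed where the shell can find it.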

The named entity adorner does not always recognize entities that cross soft tags. Thus "Emma Woodhouse" may be recognized as two separate entities when a soft tag falls between the two names. AdornWithNamedEntities should be run on the input files before they are submitted to MorphAdorner.

Gate uses the following XML tags to mark named entities. AdornWithNamedEntities maps each of these to the TEI Analytics <rs> element with a specific type= attribute value.

Gate                                     TEI Analytics
<Date> for a date                        <rs type="date">
<Location> for a location                <rs type="location">
<Money> for an amount of money           <rs type="money">
<Organization> for an organization       <rs type="organization">
<Person> for a person                    <rs type="person">
<Time> for a time                        <rs type="time">
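
The tag mapping above can be illustrated with a minimal Python sketch that renames Gate-style elements to <rs> and sets the matching type= value. This is not MorphAdorner's own implementation (which is written in Java and is not shown here); the function name gate_to_rs and the sample sentence are invented for illustration, and the sketch assumes the Gate tags appear without an XML namespace.

```python
import xml.etree.ElementTree as ET

# Gate tag name -> value of the type= attribute on the TEI Analytics <rs> element.
GATE_TO_RS_TYPE = {
    "Date": "date",
    "Location": "location",
    "Money": "money",
    "Organization": "organization",
    "Person": "person",
    "Time": "time",
}


def gate_to_rs(root):
    """Rename each Gate entity element in the tree to <rs type="...">, in place."""
    for elem in root.iter():
        rs_type = GATE_TO_RS_TYPE.get(elem.tag)
        if rs_type is not None:
            elem.tag = "rs"
            elem.set("type", rs_type)
    return root


# Invented sample input; assumes un-namespaced Gate tags.
sample = "<p><Person>Emma Woodhouse</Person> lived in <Location>Highbury</Location>.</p>"
print(ET.tostring(gate_to_rs(ET.fromstring(sample)), encoding="unicode"))
# prints: <p><rs type="person">Emma Woodhouse</rs> lived in <rs type="location">Highbury</rs>.</p>
```

Elements whose names are not in the table are left untouched, so ordinary TEI markup passes through unchanged.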

Gate seems to generate <Date> in some places where one might expect <Time> to appear.

In addition to the named entity types generated by Gate, AdornWithNamedEntities can also generate <rs type="literary"> for literary references. This feature has not been fully implemented.

 

Academic Technologies  NU Library 2East  1970 Campus Drive  Evanston, IL 60208
E-mail: pib@northwestern.edu
Last updated Sun Mar 15 05:52:32 2009. © 2007, 2008 Northwestern University