Literary texts are filled with names of people and places.
MorphAdorner includes a simple name recognizer that extracts names so that
you can build lists of characters and geographical settings. It locates
probable names in a text by recognizing noun phrase patterns. This
procedure is not highly accurate, but it provides
a useful baseline for further refinement.
MorphAdorner's name extraction process is as follows:
- Assign parts of speech to each word in the text.
- Locate noun phrases, i.e., the longest series of nouns
bracketed by non-nouns.
- Assume noun phrases containing at least one proper noun are names.
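The three steps above can be sketched in a few lines of Python. This is only an illustration of the noun phrase method, not MorphAdorner's own code: it assumes the text has already been tagged, and the tag names `N` (common noun) and `NP` (proper noun) are hypothetical simplifications rather than the tag set MorphAdorner actually uses.

```python
from itertools import groupby

# Hypothetical simplified tags: "NP" = proper noun, "N" = common noun.
NOUN_TAGS = {"N", "NP"}
PROPER_TAGS = {"NP"}


def extract_names(tagged_tokens):
    """Return candidate names from a list of (word, tag) pairs."""
    names = []
    # Group consecutive tokens by whether they are nouns. Each noun group
    # is a maximal run of nouns bracketed by non-nouns.
    for is_noun, group in groupby(tagged_tokens, key=lambda t: t[1] in NOUN_TAGS):
        if not is_noun:
            continue
        phrase = list(group)
        # Keep the phrase only if it contains at least one proper noun.
        if any(tag in PROPER_TAGS for _, tag in phrase):
            names.append(" ".join(word for word, _ in phrase))
    return names


# Hand-tagged example sentence (the tags are assumptions for illustration).
tagged = [
    ("Mr.", "NP"), ("Darcy", "NP"), ("left", "V"), ("the", "DET"),
    ("house", "N"), ("near", "PREP"), ("London", "NP"), (".", "PUNCT"),
]
print(extract_names(tagged))  # ['Mr. Darcy', 'London']
```

The common noun "house" is dropped because its noun phrase contains no proper noun, while "Mr. Darcy" is kept as a single name because both tokens fall in one unbroken run of nouns.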
Distinguishing a personal name from a location isn't so simple.
MorphAdorner uses lists of proper names and place names, but there is
considerable overlap between these in English.
Even a human reader might have trouble determining in some cases
whether a name refers to a place or a person. In the sentence
"Chester provided arms for the mercenaries", does "Chester" refer to the
Earl, the county, or another person named Chester? Even in
context it might be impossible to be sure which is the referent.
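A list-based classifier makes the overlap problem concrete. The sketch below is an assumption about how such a lookup might work, not a description of MorphAdorner's implementation. The two word lists and the function name `classify_name` are hypothetical and far smaller than any real name list.

```python
# Hypothetical, tiny word lists for illustration only.
PERSON_NAMES = {"chester", "elizabeth", "darcy"}
PLACE_NAMES = {"chester", "london", "kent"}


def classify_name(name):
    """Label a candidate name as person, place, ambiguous, or unknown."""
    words = {w.lower() for w in name.split()}
    in_person = bool(words & PERSON_NAMES)
    in_place = bool(words & PLACE_NAMES)
    if in_person and in_place:
        # The lists overlap, so lookup alone cannot decide.
        return "ambiguous"
    if in_person:
        return "person"
    if in_place:
        return "place"
    return "unknown"


for candidate in ["Chester", "London", "Elizabeth Darcy", "Pemberley"]:
    print(candidate, "->", classify_name(candidate))
```

Here "Chester" comes back as ambiguous because it appears in both lists, which is exactly the case where further context, or a human reader, would be needed.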
You can try MorphAdorner's
default name extractor online, which uses
the simple noun phrase method described above.