public class Names
extends java.lang.Object
Uses lists of first names, surnames, and geographic locations to extract person names and locations from text.
Based in part on code written by Mark Watson.
Modifier and Type | Field and Description |
---|---|
protected static java.util.Set<java.lang.String> | connectorsSet: Name connectors set. |
protected static java.lang.String | defaultResourcePath: Default name resource data files. |
protected static java.util.Set<java.lang.String> | firstNameSet: First name set. |
protected static java.util.Map<java.lang.String,java.lang.String> | placeNameMap: Place name map. |
protected static java.util.Set<java.lang.String> | prefixSet: Prefix title set. |
protected static java.util.Set<java.lang.String> | surnameSet: Surname set. |
Constructor and Description |
---|
Names(): Create name extractor. |
Names(java.lang.String resourcePath): Create name extractor. |
Modifier and Type | Method and Description |
---|---|
protected boolean | acceptName(Lexicon lexicon, java.lang.String name, boolean firstWord, int numWords): Accept a name. |
java.util.Set<java.lang.String> | getConnectors(): Return name connectors set. |
java.util.Set<java.lang.String> | getFirstNames(): Return first name set. |
java.lang.String | getPersonName(java.lang.String[] words, int startIndex, int numWords): Get a person name from a list of words. |
java.lang.String | getPlaceName(java.lang.String[] words, int startIndex, int numWords): Get a place name from a list of words. |
java.util.Map<java.lang.String,java.lang.String> | getPlaceNames(): Return place name map. |
java.lang.String | getPlaceNameType(java.lang.String placeName): Get place name type. |
java.util.Set<java.lang.String> | getPrefixes(): Return prefix title set. |
java.util.Set<java.lang.String>[] | getProperNames(java.util.List<java.lang.String> wordsList, Lexicon lexicon): Extract all proper names for people and places from a list of words. |
java.util.Set<java.lang.String>[] | getProperNames(java.lang.String[] words, Lexicon lexicon): Extract all proper names for people and places from an array of words. |
java.util.Set<java.lang.String>[] | getProperNames(java.lang.String s, Lexicon lexicon): Extract all proper names for people and places from a string. |
java.util.Set<java.lang.String> | getSurnames(): Return surname set. |
boolean | isNameOrPlace(java.lang.String s): See if string is a name or a place. |
boolean | isNamePrefix(java.lang.String word): Check if word is a name prefix (Mr., Mrs., etc.). |
boolean | isPersonName(java.lang.String s): Check if string is a person name. |
boolean | isPersonName(java.lang.String[] words): Check if list of words forms a person's name. |
boolean | isPlaceName(java.lang.String name): Check if name is a place name. |
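
The list-based approach summarized above can be illustrated with a small self-contained sketch. Everything in it is hypothetical: the class `NameListSketch`, its hard-coded word lists, and the acceptance rule (an optional title, then a first name and a surname, or a title followed by a surname alone) are assumptions made for illustration. It is not the logic of the `Names` class itself, which takes its lists from resource files.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for illustration; not the Names class documented here.
class NameListSketch {
    // Tiny hard-coded lists (assumed contents, stored in lower case).
    static final Set<String> PREFIXES = new HashSet<>(Arrays.asList("mr.", "mrs.", "dr."));
    static final Set<String> FIRST_NAMES = new HashSet<>(Arrays.asList("john", "mary"));
    static final Set<String> SURNAMES = new HashSet<>(Arrays.asList("smith", "jones"));

    // Normalize a word so lookups are case-insensitive.
    static String lower(String word) {
        return word.toLowerCase(Locale.ROOT);
    }

    // True if the word is a known title such as "Mr." or "Dr.".
    static boolean isNamePrefix(String word) {
        return PREFIXES.contains(lower(word));
    }

    // Assumed rule: [prefix] firstName surname, or prefix surname.
    static boolean isPersonName(String[] words) {
        if (words == null || words.length == 0) {
            return false;
        }
        boolean hasPrefix = isNamePrefix(words[0]);
        int start = hasPrefix ? 1 : 0;
        int remaining = words.length - start;
        if (remaining == 1) {
            // A lone surname only counts when a title precedes it.
            return hasPrefix && SURNAMES.contains(lower(words[start]));
        }
        if (remaining == 2) {
            return FIRST_NAMES.contains(lower(words[start]))
                && SURNAMES.contains(lower(words[start + 1]));
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isPersonName(new String[] {"Dr.", "Mary", "Jones"}));
        System.out.println(isPersonName(new String[] {"green", "table"}));
    }
}
```

Storing the lists as sets keeps each lookup cheap, which matters when every word window in a sentence is tested.
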
protected static java.lang.String defaultResourcePath
protected static java.util.Set<java.lang.String> surnameSet
protected static java.util.Set<java.lang.String> firstNameSet
protected static java.util.Map<java.lang.String,java.lang.String> placeNameMap
protected static java.util.Set<java.lang.String> prefixSet
protected static java.util.Set<java.lang.String> connectorsSet
public Names()
public Names(java.lang.String resourcePath)
resourcePath - Path to resource files.
public boolean isNameOrPlace(java.lang.String s)
s - The string to check.
protected boolean acceptName(Lexicon lexicon, java.lang.String name, boolean firstWord, int numWords)
lexicon - Word lexicon.
name - The text of the name.
firstWord - True if the name starts with the first word in a sentence.
numWords - The number of words in the name.
public java.util.Set<java.lang.String>[] getProperNames(java.lang.String[] words, Lexicon lexicon)
words - String array of words to search for names. This should correspond to a single sentence.
lexicon - Lexicon for filtering names.
public java.util.Set<java.lang.String>[] getProperNames(java.lang.String s, Lexicon lexicon)
s - String to search for names. This should correspond to a single sentence.
lexicon - Lexicon for filtering names.
public java.util.Set<java.lang.String>[] getProperNames(java.util.List<java.lang.String> wordsList, Lexicon lexicon)
wordsList - List of words to search for names. This should correspond to a single sentence.
lexicon - Lexicon for filtering names.
public java.lang.String getPlaceName(java.lang.String[] words, int startIndex, int numWords)
words - String array of words.
startIndex - Start index in words array to check for a name.
numWords - The number of words to check for a name.
public java.lang.String getPlaceNameType(java.lang.String placeName)
placeName - The place name.
public boolean isPlaceName(java.lang.String name)
name - The name.
public boolean isNamePrefix(java.lang.String word)
word - The word to check.
public boolean isPersonName(java.lang.String s)
s - The string.
public java.lang.String getPersonName(java.lang.String[] words, int startIndex, int numWords)
words - String array of words.
startIndex - Start index in words array to check for a name.
numWords - The number of words to check for a name.
public boolean isPersonName(java.lang.String[] words)
words - The words.
public java.util.Set<java.lang.String> getFirstNames()
public java.util.Set<java.lang.String> getSurnames()
public java.util.Map<java.lang.String,java.lang.String> getPlaceNames()
public java.util.Set<java.lang.String> getPrefixes()
public java.util.Set<java.lang.String> getConnectors()
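
The `words`, `startIndex`, and `numWords` parameters of `getPlaceName` and `getPersonName` describe a window over a word array. A minimal sketch of one way such a window lookup could work follows. The class `PlaceWindowSketch`, the map contents, the place type labels, and the choice to return `null` when nothing matches are all assumptions; the documentation above does not state what the real methods return on a miss.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for illustration; not the Names class documented here.
class PlaceWindowSketch {
    // Assumed map from lower-cased place name to a place type label.
    static final Map<String, String> PLACES = new HashMap<>();
    static {
        PLACES.put("new york", "city");
        PLACES.put("france", "country");
    }

    // Join numWords words starting at startIndex and look the phrase up.
    // Returns the matched phrase, or null if the window is invalid or unknown.
    static String getPlaceName(String[] words, int startIndex, int numWords) {
        if (words == null || startIndex < 0 || numWords <= 0
                || startIndex + numWords > words.length) {
            return null;
        }
        String candidate = String.join(" ",
            Arrays.copyOfRange(words, startIndex, startIndex + numWords));
        return PLACES.containsKey(candidate.toLowerCase(Locale.ROOT)) ? candidate : null;
    }

    // Look up the assumed type label for a place name, or null if unknown.
    static String getPlaceNameType(String placeName) {
        return placeName == null ? null : PLACES.get(placeName.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String[] words = {"She", "flew", "from", "New", "York", "to", "France"};
        // Try the longer window first so "New York" wins over single words.
        for (int i = 0; i < words.length; i++) {
            for (int n = 2; n >= 1; n--) {
                String place = getPlaceName(words, i, n);
                if (place != null) {
                    System.out.println(place + " (" + getPlaceNameType(place) + ")");
                    i += n - 1; // skip the words already consumed
                    break;
                }
            }
        }
    }
}
```

Trying longer windows before shorter ones is a design choice of this sketch, so that multi-word names are matched as a whole.
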