edu.northwestern.at.utils.corpuslinguistics.spellingstandardizer
Class RemoteSpellingStandardizer

java.lang.Object
  extended by edu.northwestern.at.utils.corpuslinguistics.spellingstandardizer.RemoteSpellingStandardizer
All Implemented Interfaces:
SpellingStandardizer, UsesLogger

public class RemoteSpellingStandardizer
extends java.lang.Object
implements SpellingStandardizer, UsesLogger

Remote Spelling Standardizer.

This spelling standardizer uses RMI calls to a remote standardizer server to find standardized spellings.
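
The delegation idea described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual implementation: `HypotheticalSession` is an invented stand-in for StandardizerServerSession (whose methods are not documented on this page), the word class value "v" is a placeholder, and the fallback of returning the original spelling when the session is null or the remote call fails is an assumed policy.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

class RemoteDelegationSketch {
    /** Hypothetical remote interface; NOT the real StandardizerServerSession. */
    interface HypotheticalSession extends Remote {
        String standardizeSpelling(String spelling, String wordClass)
            throws RemoteException;
    }

    /** Client-side wrapper that forwards each lookup to the session. */
    static class Client {
        private final HypotheticalSession session; // null if no server session

        Client(HypotheticalSession session) {
            this.session = session;
        }

        String standardizeSpelling(String spelling, String wordClass) {
            if (session == null) {
                return spelling; // assumed fallback: no session, keep input
            }
            try {
                return session.standardizeSpelling(spelling, wordClass);
            } catch (RemoteException e) {
                return spelling; // assumed fallback: remote failure, keep input
            }
        }
    }

    public static void main(String[] args) {
        // In-process stand-in for a remote stub, so the sketch runs without a server.
        HypotheticalSession stub =
            (spelling, wordClass) -> "loue".equals(spelling) ? "love" : spelling;

        System.out.println(new Client(stub).standardizeSpelling("loue", "v"));
        System.out.println(new Client(null).standardizeSpelling("loue", "v"));
    }
}
```

Keeping the lookup behind a single method means callers need not know whether a server session exists; how the real class reports a missing session or a failed call is not stated here.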


Field Summary
protected  Logger logger
          Logger used for output.
protected  StandardizerServerSession session
          The spelling standardizer server session, or null if none.
 
Constructor Summary
RemoteSpellingStandardizer()
          Create spelling standardizer that uses a remote standardizer server.
 
Method Summary
 void addMappedSpelling(java.lang.String alternateSpelling, java.lang.String standardSpelling)
          Add a mapped spelling.
 void addStandardSpelling(java.lang.String standardSpelling)
          Add a standard spelling.
 void addStandardSpellings(java.util.Collection standardSpellings)
          Add standard spellings from a collection.
 void close()
          Close standardizer.
 java.lang.String fixCapitalization(java.lang.String spelling, java.lang.String standardSpelling)
          Fix capitalization of standardized spelling.
 Logger getLogger()
          Get the logger.
 TaggedStrings getMappedSpellings()
          Return the spelling map.
 int getNumberOfAlternateSpellings()
          Returns number of alternate spellings.
 int[] getNumberOfAlternateSpellingsByWordClass()
          Returns number of alternate spellings by word class.
 int getNumberOfStandardSpellings()
          Returns number of standard spellings.
 java.util.Set<java.lang.String> getStandardSpellings()
          Return the standard spellings.
protected  void initializeServerSession()
          Initializes the server session.
 void loadAlternativeSpellings(java.io.Reader reader, java.lang.String delimChars)
          Loads alternative spellings from a reader.
 void loadAlternativeSpellings(java.net.URL url, java.lang.String encoding, java.lang.String delimChars)
          Loads alternate spellings from a URL.
 void loadAlternativeSpellingsByWordClass(java.net.URL spellingsURL, java.lang.String encoding)
          Load alternate to standard spellings by word class.
 void loadStandardSpellings(java.io.Reader reader)
          Loads standard spellings from a reader.
 void loadStandardSpellings(java.net.URL url, java.lang.String encoding)
          Loads standard spellings from a URL.
 java.lang.String preprocessSpelling(java.lang.String spelling)
          Preprocess spelling.
 void setLogger(Logger logger)
          Set the logger.
 void setMappedSpellings(TaggedStrings standardMappedSpellings)
          Sets map which maps alternate spellings to standard spellings.
 void setStandardSpellings(java.util.Set<java.lang.String> standardSpellings)
          Sets standard spellings.
 java.lang.String[] standardizeSpelling(java.lang.String spelling)
          Returns standard spellings given a spelling.
 java.lang.String standardizeSpelling(java.lang.String spelling, java.lang.String wordClass)
          Returns a standard spelling given a standard or alternate spelling.
 java.lang.String toString()
          Return standardizer description.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

session

protected StandardizerServerSession session
The spelling standardizer server session, or null if none.


logger

protected Logger logger
Logger used for output.

Constructor Detail

RemoteSpellingStandardizer

public RemoteSpellingStandardizer()
Create spelling standardizer that uses a remote standardizer server.

Method Detail

initializeServerSession

protected void initializeServerSession()
Initializes the server session.


loadAlternativeSpellings

public void loadAlternativeSpellings(java.net.URL url,
                                     java.lang.String encoding,
                                     java.lang.String delimChars)
                              throws java.io.IOException
Loads alternate spellings from a URL.

Does nothing in this implementation.

Specified by:
loadAlternativeSpellings in interface SpellingStandardizer
Parameters:
url - URL containing mappings from alternate spellings to standard spellings.
encoding - Character set encoding for spellings.
delimChars - Delimiter characters separating spelling pairs.
Throws:
java.io.IOException

loadAlternativeSpellingsByWordClass

public void loadAlternativeSpellingsByWordClass(java.net.URL spellingsURL,
                                                java.lang.String encoding)
                                         throws java.io.IOException
Load alternate to standard spellings by word class.

Does nothing in this implementation.

Specified by:
loadAlternativeSpellingsByWordClass in interface SpellingStandardizer
Parameters:
spellingsURL - URL of alternative spellings by word class.
encoding - Character set encoding for spellings.
Throws:
java.io.IOException

loadAlternativeSpellings

public void loadAlternativeSpellings(java.io.Reader reader,
                                     java.lang.String delimChars)
                              throws java.io.IOException
Loads alternative spellings from a reader.

Does nothing in this implementation.

Specified by:
loadAlternativeSpellings in interface SpellingStandardizer
Parameters:
reader - The reader.
delimChars - Delimiter characters separating spelling pairs.
Throws:
java.io.IOException

loadStandardSpellings

public void loadStandardSpellings(java.net.URL url,
                                  java.lang.String encoding)
                           throws java.io.IOException
Loads standard spellings from a URL.

Specified by:
loadStandardSpellings in interface SpellingStandardizer
Parameters:
url - URL containing standard spellings
encoding - Character set encoding for spellings
Throws:
java.io.IOException

loadStandardSpellings

public void loadStandardSpellings(java.io.Reader reader)
                           throws java.io.IOException
Loads standard spellings from a reader.

Specified by:
loadStandardSpellings in interface SpellingStandardizer
Parameters:
reader - The reader.
Throws:
java.io.IOException

addMappedSpelling

public void addMappedSpelling(java.lang.String alternateSpelling,
                              java.lang.String standardSpelling)
Add a mapped spelling.

Specified by:
addMappedSpelling in interface SpellingStandardizer
Parameters:
alternateSpelling - The alternate spelling.
standardSpelling - The corresponding standard spelling.

addStandardSpelling

public void addStandardSpelling(java.lang.String standardSpelling)
Add a standard spelling.

Specified by:
addStandardSpelling in interface SpellingStandardizer
Parameters:
standardSpelling - A standard spelling.

addStandardSpellings

public void addStandardSpellings(java.util.Collection standardSpellings)
Add standard spellings from a collection.

Does nothing in this implementation.

Specified by:
addStandardSpellings in interface SpellingStandardizer
Parameters:
standardSpellings - A collection of standard spellings.


setMappedSpellings

public void setMappedSpellings(TaggedStrings standardMappedSpellings)
Sets map which maps alternate spellings to standard spellings.

Does nothing in this implementation.

Specified by:
setMappedSpellings in interface SpellingStandardizer
Parameters:
standardMappedSpellings - TaggedStrings with alternate spellings as keys and standard spellings as tag values.


setStandardSpellings

public void setStandardSpellings(java.util.Set<java.lang.String> standardSpellings)
Sets standard spellings.

Does nothing in this implementation.

Specified by:
setStandardSpellings in interface SpellingStandardizer
Parameters:
standardSpellings - Set of standard spellings.


standardizeSpelling

public java.lang.String[] standardizeSpelling(java.lang.String spelling)
Returns standard spellings given a spelling.

Specified by:
standardizeSpelling in interface SpellingStandardizer
Parameters:
spelling - The spelling.
Returns:
The standard spellings as an array of String.
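
A caller that needs a single result from the returned array might reduce it as in the sketch below. The helper `firstOrOriginal` is hypothetical, and so is the assumption that the array may be null, empty, or contain empty entries; this page does not say what the array holds when no standard spelling is found.

```java
class FirstCandidateSketch {
    /**
     * Returns the first non-empty candidate, or the original spelling
     * when there is no usable candidate (assumed policy).
     */
    static String firstOrOriginal(String spelling, String[] candidates) {
        if (candidates != null) {
            for (String candidate : candidates) {
                if (candidate != null && !candidate.isEmpty()) {
                    return candidate;
                }
            }
        }
        return spelling;
    }

    public static void main(String[] args) {
        System.out.println(firstOrOriginal("loue", new String[] {"love"}));
        System.out.println(firstOrOriginal("loue", new String[0]));
    }
}
```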

standardizeSpelling

public java.lang.String standardizeSpelling(java.lang.String spelling,
                                            java.lang.String wordClass)
Returns a standard spelling given a standard or alternate spelling.

Specified by:
standardizeSpelling in interface SpellingStandardizer
Parameters:
spelling - The spelling.
wordClass - The word class.
Returns:
The standard spelling.

getNumberOfAlternateSpellings

public int getNumberOfAlternateSpellings()
Returns number of alternate spellings.

Specified by:
getNumberOfAlternateSpellings in interface SpellingStandardizer
Returns:
The number of alternate spellings.

getNumberOfAlternateSpellingsByWordClass

public int[] getNumberOfAlternateSpellingsByWordClass()
Returns number of alternate spellings by word class.

Specified by:
getNumberOfAlternateSpellingsByWordClass in interface SpellingStandardizer
Returns:
int array with two entries. [0] = The number of word classes that have alternate spellings. [1] = The number of alternate spellings in those word classes.
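
Because the two counts are packed into one array, it can help to unpack them by index in one place. The sketch below is a hypothetical helper, not part of this class; the null and length checks are a defensive assumption, since the page does not say whether a shorter or null array can be returned.

```java
class WordClassCountsSketch {
    /** Formats the two-entry array: [0] word classes, [1] alternate spellings. */
    static String describe(int[] counts) {
        if (counts == null || counts.length < 2) {
            return "no word-class counts available";
        }
        int wordClasses = counts[0];
        int alternateSpellings = counts[1];
        return wordClasses + " word classes, "
            + alternateSpellings + " alternate spellings";
    }

    public static void main(String[] args) {
        System.out.println(describe(new int[] {3, 120}));
    }
}
```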

getNumberOfStandardSpellings

public int getNumberOfStandardSpellings()
Returns number of standard spellings.

Specified by:
getNumberOfStandardSpellings in interface SpellingStandardizer
Returns:
The number of standard spellings.

getMappedSpellings

public TaggedStrings getMappedSpellings()
Return the spelling map.

Specified by:
getMappedSpellings in interface SpellingStandardizer
Returns:
Null since this implementation does not use a local map.

getStandardSpellings

public java.util.Set<java.lang.String> getStandardSpellings()
Return the standard spellings.

Specified by:
getStandardSpellings in interface SpellingStandardizer
Returns:
Always null.
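
Since getStandardSpellings always returns null in this implementation, code written against the SpellingStandardizer interface may want to normalize the result before iterating. A minimal sketch, where `orEmpty` is a hypothetical helper and not part of the library:

```java
import java.util.Collections;
import java.util.Set;

class NullSafeSpellingsSketch {
    /** Replaces a null set with an immutable empty set. */
    static Set<String> orEmpty(Set<String> spellings) {
        return spellings == null ? Collections.<String>emptySet() : spellings;
    }

    public static void main(String[] args) {
        // A null result, as returned by this implementation, becomes an empty set.
        for (String spelling : orEmpty(null)) {
            System.out.println(spelling); // never reached for a null input
        }
        System.out.println(orEmpty(null).size());
    }
}
```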

preprocessSpelling

public java.lang.String preprocessSpelling(java.lang.String spelling)
Preprocess spelling.

Unused in this standardizer.

Specified by:
preprocessSpelling in interface SpellingStandardizer
Parameters:
spelling - Spelling to preprocess.
Returns:
Preprocessed spelling.


fixCapitalization

public java.lang.String fixCapitalization(java.lang.String spelling,
                                          java.lang.String standardSpelling)
Fix capitalization of standardized spelling.

Unused in this standardizer.

Specified by:
fixCapitalization in interface SpellingStandardizer
Parameters:
spelling - The original spelling.
standardSpelling - The candidate standard spelling.
Returns:
Standard spelling with initial capitalization matching original spelling.
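
Although this standardizer does not use the method, the contract stated under Returns can be illustrated with a short sketch. It is one possible reading, not the library's code: it assumes that only the first character is adjusted, that the standard spelling is upper-cased only when the original starts with an upper-case letter, and that null or empty inputs are returned unchanged.

```java
class FixCapitalizationSketch {
    /** Upper-cases the first letter of standardSpelling if spelling starts upper-case. */
    static String fixCapitalization(String spelling, String standardSpelling) {
        if (spelling == null || spelling.isEmpty()
                || standardSpelling == null || standardSpelling.isEmpty()) {
            return standardSpelling; // nothing to match against
        }
        if (!Character.isUpperCase(spelling.charAt(0))) {
            return standardSpelling; // assumed: leave lower-case originals alone
        }
        return Character.toUpperCase(standardSpelling.charAt(0))
            + standardSpelling.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(fixCapitalization("Loue", "love")); // prints "Love"
        System.out.println(fixCapitalization("loue", "love")); // prints "love"
    }
}
```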


close

public void close()
Close standardizer.


getLogger

public Logger getLogger()
Get the logger.

Specified by:
getLogger in interface UsesLogger
Returns:
The logger.

setLogger

public void setLogger(Logger logger)
Set the logger.

Specified by:
setLogger in interface UsesLogger
Parameters:
logger - The logger.

toString

public java.lang.String toString()
Return standardizer description.

Overrides:
toString in class java.lang.Object
Returns:
Standardizer description.