CompareStringCounts (MorphAdorner)

java.lang.Object
- edu.northwestern.at.morphadorner.tools.comparestringcounts.CompareStringCounts

```
public class CompareStringCounts
extends java.lang.Object
```
Compare string counts in two files using Dunning's log-likelihood.
Usage:
```
  java edu.northwestern.at.morphadorner.tools.comparestringcounts.CompareStringCounts analysis.tab reference.tab
  
```
analysis.tab -- Input tab-separated file of strings and counts for an analysis text.
reference.tab -- Input tab-separated file of strings and counts for a reference text.

The analysis.tab and reference.tab files contain strings and their counts, compiled from two texts or corpora. Each file has two tab-separated columns. The first column is a string. The second column is the number of times that string occurred in the associated text.
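The two-column input format can be read with a few lines of Java. The following is a minimal sketch, not MorphAdorner's own reader: the class `CountsParser` and its methods are hypothetical names, and summing the counts of a string that appears on more than one line is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of MorphAdorner.
final class CountsParser {
    /** Parses "string<TAB>count" lines. Malformed lines are reported on stderr and skipped. */
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] cols = line.split("\t");
            if (cols.length != 2) {
                System.err.println("Skipping malformed line: " + line);
                continue;
            }
            try {
                int count = Integer.parseInt(cols[1].trim());
                // Assumption: repeated strings have their counts added together.
                counts.merge(cols[0], count, Integer::sum);
            } catch (NumberFormatException e) {
                System.err.println("Skipping line with bad count: " + line);
            }
        }
        return counts;
    }

    /** Sums all counts, giving the total used when computing relative frequencies. */
    static int total(Map<String, Integer> counts) {
        int sum = 0;
        for (int c : counts.values()) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            parse(Arrays.asList("the\t120", "whale\t45", "broken line"));
        System.out.println(counts + " total=" + total(counts));
    }
}
```

In practice the lines would come from a file, for example via `java.nio.file.Files.readAllLines`, rather than from an in-memory list.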

The output contains seven tab-separated columns, sorted in descending order by log-likelihood value. One line of output appears for each string in the analysis text.
1. The first column contains the string. This may be a spelling, a lemma, a part of speech, a spelling bigram, or any other string of interest.
2. The second column contains a "+" when the string is overused in the analysis text with respect to the reference text, a "-" when the string is underused, and a blank when the string is used the same amount in both texts.
3. The third column contains Dunning's log-likelihood value.
4. The fourth column shows the relative frequency of occurrence of the string in the analysis text as fractional parts per ten thousand.
5. The fifth column shows the relative frequency of occurrence of the string in the reference text as fractional parts per ten thousand.
6. The sixth column shows the number of times the string occurred in the analysis text.
7. The seventh column shows the number of times the string occurred in the reference text.
These results are written to standard output, which can be redirected to a file. A brief summary of the analysis is written to standard error, as are any errors found in the input files.
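To make the seven-column layout concrete, here is a hedged sketch of how one output row might be assembled and how rows might be ordered. The class `OutputRowSketch`, its `Row` type, the number of decimal places, and the use of a single space for the "blank" marker are all assumptions for illustration; the actual formatting produced by CompareStringCounts may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of the seven-column output; not MorphAdorner code.
final class OutputRowSketch {
    static final class Row {
        final String text;
        final double logLikelihood;
        final int analysisCount;
        final int analysisTotal;
        final int refCount;
        final int refTotal;

        Row(String text, double logLikelihood,
            int analysisCount, int analysisTotal, int refCount, int refTotal) {
            this.text = text;
            this.logLikelihood = logLikelihood;
            this.analysisCount = analysisCount;
            this.analysisTotal = analysisTotal;
            this.refCount = refCount;
            this.refTotal = refTotal;
        }
    }

    /** Relative frequency expressed as parts per ten thousand. */
    static double perTenThousand(int count, int total) {
        return total == 0 ? 0.0 : 10000.0 * count / total;
    }

    /** "+" for overuse in the analysis text, "-" for underuse, blank when equal. */
    static String marker(double analysisRate, double refRate) {
        if (analysisRate > refRate) return "+";
        if (analysisRate < refRate) return "-";
        return " ";
    }

    /** Builds one tab-separated line in the documented column order. */
    static String format(Row r) {
        double aRate = perTenThousand(r.analysisCount, r.analysisTotal);
        double rRate = perTenThousand(r.refCount, r.refTotal);
        return String.format(Locale.ROOT, "%s\t%s\t%.2f\t%.4f\t%.4f\t%d\t%d",
            r.text, marker(aRate, rRate), r.logLikelihood,
            aRate, rRate, r.analysisCount, r.refCount);
    }

    /** Returns a copy of the rows ordered from highest to lowest log-likelihood. */
    static List<Row> sortDescending(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingDouble((Row r) -> r.logLikelihood).reversed());
        return sorted;
    }
}
```

Because the rows go to standard output and the summary goes to standard error, a shell redirection such as `> results.tab` captures only the table.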

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`CompareStringCounts.ReverseScoredString` ScoredString modified to sort results from highest to lowest.

Constructor Summary

Constructors
Constructor and Description
`CompareStringCounts(java.lang.String[] args)` Supervises comparing string counts in two files.

Method Summary

Methods
Modifier and Type	Method and Description
`static void`	`displayResults(java.util.Map<CompareStringCounts.ReverseScoredString,double[]> results)` Displays results of frequency analysis in a sorted table.
`static void`	`displayUsage()` Display brief program usage.
`static double[]`	`doFreq(java.lang.String stringToAnalyze, int analysisCount, int analysisTotalCount, int refCount, int refTotalCount)` Frequency comparison of analysis and reference works for a word.
`static void`	`main(java.lang.String[] args)` Main program.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - CompareStringCounts
```
public CompareStringCounts(java.lang.String[] args)
```
    Supervises comparing string counts in two files.
    
    Parameters:
    args - Command line arguments.
- Method Detail
  - main
```
public static void main(java.lang.String[] args)
```
    Main program.
  - displayUsage
```
public static void displayUsage()
```
    Display brief program usage.
  - doFreq
```
public static double[] doFreq(java.lang.String stringToAnalyze,
              int analysisCount,
              int analysisTotalCount,
              int refCount,
              int refTotalCount)
```
    Frequency comparison of analysis and reference works for a word.
    
    Parameters:
    stringToAnalyze - The word to analyze.
    analysisCount - Count of word in analysis text.
    analysisTotalCount - Total number of words in analysis text.
    refCount - Count of word in reference text.
    refTotalCount - Total number of words in reference text.
    
    Returns:
    Results of frequency analysis as a double[] array.
    The entries in the results array are as follows.
    
    (0) Count of string occurrence in analysis text.
    (1) String occurrence in analysis text as parts per 10,000.
    (2) Count of string occurrence in reference text.
    (3) String occurrence in reference text as parts per 10,000.
    (4) Dunning's Log-likelihood value.
  - displayResults
```
public static void displayResults(java.util.Map<CompareStringCounts.ReverseScoredString,double[]> results)
```
    Displays results of frequency analysis in a sorted table.
    
    Parameters:
    results - The map of results to display.
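The five-element array documented for doFreq can be sketched as follows. This is an illustration under stated assumptions, not the MorphAdorner source: the class `LogLikelihoodSketch` and the method `doFreqSketch` are hypothetical, and the log-likelihood uses the widely used two-term form G2 = 2 * (a * ln(a / E1) + b * ln(b / E2)), where E1 and E2 are the counts expected if the string were equally frequent in both texts. Whether MorphAdorner uses exactly this variant is an assumption.

```java
// Hypothetical sketch of the doFreq result layout; not the MorphAdorner implementation.
final class LogLikelihoodSketch {
    /** One term of the statistic; a zero observed count contributes nothing. */
    private static double term(double observed, double expected) {
        return observed == 0.0 ? 0.0 : observed * Math.log(observed / expected);
    }

    /**
     * Two-term log-likelihood for a string seen a times in an analysis text
     * of c tokens and b times in a reference text of d tokens.
     */
    static double logLikelihood(int a, int c, int b, int d) {
        double combined = (double) c + d;
        if (combined == 0.0) {
            return 0.0;
        }
        double e1 = c * (double) (a + b) / combined; // expected count in analysis text
        double e2 = d * (double) (a + b) / combined; // expected count in reference text
        return 2.0 * (term(a, e1) + term(b, e2));
    }

    /** Relative frequency expressed as parts per 10,000. */
    static double perTenThousand(int count, int total) {
        return total == 0 ? 0.0 : 10000.0 * count / total;
    }

    /** Fills an array in the index order documented for doFreq. */
    static double[] doFreqSketch(int analysisCount, int analysisTotalCount,
                                 int refCount, int refTotalCount) {
        double[] result = new double[5];
        result[0] = analysisCount;
        result[1] = perTenThousand(analysisCount, analysisTotalCount);
        result[2] = refCount;
        result[3] = perTenThousand(refCount, refTotalCount);
        result[4] = logLikelihood(analysisCount, analysisTotalCount,
                                  refCount, refTotalCount);
        return result;
    }

    public static void main(String[] args) {
        double[] r = doFreqSketch(20, 100, 10, 100);
        System.out.println(java.util.Arrays.toString(r));
    }
}
```

When the string has the same relative frequency in both texts, the observed and expected counts coincide and the statistic is zero; larger values indicate a larger difference in usage, with the direction given by comparing entries (1) and (3).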
