public class CompareStringCounts
extends java.lang.Object
Usage:
java edu.northwestern.at.morphadorner.tools.comparestringcounts.CompareStringCounts analysis.tab reference.tab
analysis.tab -- Input tab-separated file of strings and counts for an analysis text.
reference.tab -- Input tab-separated file of strings and counts for a reference text.
The analysis.tab and reference.tab files contain strings and counts of those strings compiled from two texts or corpora. Both files contain two tab-separated columns. The first column is a string. The second column contains the count of the number of times that string occurred in the associated text.
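The two-column input format described above can be read with a few lines of Java. The following is only a minimal sketch, not the code this class uses: the class name `StringCountsReaderSketch` and its methods are hypothetical, and the handling of malformed lines (reported on standard error and skipped) and of repeated strings (counts summed) is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of MorphAdorner.
class StringCountsReaderSketch {

    /** Parses "string<TAB>count" lines into a map from string to count. */
    static Map<String, Integer> parseCounts(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        int lineNumber = 0;
        for (String line : lines) {
            lineNumber++;
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            String[] fields = line.split("\t");
            if (fields.length != 2) {
                // Assumption: malformed lines are reported and skipped.
                System.err.println("Line " + lineNumber + ": expected 2 columns");
                continue;
            }
            try {
                int count = Integer.parseInt(fields[1].trim());
                // Assumption: repeated strings have their counts summed.
                counts.merge(fields[0], count, Integer::sum);
            } catch (NumberFormatException e) {
                System.err.println("Line " + lineNumber + ": count is not an integer");
            }
        }
        return counts;
    }

    /** Sums all counts, giving the total size of the text in strings. */
    static long totalCount(Map<String, Integer> counts) {
        long total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        return total;
    }
}
```

In practice the lines could come from `java.nio.file.Files.readAllLines` applied to analysis.tab or reference.tab. The totals computed this way would play the role of the `analysisTotalCount` and `refTotalCount` arguments of `doFreq`.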
The output contains seven tab-separated columns, sorted in descending order by log-likelihood value. One line of output appears for each string in the analysis text.
These results are written to standard output, which can be redirected to a file. A brief summary of the analysis is written to standard error, as are any errors found in the input files.
Modifier and Type | Class and Description |
---|---|
static class | CompareStringCounts.ReverseScoredString: ScoredString modified to sort results from highest to lowest. |
Constructor and Description |
---|
CompareStringCounts(java.lang.String[] args): Supervises comparing string counts in two files. |
Modifier and Type | Method and Description |
---|---|
static void | displayResults(java.util.Map<CompareStringCounts.ReverseScoredString,double[]> results): Displays results of frequency analysis in a sorted table. |
static void | displayUsage(): Display brief program usage. |
static double[] | doFreq(java.lang.String stringToAnalyze, int analysisCount, int analysisTotalCount, int refCount, int refTotalCount): Frequency comparison of analysis and reference works for a word. |
static void | main(java.lang.String[] args): Main program. |
public CompareStringCounts(java.lang.String[] args)
args
- Command line arguments.
public static void main(java.lang.String[] args)
public static void displayUsage()
public static double[] doFreq(java.lang.String stringToAnalyze, int analysisCount, int analysisTotalCount, int refCount, int refTotalCount)
stringToAnalyze
- The word to analyze.
analysisCount
- Count of word in analysis text.
analysisTotalCount
- Total number of words in analysis text.
refCount
- Count of word in reference text.
refTotalCount
- Total number of words in reference text.
The entries in the results array are as follows.
(0) Count of string occurrence in analysis text.
(1) String occurrence in analysis text as parts per 10,000.
(2) Count of string occurrence in reference text.
(3) String occurrence in reference text as parts per 10,000.
(4) Dunning's Log-likelihood value.
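The five entries can be illustrated with a short computation. This page does not state which form of the log-likelihood statistic `doFreq` uses, so the sketch below assumes the two-term form common in corpus comparison, G² = 2 (a ln(a/E1) + b ln(b/E2)), where a and b are the observed counts and E1, E2 are the counts expected if the string occurred at the same rate in both texts. The class name `LogLikelihoodSketch` and its methods are hypothetical, and the real method may differ, for example by using all four cells of the contingency table.

```java
// Hypothetical illustration, not the MorphAdorner implementation.
class LogLikelihoodSketch {

    /**
     * Returns { analysis count, analysis parts per 10,000,
     *           reference count, reference parts per 10,000,
     *           log-likelihood } for one string.
     * Both total counts are assumed to be greater than zero.
     */
    static double[] compare(int analysisCount, int analysisTotalCount,
                            int refCount, int refTotalCount) {
        double a = analysisCount;
        double b = refCount;
        double c = analysisTotalCount;
        double d = refTotalCount;

        // Expected counts if the string had the same rate in both texts.
        double e1 = c * (a + b) / (c + d);
        double e2 = d * (a + b) / (c + d);

        // Assumption: two-term form of the log-likelihood ratio.
        double logLikelihood = 2.0 * (term(a, e1) + term(b, e2));

        return new double[] {
            a,
            10000.0 * a / c,
            b,
            10000.0 * b / d,
            logLikelihood
        };
    }

    /** One observed * ln(observed / expected) term; zero when nothing was observed. */
    static double term(double observed, double expected) {
        return observed > 0.0 ? observed * Math.log(observed / expected) : 0.0;
    }
}
```

When a string occurs at the same relative rate in both texts, the expected counts equal the observed counts and the statistic is zero. Larger values indicate a bigger difference in relative frequency between the analysis and reference texts.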
public static void displayResults(java.util.Map<CompareStringCounts.ReverseScoredString,double[]> results)
results
- The map of results to display.
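One way to obtain a table sorted from highest to lowest score is to key a `java.util.TreeMap` on an object whose natural ordering is reversed, which appears to be the role of `CompareStringCounts.ReverseScoredString`. The sketch below is an assumption about that idea rather than the actual class: `ReverseSortSketch`, `ReverseScored`, and `render` are hypothetical names, and the output here contains only the string followed by the array values, not the full seven-column layout.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration, not the MorphAdorner implementation.
class ReverseSortSketch {

    /** A string with a score, ordered from highest score to lowest. */
    static final class ReverseScored implements Comparable<ReverseScored> {
        final String text;
        final double score;

        ReverseScored(String text, double score) {
            this.text = text;
            this.score = score;
        }

        @Override
        public int compareTo(ReverseScored other) {
            // Arguments are swapped so that higher scores sort first.
            int byScore = Double.compare(other.score, this.score);
            // Break ties on the text so distinct strings never collide as keys.
            return byScore != 0 ? byScore : this.text.compareTo(other.text);
        }
    }

    /** Renders each entry as a tab-separated line, in the map's iteration order. */
    static String render(Map<ReverseScored, double[]> results) {
        StringBuilder table = new StringBuilder();
        for (Map.Entry<ReverseScored, double[]> entry : results.entrySet()) {
            table.append(entry.getKey().text);
            for (double value : entry.getValue()) {
                table.append('\t').append(value);
            }
            table.append('\n');
        }
        return table.toString();
    }
}
```

Because a `TreeMap` iterates its keys in `compareTo` order, filling one with `ReverseScored` keys and passing it to `render` yields lines in descending order of score without a separate sorting step.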