public class Frequency
extends java.lang.Object
Modifier | Constructor and Description |
---|---|
protected | Frequency() Don't allow instantiation but do allow overrides. |
Modifier and Type | Method and Description |
---|---|
static double[] | logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize) Compute log-likelihood statistic for comparing frequencies in two corpora. |
static double[] | logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize, boolean computeLLSig) Compute log-likelihood statistic for comparing frequencies in two corpora. |
protected Frequency()
public static double[] logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize, boolean computeLLSig)
sampleCount - Count of word/lemma appearance in sample.
refCount - Count of word/lemma appearance in reference corpus.
sampleSize - Total words/lemmas in the sample.
refSize - Total words/lemmas in reference corpus.
computeLLSig - Compute significance of log likelihood.
The contents of the result array are as follows.
(0) Count of word/lemma appearance in sample.
(1) Percent of word/lemma appearance in sample.
(2) Count of word/lemma appearance in reference.
(3) Percent of word/lemma appearance in reference.
(4) Log-likelihood measure.
(5) Significance of log-likelihood.
Any result that would require a division by zero is set to zero.
public static double[] logLikelihoodFrequencyComparison(int sampleCount, int refCount, int sampleSize, int refSize)
sampleCount - Count of word/lemma appearance in sample.
refCount - Count of word/lemma appearance in reference corpus.
sampleSize - Total words/lemmas in the sample.
refSize - Total words/lemmas in reference corpus.
The contents of the result array are as follows.
(0) Count of word/lemma appearance in sample.
(1) Percent of word/lemma appearance in sample.
(2) Count of word/lemma appearance in reference.
(3) Percent of word/lemma appearance in reference.
(4) Log-likelihood measure.
(5) Significance of log-likelihood.
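
The following is a minimal sketch of how elements (0) through (4) of the result array could be computed. It is not the source of this class. It assumes the commonly used two-corpus log-likelihood (G2) formulation, in which the expected count for each corpus is its size multiplied by the pooled relative frequency; this page does not state which formula the method actually uses. The names LogLikelihoodSketch, compare, safeDivide and term are hypothetical, and the significance value in element (5) is deliberately left out.

```java
import java.util.Arrays;

/** Illustrative sketch only; not the actual source of the Frequency class. */
final class LogLikelihoodSketch {

    /** Returns x / y, or 0.0 when y is zero (assumed zero-divide rule). */
    static double safeDivide(double x, double y) {
        return y == 0.0 ? 0.0 : x / y;
    }

    /** Returns count * ln(count / expected); degenerate inputs contribute 0. */
    static double term(double count, double expected) {
        if (count <= 0.0 || expected <= 0.0) {
            return 0.0;
        }
        return count * Math.log(count / expected);
    }

    /** Returns elements (0) through (4) of the documented result layout. */
    static double[] compare(int sampleCount, int refCount,
                            int sampleSize, int refSize) {
        double totalCount = (double) sampleCount + refCount;
        double totalSize = (double) sampleSize + refSize;

        // Expected counts if the word were equally frequent in both corpora.
        double expectedSample = safeDivide(sampleSize * totalCount, totalSize);
        double expectedRef = safeDivide(refSize * totalCount, totalSize);

        // Assumed formula: G2 = 2 * sum(observed * ln(observed / expected)).
        double logLikelihood = 2.0
            * (term(sampleCount, expectedSample) + term(refCount, expectedRef));

        return new double[] {
            sampleCount,                                  // (0)
            100.0 * safeDivide(sampleCount, sampleSize),  // (1)
            refCount,                                     // (2)
            100.0 * safeDivide(refCount, refSize),        // (3)
            logLikelihood                                 // (4)
        };
    }

    public static void main(String[] args) {
        // 10 occurrences in 1000 sample words vs. 5 in 2000 reference words.
        System.out.println(Arrays.toString(compare(10, 5, 1000, 2000)));
    }
}
```

With the inputs shown in main, the expected counts are 5 and 10, so under the assumed formula the log-likelihood works out to 10 * ln(2). A larger value indicates a bigger difference between the two relative frequencies.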