edu.northwestern.at.morphadorner.tools.sampletextfile
Class ExactlySampleTextFile

java.lang.Object
  extended by edu.northwestern.at.morphadorner.tools.sampletextfile.SampleTextFile
      extended by edu.northwestern.at.morphadorner.tools.sampletextfile.ExactlySampleTextFile

public class ExactlySampleTextFile
extends SampleTextFile

Extracts a random sample containing an exact number of lines from a text file.

Usage:

java edu.northwestern.at.morphadorner.tools.sampletextfile.ExactlySampleTextFile input.txt output.txt samplecount

input.txt -- input text file to be sampled.
output.txt -- output text file.
samplecount -- Size of random sample to extract. Must be a positive integer.

The output file is a text file containing the lines sampled from the input file. Both the input and the output must be UTF-8 encoded. Sampled lines are appended to any lines already present in the output file.
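
The append behavior described above can be illustrated with a minimal sketch. This is not the MorphAdorner source: the class name AppendUtf8Sketch and the method appendLines are hypothetical, and the sketch only assumes that the tool opens its output in append mode with UTF-8 encoding, as the description states.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical illustration of appending UTF-8 lines to an output file. */
class AppendUtf8Sketch {

    /** Appends each line to the output file, creating the file if it is missing. */
    static void appendLines(Path output, List<String> lines) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(
                output,
                StandardCharsets.UTF_8,
                StandardOpenOption.CREATE,
                StandardOpenOption.APPEND)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path output = Files.createTempFile("sample-output", ".txt");
        appendLines(output, List.of("first run"));
        appendLines(output, List.of("second run"));
        // Both runs survive because the second call appends rather than truncates.
        System.out.println(Files.readAllLines(output, StandardCharsets.UTF_8));
        Files.delete(output);
    }
}
```

One practical consequence, if the tool behaves as described: running it twice against the same output file accumulates two samples, so delete or rename the output first when a fresh sample is wanted.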


Field Summary
protected  int sampleCount
          Count of lines left to sample.
protected  int totalCount
          Count of lines left in input file.
 
Constructor Summary
ExactlySampleTextFile(java.lang.String inputFileName, java.lang.String outputFileName, int sample)
          Copy a text file to another while sampling the input lines.
 
Method Summary
static void help()
          Help text.
protected  boolean lineSelected()
          Check if line should be selected.
static void main(java.lang.String[] args)
          Main program.
 boolean samplingDone()
          Determine if sampling is done.
protected  void setupSampling(java.lang.String inputFileName, java.lang.String outputFileName, double sample)
          Set up sampling.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

totalCount

protected int totalCount
Count of lines left in input file.


sampleCount

protected int sampleCount
Count of lines left to sample.

Constructor Detail

ExactlySampleTextFile

public ExactlySampleTextFile(java.lang.String inputFileName,
                             java.lang.String outputFileName,
                             int sample)
Copy a text file to another while sampling the input lines.

Parameters:
inputFileName - Input file name.
outputFileName - Output file name.
sample - Number of lines to sample.

Method Detail

main

public static void main(java.lang.String[] args)
Main program.

Parameters:
args - Program parameters.

help

public static void help()
Help text.


setupSampling

protected void setupSampling(java.lang.String inputFileName,
                             java.lang.String outputFileName,
                             double sample)
Set up sampling.

Specified by:
setupSampling in class SampleTextFile
Parameters:
inputFileName - Input file name.
outputFileName - Output file name.
sample - Sample count, percentage, etc.

lineSelected

protected boolean lineSelected()
Check if line should be selected.

Specified by:
lineSelected in class SampleTextFile
Returns:
true to select line.

samplingDone

public boolean samplingDone()
Determine if sampling is done.

Overrides:
samplingDone in class SampleTextFile
Returns:
true if sampling is done.
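
The sampleCount and totalCount fields, together with lineSelected() and samplingDone(), are consistent with classic selection sampling, in which each line is chosen with probability (lines left to sample) / (lines left in input). The following is a minimal sketch under that assumption, not the MorphAdorner implementation. The class SelectionSamplerSketch, its constructor, and the sample helper are hypothetical; only the field and method names mirror this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical selection sampler that yields exactly the requested number of lines. */
class SelectionSamplerSketch {
    private int totalCount;   // lines left in the input
    private int sampleCount;  // lines left to sample
    private final Random random;

    SelectionSamplerSketch(int totalLines, int sampleSize, Random random) {
        this.totalCount = totalLines;
        // Assumption: a request larger than the input is capped at the input size.
        this.sampleCount = Math.min(sampleSize, totalLines);
        this.random = random;
    }

    /** Selects the current line with probability sampleCount / totalCount. */
    boolean lineSelected() {
        boolean selected = random.nextInt(totalCount) < sampleCount;
        if (selected) {
            sampleCount--;
        }
        totalCount--;
        return selected;
    }

    /** Sampling is done once no lines are left to sample. */
    boolean samplingDone() {
        return sampleCount <= 0;
    }

    /** Runs the sampler over an in-memory list, preserving input order. */
    static List<String> sample(List<String> lines, int sampleSize, Random random) {
        SelectionSamplerSketch sampler =
                new SelectionSamplerSketch(lines.size(), sampleSize, random);
        List<String> chosen = new ArrayList<>();
        for (String line : lines) {
            if (sampler.samplingDone()) {
                break; // stop reading early once the sample is complete
            }
            if (sampler.lineSelected()) {
                chosen.add(line);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a", "b", "c", "d", "e");
        // The size is always exactly 3; which lines are chosen depends on the seed.
        System.out.println(sample(lines, 3, new Random(42)).size());
    }
}
```

When the number of lines left to sample equals the number of lines left in the input, the selection probability reaches 1, which is why this scheme produces an exact count rather than an approximate one. It requires the total line count up front, so a file-based version would need a counting pass before the sampling pass.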