public class SentenceAndTokenOffsets
extends java.lang.Object
Usage:
java -Xmx256m edu.northwestern.at.morphadorner.example.SentenceAndTokenOffsets InputFileName
where "InputFileName" specifies the name of a text file to split into sentences and word tokens. The default sentence splitter, tokenizer, part of speech guesser, and word and suffix lexicons are used.
Example:
java -Xmx256m edu.northwestern.at.morphadorner.example.SentenceAndTokenOffsets mytext.txt
The output displays each extracted sentence along with its starting and ending offset in the text read from the specified input file. For each sentence, a list of the extracted tokens in that sentence is displayed along with each token's starting and ending offset relative to the start of the sentence text.
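The offset scheme described above can be illustrated with a minimal sketch built only on the Java standard library. This is not MorphAdorner's implementation: it swaps in `java.text.BreakIterator` for the sentence splitter and a whitespace regular expression for the tokenizer. The class name `OffsetSketch`, both helper methods, and the choice of exclusive end offsets are assumptions made for illustration; the real program may report end offsets differently. `Files.readString` requires Java 11 or later.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of MorphAdorner.
public class OffsetSketch {

    /** Returns {start, end} pairs (end exclusive) for each sentence, relative to the full text. */
    static List<int[]> sentenceOffsets(String text) {
        List<int[]> result = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            // BreakIterator keeps trailing whitespace with the sentence; trim it off.
            int trimmedEnd = end;
            while (trimmedEnd > start && Character.isWhitespace(text.charAt(trimmedEnd - 1))) {
                trimmedEnd--;
            }
            if (trimmedEnd > start) {
                result.add(new int[] {start, trimmedEnd});
            }
        }
        return result;
    }

    /** Returns {start, end} pairs (end exclusive) for each token, relative to the sentence start. */
    static List<int[]> tokenOffsets(String sentence) {
        List<int[]> result = new ArrayList<>();
        // Assumption: tokens are runs of non-whitespace characters.
        Matcher m = Pattern.compile("\\S+").matcher(sentence);
        while (m.find()) {
            result.add(new int[] {m.start(), m.end()});
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Read the named file if one is given; otherwise fall back to a sample string.
        String text = args.length > 0
            ? Files.readString(Path.of(args[0]))
            : "Hello world. Bye now.";
        for (int[] s : sentenceOffsets(text)) {
            String sentence = text.substring(s[0], s[1]);
            System.out.println("[" + s[0] + "," + s[1] + ") " + sentence);
            for (int[] t : tokenOffsets(sentence)) {
                System.out.println("    [" + t[0] + "," + t[1] + ") "
                    + sentence.substring(t[0], t[1]));
            }
        }
    }
}
```

Sentence offsets are measured from the start of the input text, while token offsets restart at zero for each sentence, matching the two levels of offsets in the description above.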
Constructor and Description |
---|
SentenceAndTokenOffsets() |
Modifier and Type | Method and Description |
---|---|
static void | displayOffsets(java.lang.String inputFileName): Display sentence and token offsets in text. |
static void | main(java.lang.String[] args): Main program. |
public static void main(java.lang.String[] args)
Main program.
Parameters:
args - Command line arguments.

public static void displayOffsets(java.lang.String inputFileName) throws java.lang.Exception
Display sentence and token offsets in text.
Parameters:
inputFileName - Input file name.
Throws:
java.lang.Exception