public class FixQuotes
extends java.lang.Object
Usage:
java edu.northwestern.at.morphadorner.tools.fixquotes.FixQuotes input.txt output.txt
input.txt -- input text file.
output.txt -- output text file with quotes fixed.
Based in part on the Perl and PHP "SmartyPants" programs by John Gruber, Brad Choate, and Michel Fortin.
Since the "quotification" relies on heuristics, not all quotes will be converted correctly.
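To illustrate why a heuristic approach cannot get every quote right, here is a minimal sketch of one common rule: a straight quote at the start of the text or after whitespace or an opening bracket is treated as an opening quote, and every other straight quote as a closing quote or apostrophe. This is not the FixQuotes implementation. The class name `QuoteHeuristicSketch`, the method `curl`, and the Unicode replacement strings are assumptions made for illustration only.

```java
import java.util.regex.Pattern;

public class QuoteHeuristicSketch {
    // Hypothetical replacement strings; the actual FixQuotes fields may hold other values.
    static final String LDQUO = "\u201C";
    static final String RDQUO = "\u201D";
    static final String LSQUO = "\u2018";
    static final String RSQUO = "\u2019";

    // A straight quote at the start of the text, or after whitespace or an
    // opening bracket, is assumed to open a quotation.
    private static final Pattern OPEN_DOUBLE = Pattern.compile("(^|[\\s(\\[{])\"");
    private static final Pattern OPEN_SINGLE = Pattern.compile("(^|[\\s(\\[{])'");

    static String curl(String s) {
        // Opening quotes first, then every remaining straight quote is
        // treated as a closing quote (or apostrophe, for single quotes).
        s = OPEN_DOUBLE.matcher(s).replaceAll("$1" + LDQUO);
        s = s.replace("\"", RDQUO);
        s = OPEN_SINGLE.matcher(s).replaceAll("$1" + LSQUO);
        s = s.replace("'", RSQUO);
        return s;
    }

    public static void main(String[] args) {
        // Handled as intended: opening and closing double quotes plus an apostrophe.
        System.out.println(curl("\"Don't go,\" she said."));
        // Misjudged: the leading apostrophe of 'Twas is taken for an opening quote.
        System.out.println(curl("'Twas the night"));
    }
}
```

The second call shows the kind of case the simple rule gets wrong: a word-initial apostrophe looks exactly like an opening single quote. A list of known contractions, such as the one accepted by `repairQuotes`, is one way to reduce such errors.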
Modifier and Type | Field and Description |
---|---|
protected static java.lang.String | apos: Apostrophe replacement text. |
protected static java.lang.String | dq: Temporary double quote marker. |
protected static java.lang.String | ldquo: Left double quote replacement text. |
protected static java.lang.String | lsquo: Left single quote replacement text. |
protected static java.lang.String | rdquo: Right double quote replacement text. |
protected static java.lang.String | rsquo: Right single quote replacement text. |
protected static java.lang.String | sq: Temporary single quote marker. |
Modifier | Constructor and Description |
---|---|
protected | FixQuotes(): Allow overrides but not instantiation. |
Modifier and Type | Method and Description |
---|---|
static java.util.regex.Pattern | buildContractionsPattern(TaggedStrings contractions): Build contractions pattern. |
static TaggedStrings | loadContractions(java.lang.String contractionsURL): Load list of non-breakable words and contractions. |
static void | main(java.lang.String[] args): Main program. |
static java.lang.String | repairQuotes(java.lang.String s, java.util.regex.Matcher contractionsMatcher, TaggedStrings contractions): Fix quotes and apostrophes in text. |
protected static final java.lang.String lsquo
protected static final java.lang.String ldquo
protected static final java.lang.String rsquo
protected static final java.lang.String rdquo
protected static final java.lang.String apos
protected static final java.lang.String sq
protected static final java.lang.String dq
public static void main(java.lang.String[] args)
public static java.lang.String repairQuotes(java.lang.String s, java.util.regex.Matcher contractionsMatcher, TaggedStrings contractions)
s - The text to fix.
contractionsMatcher - Matcher for known contractions.
The following operations are performed on the input text to generate the fixed output text.
public static TaggedStrings loadContractions(java.lang.String contractionsURL)
public static java.util.regex.Pattern buildContractionsPattern(TaggedStrings contractions)
contractions - Contractions as a tagged strings list.
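As a rough illustration of what building a contractions pattern can involve, the sketch below compiles a list of contractions into a single case-insensitive alternation. It is an assumption-laden stand-in, not the actual `buildContractionsPattern` code: a plain `List<String>` replaces `TaggedStrings`, and the class name `ContractionsPatternSketch` and method `build` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class ContractionsPatternSketch {
    // Hypothetical stand-in: takes a List<String> rather than TaggedStrings.
    static Pattern build(List<String> contractions) {
        String alternation = contractions.stream()
            // Longest entries first, so a longer contraction wins over a shorter prefix.
            .sorted((a, b) -> b.length() - a.length())
            // Quote each entry so apostrophes and other characters are matched literally.
            .map(Pattern::quote)
            .collect(Collectors.joining("|"));
        // Letter lookarounds are used instead of \b because a contraction may
        // begin with an apostrophe, which is not a word character.
        return Pattern.compile(
            "(?<!\\p{L})(?:" + alternation + ")(?!\\p{L})",
            Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        Pattern p = build(Arrays.asList("don't", "'tis", "o'clock"));
        Matcher m = p.matcher("'Tis five o'clock; DON'T wait.");
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}
```

A matcher created from such a pattern could then be used to recognize apostrophes that belong to known contractions before any quote heuristics run, which is the role the `contractionsMatcher` parameter of `repairQuotes` suggests.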