Northwestern University Information Technology |
MorphAdorner V2.0 | Site Map |
Service name: | textsegmenter |
Service description: | Break up a text into thematically meaningful segments. |
HTTP methods allowed: | GET, POST, OPTIONS |
POST accepts as input: | application/x-www-form-urlencoded |
HTTP return codes: | 200: service succeeded; 400: service failed with an error |
Query parameters |
corpusConfig | Corpus configuration name. In the standard distribution these are ece, eme, and ncf. |
c99MaskSize | The C99 mask size. The default value is 11. |
c99SegmentsWanted | The number of text segments wanted from the C99 algorithm. The default value is -1, which lets the algorithm determine the number of segments. |
includeInputText | Whether to include the input text in the output. Allowed values are true (include it) and false (omit it). |
media | Result format. One of json, xml, html, text. |
segmenterName | Text segmenter method name. The allowed values are C99 and Text Tiling. Text Tiling is the default. |
text | Text to be processed. |
tilerSlidingWindowSize | The sliding window size for the Text Tiling algorithm. The default value is 10. |
tilerStepSize | The Text Tiling step size. The default value is 100. |
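The parameters above can be assembled into a form-encoded POST body. The following is a minimal sketch, not the service's official client: the `SERVICE_URL` host and the `build_segmenter_request` helper are hypothetical names introduced for illustration, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: replace with the address of your own MorphAdorner server.
SERVICE_URL = "http://example.org/textsegmenter"


def build_segmenter_request(text, segmenter_name="Text Tiling", media="json", **extra):
    """Build (but do not send) a POST request for the textsegmenter service."""
    params = {
        "text": text,
        "corpusConfig": "ncf",
        "segmenterName": segmenter_name,
        "media": media,
        "includeInputText": "false",
    }
    # Optional tuning parameters, e.g. tilerStepSize or c99MaskSize.
    params.update(extra)
    body = urlencode(params).encode("utf-8")
    return Request(
        SERVICE_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
        method="POST",
    )


request = build_segmenter_request("Four score and seven years ago ...", tilerStepSize=100)
print(request.get_method())
print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it. Per the table above, a 400 status indicates the service failed with an error.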
<form accept-charset="UTF-8" method="post" action="textsegmenter" target="_blank" name="segmenter"> <table cellpadding="0" cellspacing="5"> <tr> <td><strong>Text:</strong></td> <td colspan="2"> <textarea name="text" rows="15" cols="76"></textarea> </td> </tr> <tr> <td> </td> <td> <input type="checkbox" name="includeInputText" value="true" checked="checked"/> Include input text in results </td> </tr> <tr> <td> </td> <td> </td> </tr> <tr> <td valign="top"> <strong> Lexicon:</strong> </td> <td> <input type="radio" name="corpusConfig" value="eme">Early Modern English</input><br /> <input type="radio" name="corpusConfig" value="ece">Eighteenth Century English</input><br /> <input type="radio" name="corpusConfig" value="ncf" checked="checked">Nineteenth Century Fiction</input> </td> </tr> <tr> <td> </td> <td> </td> </tr> <tr> <td valign="top"> <strong> Segmenter:</strong> </td> <td> <input type="radio" name="segmenterName" value="C99">C99</input><br /> <table border="0"> <tr> <td> </td> <td> Mask size: </td> <td> <input type="text" name="c99MaskSize" size="5" value="11" /></input> </td> </tr> <tr> <td> </td> <td> Segments desired: </td> <td> <input type="text" name="c99SegmentsWanted" size="5" value="-1" /></input> </td> </tr> </table> <input type="radio" name="segmenterName" value="Text Tiling" checked="checked">Text Tiling</input><br /> <table border="0"> <tr> <td> </td> <td> Sliding window size: </td> <td> <input type="text" name="tilerSlidingWindowSize" size="5" value="10" /></input> </td> </tr> <tr> <td> </td> <td> Segment size: </td> <td> <input type="text" name="tilerStepSize" size="5" value="100" /></input> </td> </tr> </table> </td> </tr> <tr> <td> </td> <td> </td> </tr> <tr> <td valign="top"> <strong>Results format:</strong> </td> <td> <input type="radio" name="media" value="json">JSON format</input><br /> <input type="radio" name="media" value="xml" checked="checked">XML format</input><br /> <input type="radio" name="media" value="html">HTML format</input><br />
<input type="radio" name="media" value="text">Text format</input> </td> </tr> <tr> <td> </td> <td> </td> </tr> <tr> <td colspan="2"> <input type="submit" name="segment" value="Segment" /> </td> </tr> </table> </form>
Here is sample output for the text segmenter service. We use Abraham Lincoln's "Gettysburg Address" as the text. We select the default Text Tiling method.
The JSON and XML outputs echo the input values. The sentences container wraps a sequence of sentence entries, each of which represents a single parsed sentence from the input text. Each sentence contains a sequence of token entries representing the words and punctuation in the sentence. The segments container wraps a list of integer values giving the 0-origin index of the first sentence of each text segment. The segmentTexts container wraps a series of segmentText entries, each of which provides the melded text of the sentences that make up one segment, in order. The HTML and text outputs provide displayable versions of the segment texts and do not echo the input values.
{ "TextSegmenterResult": { "text": "Four score and seven years ago our fathers brought forth on this\r\ncontinent a new nation, conceived in Liberty, and dedicated to\r\nthe proposition that all men are created equal.\r\n\r\nNow we are engaged in a great civil war, testing whether that\r\nnation, or any nation, so conceived and so dedicated, can long\r\nendure. We are met on a great battle-field of that war. We have\r\ncome to dedicate a portion of that field, as a final resting\r\nplace for those who here gave their lives that that nation might\r\nlive. It is altogether fitting and proper that we should do\r\nthis.\r\n\r\nBut, in a larger sense, we can not dedicate -- we can not\r\nconsecrate -- we can not hallow -- this ground. The brave men, living\r\nand dead, who struggled here, have consecrated it, far above our\r\npoor power to add or detract. The world will little note, nor\r\nlong remember what we say here, but it can never forget what\r\nthey did here. It is for us the living, rather, to be dedicated\r\nhere to the unfinished work which they who fought here have thus\r\nfar so nobly advanced. 
It is rather for us to be here dedicated\r\nto the great task remaining before us -- that from these honored\r\ndead we take increased devotion to that cause for which they\r\ngave the last full measure of devotion -- that we here highly\r\nresolve that these dead shall not have died in vain -- that this\r\nnation, under God, shall have a new birth of freedom -- and that\r\ngovernment: of the people, by the people, for the people, shall\r\nnot perish from the earth.", "corpusConfig": "ncf", "c99MaskSize": 11, "c99SegmentsWanted": -1, "tilerSlidingWindowSize": 10, "tilerStepSize": 100, "sentences": [ { "sentence": [ { "token": [ "Four", "score", "and", "seven", "years", "ago", "our", "fathers", "brought", "forth", "on", "this", "continent", "a", "new", "nation", ",", "conceived", "in", "Liberty", ",", "and", "dedicated", "to", "the", "proposition", "that", "all", "men", "are", "created", "equal", "." ] }, { "token": [ "Now", "we", "are", "engaged", "in", "a", "great", "civil", "war", ",", "testing", "whether", "that", "nation", ",", "or", "any", "nation", ",", "so", "conceived", "and", "so", "dedicated", ",", "can", "long", "endure", "." ] }, { "token": [ "We", "are", "met", "on", "a", "great", "battle-field", "of", "that", "war", "." ] }, { "token": [ "We", "have", "come", "to", "dedicate", "a", "portion", "of", "that", "field", ",", "as", "a", "final", "resting", "place", "for", "those", "who", "here", "gave", "their", "lives", "that", "that", "nation", "might", "live", "." ] }, { "token": [ "It", "is", "altogether", "fitting", "and", "proper", "that", "we", "should", "do", "this", "." ] }, { "token": [ "But", ",", "in", "a", "larger", "sense", ",", "we", "can", "not", "dedicate", "--", "we", "can", "not", "consecrate", "--", "we", "can", "not", "hallow", "--", "this", "ground", "." 
] }, { "token": [ "The", "brave", "men", ",", "living", "and", "dead", ",", "who", "struggled", "here", ",", "have", "consecrated", "it", ",", "far", "above", "our", "poor", "power", "to", "add", "or", "detract", "." ] }, { "token": [ "The", "world", "will", "little", "note", ",", "nor", "long", "remember", "what", "we", "say", "here", ",", "but", "it", "can", "never", "forget", "what", "they", "did", "here", "." ] }, { "token": [ "It", "is", "for", "us", "the", "living", ",", "rather", ",", "to", "be", "dedicated", "here", "to", "the", "unfinished", "work", "which", "they", "who", "fought", "here", "have", "thus", "far", "so", "nobly", "advanced", "." ] }, { "token": [ "It", "is", "rather", "for", "us", "to", "be", "here", "dedicated", "to", "the", "great", "task", "remaining", "before", "us", "--", "that", "from", "these", "honored", "dead", "we", "take", "increased", "devotion", "to", "that", "cause", "for", "which", "they", "gave", "the", "last", "full", "measure", "of", "devotion", "--", "that", "we", "here", "highly", "resolve", "that", "these", "dead", "shall", "not", "have", "died", "in", "vain", "--", "that", "this", "nation", ",", "under", "God", ",", "shall", "have", "a", "new", "birth", "of", "freedom", "--", "and", "that", "government", ":", "of", "the", "people", ",", "by", "the", "people", ",", "for", "the", "people", ",", "shall", "not", "perish", "from", "the", "earth", "." ] } ] } ], "segments": [ { "int": [ 0, 5 ] } ], "segmenterName": "Text Tiling", "segmentTexts": [ { "segmentText": [ "Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. 
We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this. ", "But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government: of the people, by the people, for the people, shall not perish from the earth. " ] } ] } }
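The JSON layout shown above can be turned into sentence ranges per segment. This is a rough sketch that assumes the wrapper objects (`sentence`, `int`) always appear as they do in the sample; the `segment_sentence_ranges` helper and the shortened `sample` document are illustrative and not part of the service.

```python
import json


def segment_sentence_ranges(result):
    """Return (first_sentence, end_sentence) index pairs, end exclusive."""
    body = result["TextSegmenterResult"]
    # Flatten the wrapper objects that the JSON serialization adds.
    sentences = [s for wrapper in body["sentences"] for s in wrapper["sentence"]]
    starts = [i for wrapper in body["segments"] for i in wrapper["int"]]
    # Each segment ends where the next one starts; the last ends at the final sentence.
    ends = starts[1:] + [len(sentences)]
    return list(zip(starts, ends))


# Shortened, made-up document with the same shape as the sample output.
sample = json.loads("""
{ "TextSegmenterResult": {
    "sentences": [ { "sentence": [
        { "token": [ "One", "." ] },
        { "token": [ "Two", "." ] },
        { "token": [ "Three", "." ] } ] } ],
    "segments": [ { "int": [ 0, 2 ] } ],
    "segmentTexts": [ { "segmentText": [ "One. Two. ", "Three. " ] } ] } }
""")

ranges = segment_sentence_ranges(sample)
print(ranges)
```

Applied to the Gettysburg Address output, where the segments are `[0, 5]` and there are ten sentences, the same logic gives sentences 0 to 4 for the first segment and 5 to 9 for the second.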
<TextSegmenterResult> <text>Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this. But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. 
It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government: of the people, by the people, for the people, shall not perish from the earth.</text> <corpusConfig>ncf</corpusConfig> <c99MaskSize>11</c99MaskSize> <c99SegmentsWanted>-1</c99SegmentsWanted> <tilerSlidingWindowSize>10</tilerSlidingWindowSize> <tilerStepSize>100</tilerStepSize> <sentences> <sentence> <token>Four</token> <token>score</token> <token>and</token> <token>seven</token> <token>years</token> <token>ago</token> <token>our</token> <token>fathers</token> <token>brought</token> <token>forth</token> <token>on</token> <token>this</token> <token>continent</token> <token>a</token> <token>new</token> <token>nation</token> <token>,</token> <token>conceived</token> <token>in</token> <token>Liberty</token> <token>,</token> <token>and</token> <token>dedicated</token> <token>to</token> <token>the</token> <token>proposition</token> <token>that</token> <token>all</token> <token>men</token> <token>are</token> <token>created</token> <token>equal</token> <token>.</token> </sentence> <sentence> <token>Now</token> <token>we</token> <token>are</token> <token>engaged</token> <token>in</token> <token>a</token> <token>great</token> <token>civil</token> <token>war</token> <token>,</token> <token>testing</token> <token>whether</token> <token>that</token> <token>nation</token> <token>,</token> <token>or</token> <token>any</token> <token>nation</token> <token>,</token> <token>so</token> <token>conceived</token> <token>and</token> <token>so</token> <token>dedicated</token> <token>,</token> <token>can</token> <token>long</token> <token>endure</token> <token>.</token> </sentence> <sentence> <token>We</token> 
<token>are</token> <token>met</token> <token>on</token> <token>a</token> <token>great</token> <token>battle-field</token> <token>of</token> <token>that</token> <token>war</token> <token>.</token> </sentence> <sentence> <token>We</token> <token>have</token> <token>come</token> <token>to</token> <token>dedicate</token> <token>a</token> <token>portion</token> <token>of</token> <token>that</token> <token>field</token> <token>,</token> <token>as</token> <token>a</token> <token>final</token> <token>resting</token> <token>place</token> <token>for</token> <token>those</token> <token>who</token> <token>here</token> <token>gave</token> <token>their</token> <token>lives</token> <token>that</token> <token>that</token> <token>nation</token> <token>might</token> <token>live</token> <token>.</token> </sentence> <sentence> <token>It</token> <token>is</token> <token>altogether</token> <token>fitting</token> <token>and</token> <token>proper</token> <token>that</token> <token>we</token> <token>should</token> <token>do</token> <token>this</token> <token>.</token> </sentence> <sentence> <token>But</token> <token>,</token> <token>in</token> <token>a</token> <token>larger</token> <token>sense</token> <token>,</token> <token>we</token> <token>can</token> <token>not</token> <token>dedicate</token> <token>--</token> <token>we</token> <token>can</token> <token>not</token> <token>consecrate</token> <token>--</token> <token>we</token> <token>can</token> <token>not</token> <token>hallow</token> <token>--</token> <token>this</token> <token>ground</token> <token>.</token> </sentence> <sentence> <token>The</token> <token>brave</token> <token>men</token> <token>,</token> <token>living</token> <token>and</token> <token>dead</token> <token>,</token> <token>who</token> <token>struggled</token> <token>here</token> <token>,</token> <token>have</token> <token>consecrated</token> <token>it</token> <token>,</token> <token>far</token> <token>above</token> <token>our</token> <token>poor</token> 
<token>power</token> <token>to</token> <token>add</token> <token>or</token> <token>detract</token> <token>.</token> </sentence> <sentence> <token>The</token> <token>world</token> <token>will</token> <token>little</token> <token>note</token> <token>,</token> <token>nor</token> <token>long</token> <token>remember</token> <token>what</token> <token>we</token> <token>say</token> <token>here</token> <token>,</token> <token>but</token> <token>it</token> <token>can</token> <token>never</token> <token>forget</token> <token>what</token> <token>they</token> <token>did</token> <token>here</token> <token>.</token> </sentence> <sentence> <token>It</token> <token>is</token> <token>for</token> <token>us</token> <token>the</token> <token>living</token> <token>,</token> <token>rather</token> <token>,</token> <token>to</token> <token>be</token> <token>dedicated</token> <token>here</token> <token>to</token> <token>the</token> <token>unfinished</token> <token>work</token> <token>which</token> <token>they</token> <token>who</token> <token>fought</token> <token>here</token> <token>have</token> <token>thus</token> <token>far</token> <token>so</token> <token>nobly</token> <token>advanced</token> <token>.</token> </sentence> <sentence> <token>It</token> <token>is</token> <token>rather</token> <token>for</token> <token>us</token> <token>to</token> <token>be</token> <token>here</token> <token>dedicated</token> <token>to</token> <token>the</token> <token>great</token> <token>task</token> <token>remaining</token> <token>before</token> <token>us</token> <token>--</token> <token>that</token> <token>from</token> <token>these</token> <token>honored</token> <token>dead</token> <token>we</token> <token>take</token> <token>increased</token> <token>devotion</token> <token>to</token> <token>that</token> <token>cause</token> <token>for</token> <token>which</token> <token>they</token> <token>gave</token> <token>the</token> <token>last</token> <token>full</token> <token>measure</token> <token>of</token> 
<token>devotion</token> <token>--</token> <token>that</token> <token>we</token> <token>here</token> <token>highly</token> <token>resolve</token> <token>that</token> <token>these</token> <token>dead</token> <token>shall</token> <token>not</token> <token>have</token> <token>died</token> <token>in</token> <token>vain</token> <token>--</token> <token>that</token> <token>this</token> <token>nation</token> <token>,</token> <token>under</token> <token>God</token> <token>,</token> <token>shall</token> <token>have</token> <token>a</token> <token>new</token> <token>birth</token> <token>of</token> <token>freedom</token> <token>--</token> <token>and</token> <token>that</token> <token>government</token> <token>:</token> <token>of</token> <token>the</token> <token>people</token> <token>,</token> <token>by</token> <token>the</token> <token>people</token> <token>,</token> <token>for</token> <token>the</token> <token>people</token> <token>,</token> <token>shall</token> <token>not</token> <token>perish</token> <token>from</token> <token>the</token> <token>earth</token> <token>.</token> </sentence> </sentences> <segments> <int>0</int> <int>5</int> </segments> <segmenterName>Text Tiling</segmenterName> <segmentTexts> <segmentText>Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this. </segmentText> <segmentText>But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground. 
The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government: of the people, by the people, for the people, shall not perish from the earth. </segmentText> </segmentTexts> </TextSegmenterResult>
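The XML form of the result can be read in a similar way. The sketch below uses Python's standard `xml.etree.ElementTree` and assumes the element names shown in the sample output; the `read_segments` helper and the shortened `sample_xml` document are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET


def read_segments(xml_text):
    """Pair each segment's first-sentence index with its melded text."""
    root = ET.fromstring(xml_text)
    starts = [int(e.text) for e in root.findall("./segments/int")]
    texts = [(e.text or "").strip() for e in root.findall("./segmentTexts/segmentText")]
    return list(zip(starts, texts))


# Shortened, made-up document with the same element names as the sample output.
sample_xml = """
<TextSegmenterResult>
  <segments> <int>0</int> <int>2</int> </segments>
  <segmenterName>Text Tiling</segmenterName>
  <segmentTexts>
    <segmentText>One. Two. </segmentText>
    <segmentText>Three. </segmentText>
  </segmentTexts>
</TextSegmenterResult>
"""

segments = read_segments(sample_xml)
for start, text in segments:
    print(start, text)
```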
<h3>2 segments found using Text Tiling.</h3> <table border="0"> <tr> <th align="left">Segment</th> <th align="left">Text</th> </tr> <tr> <td valign="top" align="left"><strong>1</strong></td> <td valign="top" align="left">Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.</td> </tr> <tr> <td valign="top" align="left"><strong>2</strong></td> <td valign="top" align="left">But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government: of the people, by the people, for the people, shall not perish from the earth.</td> </tr> </table>
Segment | Text |
---|---|
1 | Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this. |
2 | But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government: of the people, by the people, for the people, shall not perish from the earth. |
2 segments found using Text Tiling. Segment Text 1 Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this. 2 But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government: of the people, by the people, for the people, shall not perish from the earth.
Academic Technologies and Research Services,
NU Library 2East, 1970 Campus Drive Evanston, IL 60208. |