Northwestern University Information Technology |
MorphAdorner V2.0 | Site Map |
Service name: | sentencesplitter |
Service description: | Splits plain text into sentences. |
HTTP methods allowed: | GET, POST, OPTIONS |
POST accepts as input: | application/x-www-form-urlencoded |
HTTP return codes: | 200: service succeeded; 400: service failed with an error |
Query parameters |
corpusConfig | Corpus configuration name. In the standard distribution these are ece, eme, and ncf. |
media | Result format. One of json, xml, html, text. |
text | Text to be processed. |
includeInputText | Allowed values are true (include the input text in the output) and false (omit it). |
langCode | ISO language code, two or three characters long. The default is en (English). You may specify *** Detect *** to indicate that the server should try to determine the language from the text provided; in the form below, that option submits an empty langCode value. |
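The parameters above can also be sent programmatically. The following is a minimal Python sketch, not part of the MorphAdorner distribution. The `SERVICE_URL` value is a placeholder, the function names `build_request` and `split_sentences` are hypothetical, and the response is assumed to be UTF-8 encoded. Only the parameter names, the POST content type, and the 200/400 return codes come from the description above.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# ASSUMPTION: placeholder address; replace with the URL of your own server.
SERVICE_URL = "http://example.org/morphadorner/sentencesplitter"


def build_request(text, corpus_config="ncf", media="json",
                  include_input_text=True, lang_code="en",
                  service_url=SERVICE_URL):
    """Build a POST request carrying the documented query parameters."""
    params = {
        "text": text,
        "corpusConfig": corpus_config,   # ece, eme, or ncf
        "media": media,                  # json, xml, html, or text
        "includeInputText": "true" if include_input_text else "false",
        "langCode": lang_code,
    }
    body = urlencode(params).encode("utf-8")
    return Request(
        service_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def split_sentences(text, **options):
    """Send the request and return the response body as a string.

    urllib raises HTTPError for a 400 response; it is re-raised here
    with a short message. ASSUMPTION: the body is UTF-8 encoded.
    """
    request = build_request(text, **options)
    try:
        with urlopen(request) as response:
            return response.read().decode("utf-8")
    except HTTPError as err:
        raise RuntimeError(
            f"sentencesplitter failed with HTTP {err.code}") from err
```

Calling `split_sentences("We are met. It is fitting.", media="xml")` against a running server would return the XML form of the result; `build_request` can be inspected on its own without any network access.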
<form accept-charset="UTF-8" method="post" action="sentencesplitter" target="_blank" name="sentencesplitter"> <table cellpadding="0" cellspacing="5"> <tr> <td><strong>Text:</strong></td> <td colspan="2"> <textarea name="text" rows="15" cols="76"></textarea> </td> </tr> <tr> <td valign="top"> <strong> Lexicon:</strong> </td> <td> <input type="radio" name="corpusConfig" value="eme">Early Modern English</input><br /> <input type="radio" name="corpusConfig" value="ece">Eighteenth Century English</input><br /> <input type="radio" name="corpusConfig" value="ncf" checked="checked">Nineteenth Century Fiction</input> </td> </tr> <tr> <td><strong>Language:</strong></td> <td> <select name="langCode"> <option value="en" selected="selected">English</option> <option value="">*** Detect ***</option> <option value="af">Afrikaans</option> <option value="ak">Akan</option> <option value="sq">Albanian</option> <option value="am">Amharic</option> <option value="ar">Arabic</option> <option value="hy">Armenian</option> <option value="as">Assamese</option> <option value="az">Azerbaijani</option> <option value="bm">Bambara</option> <option value="bas">Basa</option> <option value="eu">Basque</option> <option value="be">Belarusian</option> <option value="bem">Bemba</option> <option value="bn">Bengali</option> <option value="bs">Bosnian</option> <option value="br">Breton</option> <option value="bg">Bulgarian</option> <option value="my">Burmese</option> <option value="ca">Catalan</option> <option value="chr">Cherokee</option> <option value="zh">Chinese</option> <option value="kw">Cornish</option> <option value="hr">Croatian</option> <option value="cs">Czech</option> <option value="da">Danish</option> <option value="dua">Duala</option> <option value="nl">Dutch</option> <option value="eo">Esperanto</option> <option value="et">Estonian</option> <option value="ee">Ewe</option> <option value="ewo">Ewondo</option> <option value="fo">Faroese</option> <option value="fil">Filipino</option> <option 
value="fi">Finnish</option> <option value="fr">French</option> <option value="ff">Fulah</option> <option value="gl">Gallegan</option> <option value="lg">Ganda</option> <option value="ka">Georgian</option> <option value="de">German</option> <option value="el">Greek</option> <option value="kl">Greenlandic</option> <option value="gu">Gujarati</option> <option value="ha">Hausa</option> <option value="haw">Hawaiian</option> <option value="iw">Hebrew</option> <option value="hi">Hindi</option> <option value="hu">Hungarian</option> <option value="is">Icelandic</option> <option value="ig">Igbo</option> <option value="in">Indonesian</option> <option value="ga">Irish</option> <option value="it">Italian</option> <option value="ja">Japanese</option> <option value="kab">Kabyle</option> <option value="kam">Kamba</option> <option value="kn">Kannada</option> <option value="kk">Kazakh</option> <option value="km">Khmer</option> <option value="ki">Kikuyu</option> <option value="rw">Kinyarwanda</option> <option value="kok">Konkani</option> <option value="ko">Korean</option> <option value="lv">Latvian</option> <option value="ln">Lingala</option> <option value="lt">Lithuanian</option> <option value="lu">Luba-Katanga</option> <option value="mk">Macedonian</option> <option value="mg">Malagasy</option> <option value="ms">Malay</option> <option value="ml">Malayalam</option> <option value="mt">Maltese</option> <option value="gv">Manx</option> <option value="mr">Marathi</option> <option value="mas">Masai</option> <option value="ne">Nepali</option> <option value="nd">North Ndebele</option> <option value="nb">Norwegian Bokmål</option> <option value="nn">Norwegian Nynorsk</option> <option value="nyn">Nyankole</option> <option value="or">Oriya</option> <option value="om">Oromo</option> <option value="pa">Panjabi</option> <option value="fa">Persian</option> <option value="pl">Polish</option> <option value="pt">Portuguese</option> <option value="ps">Pushto</option> <option 
value="rm">Raeto-Romance</option> <option value="ro">Romanian</option> <option value="rn">Rundi</option> <option value="ru">Russian</option> <option value="sg">Sango</option> <option value="sr">Serbian</option> <option value="sn">Shona</option> <option value="ii">Sichuan Yi</option> <option value="si">Sinhalese</option> <option value="sk">Slovak</option> <option value="sl">Slovenian</option> <option value="so">Somali</option> <option value="es">Spanish</option> <option value="sw">Swahili</option> <option value="sv">Swedish</option> <option value="gsw">Swiss German</option> <option value="ta">Tamil</option> <option value="te">Telugu</option> <option value="th">Thai</option> <option value="bo">Tibetan</option> <option value="ti">Tigrinya</option> <option value="to">Tonga</option> <option value="tr">Turkish</option> <option value="uk">Ukrainian</option> <option value="ur">Urdu</option> <option value="uz">Uzbek</option> <option value="vai">Vai</option> <option value="vi">Vietnamese</option> <option value="cy">Welsh</option> <option value="yo">Yoruba</option> <option value="zu">Zulu</option> </select> </td> </tr> <tr> <td> </td> <td> <input type="checkbox" name="includeInputText" value="true" checked="checked"/> Include input text in results </td> </tr> <tr> <td> </td> <td> </td> </tr> <tr> <td valign="top"> <strong>Results format:</strong> </td> <td> <input type="radio" name="media" value="json">JSON format</input><br /> <input type="radio" name="media" value="xml" checked="checked">XML format</input><br /> <input type="radio" name="media" value="html">HTML format</input><br /> <input type="radio" name="media" value="text">Text format</input> </td> </tr> <tr> <td> </td> <td> </td> </tr> <tr> <td colspan="2"> <input type="submit" name="split" value="Split" /> </td> </tr> </table> </form>
Here we split a paragraph from Lincoln's "Gettysburg Address" into sentences.
Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.
The JSON and XML SentenceSplitterResult outputs echo the input text (when includeInputText is true), the ISO language code langCode, and the corpusConfig. The sentences container wraps a sequence of sentence entries, each of which represents a single parsed sentence from the input text. Each sentence contains a sequence of token entries representing the words and punctuation in the sentence. The meldedSentences container wraps a sequence of meldedSentence entries, each of which contains a single untokenized sentence. The HTML and text versions provide displayable versions of the extracted sentences.
{ "SentenceSplitterResult": { "text": "Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.", "langCode": "en", "corpusConfig": "ncf", "sentences": [ { "sentence": [ { "token": [ "Now", "we", "are", "engaged", "in", "a", "great", "civil", "war", ",", "testing", "whether", "that", "nation", ",", "or", "any", "nation", ",", "so", "conceived", "and", "so", "dedicated", ",", "can", "long", "endure", "." ] }, { "token": [ "We", "are", "met", "on", "a", "great", "battle-field", "of", "that", "war", "." ] }, { "token": [ "We", "have", "come", "to", "dedicate", "a", "portion", "of", "that", "field", ",", "as", "a", "final", "resting", "place", "for", "those", "who", "here", "gave", "their", "lives", "that", "that", "nation", "might", "live", "." ] }, { "token": [ "It", "is", "altogether", "fitting", "and", "proper", "that", "we", "should", "do", "this", "." ] } ] } ], "meldedSentences": [ { "meldedSentence": [ "Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure.", "We are met on a great battle-field of that war.", "We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live.", "It is altogether fitting and proper that we should do this." ] } ] } }
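The JSON result above can be unpacked with a few lines of Python. This is a sketch that assumes the nesting shown in the sample (a one-element `sentences` list wrapping `sentence` entries, and a one-element `meldedSentences` list wrapping `meldedSentence` strings). It has not been checked against other inputs, such as text containing a single sentence, where the server may serialize the containers differently. The function names and the abbreviated `SAMPLE` are illustrative only.

```python
import json

# Abbreviated sample in the same shape as the JSON result shown above.
SAMPLE = """
{ "SentenceSplitterResult": {
    "langCode": "en",
    "corpusConfig": "ncf",
    "sentences": [ { "sentence": [
        { "token": ["We", "are", "met", "on", "a", "great",
                    "battle-field", "of", "that", "war", "."] },
        { "token": ["It", "is", "altogether", "fitting", "and", "proper",
                    "that", "we", "should", "do", "this", "."] }
    ] } ],
    "meldedSentences": [ { "meldedSentence": [
        "We are met on a great battle-field of that war.",
        "It is altogether fitting and proper that we should do this."
    ] } ]
} }
"""


def melded_sentences(result_json):
    """Return the untokenized sentences as a flat list of strings."""
    result = json.loads(result_json)["SentenceSplitterResult"]
    sentences = []
    for wrapper in result["meldedSentences"]:
        sentences.extend(wrapper["meldedSentence"])
    return sentences


def tokenized_sentences(result_json):
    """Return one list of tokens (words and punctuation) per sentence."""
    result = json.loads(result_json)["SentenceSplitterResult"]
    tokenized = []
    for wrapper in result["sentences"]:
        for sentence in wrapper["sentence"]:
            tokenized.append(sentence["token"])
    return tokenized


if __name__ == "__main__":
    for number, sentence in enumerate(melded_sentences(SAMPLE), start=1):
        print(number, sentence)
```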
<?xml version="1.0"?> <SentenceSplitterResult> <text>Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. We are met on a great battle-field of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.</text> <langCode>en</langCode> <corpusConfig>ncf</corpusConfig> <sentences> <sentence> <token>Now</token> <token>we</token> <token>are</token> <token>engaged</token> <token>in</token> <token>a</token> <token>great</token> <token>civil</token> <token>war</token> <token>,</token> <token>testing</token> <token>whether</token> <token>that</token> <token>nation</token> <token>,</token> <token>or</token> <token>any</token> <token>nation</token> <token>,</token> <token>so</token> <token>conceived</token> <token>and</token> <token>so</token> <token>dedicated</token> <token>,</token> <token>can</token> <token>long</token> <token>endure</token> <token>.</token> </sentence> <sentence> <token>We</token> <token>are</token> <token>met</token> <token>on</token> <token>a</token> <token>great</token> <token>battle-field</token> <token>of</token> <token>that</token> <token>war</token> <token>.</token> </sentence> <sentence> <token>We</token> <token>have</token> <token>come</token> <token>to</token> <token>dedicate</token> <token>a</token> <token>portion</token> <token>of</token> <token>that</token> <token>field</token> <token>,</token> <token>as</token> <token>a</token> <token>final</token> <token>resting</token> <token>place</token> <token>for</token> <token>those</token> <token>who</token> <token>here</token> <token>gave</token> <token>their</token> <token>lives</token> <token>that</token> <token>that</token> <token>nation</token> <token>might</token> <token>live</token> <token>.</token> </sentence> <sentence> <token>It</token> <token>is</token> 
<token>altogether</token> <token>fitting</token> <token>and</token> <token>proper</token> <token>that</token> <token>we</token> <token>should</token> <token>do</token> <token>this</token> <token>.</token> </sentence> </sentences> <meldedSentences> <meldedSentence>Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure.</meldedSentence> <meldedSentence>We are met on a great battle-field of that war.</meldedSentence> <meldedSentence>We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live.</meldedSentence> <meldedSentence>It is altogether fitting and proper that we should do this.</meldedSentence> </meldedSentences> </SentenceSplitterResult>
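The XML form can be read in a similar way with Python's standard `xml.etree.ElementTree` module. The sketch below assumes the element names and nesting shown in the sample above; the function name `parse_splitter_xml`, the keys of the returned dictionary, and the shortened `SAMPLE_XML` are illustrative choices, not part of the service.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the same shape as the XML result shown above.
SAMPLE_XML = (
    '<?xml version="1.0"?>'
    "<SentenceSplitterResult>"
    "<langCode>en</langCode>"
    "<corpusConfig>ncf</corpusConfig>"
    "<sentences>"
    "<sentence><token>We</token><token>are</token><token>met</token>"
    "<token>.</token></sentence>"
    "<sentence><token>It</token><token>is</token><token>fitting</token>"
    "<token>.</token></sentence>"
    "</sentences>"
    "<meldedSentences>"
    "<meldedSentence>We are met.</meldedSentence>"
    "<meldedSentence>It is fitting.</meldedSentence>"
    "</meldedSentences>"
    "</SentenceSplitterResult>"
)


def parse_splitter_xml(xml_text):
    """Extract language, corpus configuration, tokens, and melded sentences."""
    root = ET.fromstring(xml_text)  # root is <SentenceSplitterResult>
    tokens = [
        [token.text for token in sentence.findall("token")]
        for sentence in root.findall("sentences/sentence")
    ]
    melded = [m.text for m in root.findall("meldedSentences/meldedSentence")]
    return {
        "langCode": root.findtext("langCode"),
        "corpusConfig": root.findtext("corpusConfig"),
        "tokens": tokens,
        "melded": melded,
    }


if __name__ == "__main__":
    parsed = parse_splitter_xml(SAMPLE_XML)
    for number, sentence in enumerate(parsed["melded"], start=1):
        print(number, sentence)
```

Because the XML output repeats `sentence` and `meldedSentence` elements directly inside their containers, `findall` returns a list whether there is one sentence or many.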
<h3>4 sentences found.</h3> <table border="0"> <tr> <th align="left">S#</th> <th align="left">Sentence</th> </tr> <tr> <td valign="top" align="left"><strong>1</strong></td> <td valign="top" align="left">Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure.</td> </tr> <tr> <td valign="top" align="left"><strong>2</strong></td> <td valign="top" align="left">We are met on a great battle-field of that war.</td> </tr> <tr> <td valign="top" align="left"><strong>3</strong></td> <td valign="top" align="left">We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live.</td> </tr> <tr> <td valign="top" align="left"><strong>4</strong></td> <td valign="top" align="left">It is altogether fitting and proper that we should do this.</td> </tr> </table>
S# | Sentence |
---|---|
1 | Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure. |
2 | We are met on a great battle-field of that war. |
3 | We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. |
4 | It is altogether fitting and proper that we should do this. |
4 sentences found.
S# Sentence
1 Now we are engaged in a great civil war, testing whether that nation, or any nation, so conceived and so dedicated, can long endure.
2 We are met on a great battle-field of that war.
3 We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live.
4 It is altogether fitting and proper that we should do this.
Academic Technologies and Research Services,
NU Library 2East, 1970 Campus Drive Evanston, IL 60208. |