Northwestern University Information Technology
 
MorphAdorner Server Services: TEI XML Tokenizer Service

Service name: teitokenizer
Service description: Tokenize a TEI XML file.
HTTP methods allowed: POST, OPTIONS
POST accepts as input: application/x-www-form-urlencoded
HTTP return codes:
    200: service succeeded
    400: service failed with an error

Query parameters

    corpusConfig: Corpus configuration name. In the standard distribution these are ece (Eighteenth Century English), eme (Early Modern English), and ncf (Nineteenth Century Fiction).
    media: Result format. Only xml is allowed.
    teifile: TEI XML input file.
    resultsAsAttachedFile: true sends the results as an attached file; false sends the results as a data stream.
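
The parameters above can also be sent from a script rather than a browser form. The following is a minimal sketch, not the service's official client: it assumes the server accepts the same multipart/form-data encoding that the sample form uses, and SERVICE_URL, build_multipart, and tokenize_tei are hypothetical names introduced only for illustration. Only the Python standard library is used.

```python
import urllib.error
import urllib.request
import uuid

# Hypothetical address (assumption); substitute your own MorphAdorner server.
SERVICE_URL = "http://localhost:8080/teitokenizer"


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode text fields plus one file as multipart/form-data.

    Returns a tuple: (body bytes, Content-Type header value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The file part carries the TEI document itself.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/xml\r\n\r\n'.encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def tokenize_tei(tei_bytes, corpus_config="ncf", as_attachment=False,
                 url=SERVICE_URL):
    """POST a TEI file to the service and return the response body as bytes."""
    fields = {
        "corpusConfig": corpus_config,  # ece, eme, or ncf
        "media": "xml",                 # only xml is allowed
        "resultsAsAttachedFile": "true" if as_attachment else "false",
    }
    body, content_type = build_multipart(
        fields, "teifile", "input.xml", tei_bytes
    )
    request = urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": content_type},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()  # 200: service succeeded
    except urllib.error.HTTPError as err:
        # 400: service failed with an error
        raise RuntimeError(f"teitokenizer failed with HTTP {err.code}") from err
```

With a server running at the assumed address, a call might look like `tokenize_tei(open("mytext.xml", "rb").read(), corpus_config="eme")`, where mytext.xml is a placeholder file name.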

Sample POST form

<form accept-charset="UTF-8" method="post" action="teitokenizer"
      target="_blank"
      enctype="multipart/form-data" name="teitokenizer">
<table cellpadding="0" cellspacing="5">
<tr>
<td>
<strong>TEI XML file:</strong>
</td>
<td>
<input type="file" name="teifile" size="50" />
</td>
</tr>
<tr>
<td>
&nbsp;
</td>
<td>
&nbsp;
</td>
</tr>
<tr>
<td>&nbsp;</td>
<td>
<input type="checkbox" name="resultsAsAttachedFile" value="true"
       checked="checked"/>
Send results as attached file
</td>
</tr>
<tr>
<td valign="top">
<strong>
Lexicon:</strong>
</td>
<td>
<input type="radio" name="corpusConfig" value="eme" />Early Modern English<br />
<input type="radio" name="corpusConfig" value="ece" />Eighteenth Century English<br />
<input type="radio" name="corpusConfig" value="ncf" checked="checked" />Nineteenth Century Fiction
</td>
</tr>
<tr>
<td>
&nbsp;
</td>
<td>
&nbsp;
</td>
</tr>
<tr>
<td colspan="2">
<input type="submit" name="tokenize" value="Tokenize" />
</td>
</tr>
</table>
</form>

Output

The input TEI XML file is tokenized, and an xml:id attribute is added to each token. Each token is wrapped in either a word <w> element or a punctuation <pc> element. The output TEI XML is returned as an attached file if resultsAsAttachedFile is true, or as an XML stream if resultsAsAttachedFile is false.
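
To show how a client might consume this output, here is a minimal sketch that lists every <w> and <pc> token together with its xml:id. The sample document, its id values, and the helper names local_name and list_tokens are invented for illustration; real output from the service will differ in ids and structure. Element names are matched by local name so the sketch works whether or not the file declares the TEI namespace.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented sample (assumption); not actual output from the service.
SAMPLE_OUTPUT = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><p>
    <w xml:id="sample-w1">Hello</w><pc xml:id="sample-pc1">,</pc>
    <w xml:id="sample-w2">world</w><pc xml:id="sample-pc2">.</pc>
  </p></body></text>
</TEI>"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def list_tokens(xml_text):
    """Return (kind, xml_id, text) for each <w> and <pc> in document order."""
    root = ET.fromstring(xml_text)
    tokens = []
    for element in root.iter():
        kind = local_name(element.tag)
        if kind in ("w", "pc"):
            text = "".join(element.itertext())
            tokens.append((kind, element.get(XML_ID), text))
    return tokens


if __name__ == "__main__":
    for kind, token_id, text in list_tokens(SAMPLE_OUTPUT):
        print(kind, token_id, text)
```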
