MorphAdorner Northwestern
 
Adding unclear attributes to words with gaps

AddUnclear adds a type="unclear" attribute to tokens containing character gaps in tokenized or adorned TEI XML files.

Usage:

addunclear outputdirectory input1.xml input2.xml ...

where

  • outputdirectory is the directory that receives the output XML files, in which tokens containing character gaps carry a type="unclear" attribute.

  • input*.xml are the input tokenized or adorned TEI XML files.

A character gap is marked by the Unicode black circle character (U+25CF, written \u25CF as an escape) appearing in a token.
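
The marking rule described above can be illustrated with a short Python sketch. This is not AddUnclear's actual implementation, only an approximation of the behavior under stated assumptions: tokens are assumed to be <w> and <pc> elements, the function name mark_unclear is hypothetical, and any existing type attribute on a matching token is simply overwritten.

```python
import xml.etree.ElementTree as ET

GAP_CHAR = "\u25CF"  # Unicode black circle, used to mark a character gap


def local_name(tag):
    """Strip a namespace prefix such as '{http://www.tei-c.org/ns/1.0}'."""
    return tag.rsplit("}", 1)[-1]


def mark_unclear(xml_text, token_tags=("w", "pc")):
    """Set type="unclear" on every token element whose text contains GAP_CHAR.

    token_tags is an assumption about which elements hold tokens.
    Returns the modified XML as a string and the number of tokens marked.
    """
    root = ET.fromstring(xml_text)
    marked = 0
    for elem in root.iter():
        if local_name(elem.tag) not in token_tags:
            continue
        # Join all nested text so gaps inside child elements are seen too.
        if GAP_CHAR in "".join(elem.itertext()):
            elem.set("type", "unclear")
            marked += 1
    return ET.tostring(root, encoding="unicode"), marked


if __name__ == "__main__":
    sample = "<text><w>lo\u25CFe</w><w>love</w></text>"
    result, count = mark_unclear(sample)
    print(count)
    print(result)
```

Only the first token in the sample contains the gap character, so only that token receives the attribute; the second token is left unchanged.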
