AdornedToTCF04 converts one or more adorned files to the Text Corpus Format (TCF) v0.4 used by the CLARIN-D project.
Class | Description |
---|---|
AdornedToTCF04 | Converts adorned files to TCF 0.4 format. |
AdornedToTCF04.MyToken | |
Usage:
adornedtotcf04 outputdirectory adorned1.xml adorned2.xml ...
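where outputdirectory is the directory that receives the TCF output and adorned1.xml adorned2.xml ... are the adorned input files.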
The Text Corpus Format (TCF) is used by the European CLARIN-D project to allow interchange of corpora among web-based services. TCF is an XML-based format that consists of a plain text representation of a work along with a series of annotation layers.
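
To illustrate the "plain text plus annotation layers" shape described above, here is a minimal sketch that builds a TCF-like document with one text layer and one token layer. The class name TcfSketch is hypothetical, and the element and attribute names (TextCorpus, text, tokens, token, ID) are assumptions that should be checked against the TCF 0.4 schema; namespaces and metadata are omitted. This is not the code AdornedToTCF04 itself uses.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical sketch: a TCF-like document with a text layer and a token layer. */
public class TcfSketch {

    /** Builds the XML string; element names are assumed, not taken from the schema. */
    public static String build(String text, String[] tokens) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();

        // Root container for the text and all annotation layers.
        Element corpus = doc.createElement("TextCorpus");
        doc.appendChild(corpus);

        // Layer 1: the plain text of the work, without tags.
        Element textEl = doc.createElement("text");
        textEl.setTextContent(text);
        corpus.appendChild(textEl);

        // Layer 2: tokens, each given an ID so that other layers could refer to it.
        Element tokensEl = doc.createElement("tokens");
        corpus.appendChild(tokensEl);
        for (int i = 0; i < tokens.length; i++) {
            Element tok = doc.createElement("token");
            tok.setAttribute("ID", "t" + (i + 1));
            tok.setTextContent(tokens[i]);
            tokensEl.appendChild(tok);
        }

        // Serialize the DOM tree to a string without an XML declaration.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build("To be", new String[] {"To", "be"}));
    }
}
```

Giving each token an ID is the design point worth noting: it lets further layers annotate tokens by reference instead of repeating the text.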
AdornedToTCF04 converts one or more MorphAdorned TEI XML files to TCF format. The text (without tags) is extracted and output, along with the following annotation layers: