Package edu.northwestern.at.morphadorner.tools.adornedtotcf

AdornedToTCF04 converts one or more adorned files to the Text Corpus Format (TCF) v0.4 used by the CLARIN-D project.


Package edu.northwestern.at.morphadorner.tools.adornedtotcf Description


Usage:

adornedtotcf04 outputdirectory adorned1.xml adorned2.xml ...

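where outputdirectory is the directory that receives the converted TCF files, and adorned1.xml adorned2.xml ... are the adorned input files to convert.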

The Text Corpus Format (TCF) is used by the European CLARIN-D project to interchange corpora among different web-based services. TCF is an XML-based format that pairs a plain-text representation of a work with a series of annotation layers.
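The "plain text plus annotation layers" structure can be illustrated with a minimal Java sketch that writes a text element followed by a single token layer. This is not the AdornedToTCF04 implementation. The class name TcfSketch and the method buildTcf are hypothetical, and the element names, attribute names, and namespace URIs reflect one reading of TCF 0.4 that should be checked against the CLARIN-D specification. The metadata section that a complete TCF document carries is omitted here.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical illustration only; not part of MorphAdorner. */
public class TcfSketch {
    // Assumed namespace URIs for TCF 0.4; verify against the specification.
    static final String DATA_NS = "http://www.dspin.de/data";
    static final String CORPUS_NS = "http://www.dspin.de/data/textcorpus";

    /** Builds a TCF-like document holding the plain text and one token layer. */
    public static String buildTcf(List<String> tokens) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // Root element carrying the format version.
        Element root = doc.createElementNS(DATA_NS, "D-Spin");
        root.setAttribute("version", "0.4");
        doc.appendChild(root);

        Element corpus = doc.createElementNS(CORPUS_NS, "TextCorpus");
        corpus.setAttribute("lang", "en");
        root.appendChild(corpus);

        // The untagged text of the work (here simply the tokens joined by spaces).
        Element text = doc.createElementNS(CORPUS_NS, "text");
        text.setTextContent(String.join(" ", tokens));
        corpus.appendChild(text);

        // One annotation layer: each token gets an ID that other layers can reference.
        Element layer = doc.createElementNS(CORPUS_NS, "tokens");
        corpus.appendChild(layer);
        for (int i = 0; i < tokens.size(); i++) {
            Element token = doc.createElementNS(CORPUS_NS, "token");
            token.setAttribute("ID", "t" + (i + 1));
            token.setTextContent(tokens.get(i));
            layer.appendChild(token);
        }

        // Serialize the DOM tree to a string without an XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildTcf(List.of("Hello", "world")));
    }
}
```

Further layers, such as lemmas or part-of-speech tags, would follow the same pattern: each entry refers back to token IDs rather than repeating the text, which is what lets separate services add layers independently.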

AdornedToTCF04 converts one or more MorphAdorned TEI XML files to TCF. The text (without tags) is extracted and output, along with the following annotation layers: