Comparing Adorned Files

CompareAdornedFiles compares two adorned files and writes a change log of the token-based differences between them.


compareadornedfiles oldadorned.xml newadorned.xml diffs.xml


  • oldadorned.xml is the "old" adorned TEI XML file.

  • newadorned.xml is the "new" (modified) version of the adorned file.

  • diffs.xml is the name of the file that receives the change log of token-based differences from the old to the new adorned file.
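
The command line above can be wrapped in a small script when many file pairs need comparing. The following is a minimal Python sketch, not part of MorphAdorner. It assumes that a launcher named compareadornedfiles is available on the PATH; the helper names build_compare_command and run_compare are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_compare_command(old_adorned, new_adorned, diffs):
    """Return the argument list in the order: old file, new file, change log."""
    return ["compareadornedfiles", str(old_adorned), str(new_adorned), str(diffs)]


def run_compare(old_adorned, new_adorned, diffs):
    """Run the comparison, failing early if an input file is missing."""
    for path in (old_adorned, new_adorned):
        if not Path(path).is_file():
            raise FileNotFoundError(f"adorned file not found: {path}")
    # Assumption: the launcher script is installed and on the PATH.
    if shutil.which("compareadornedfiles") is None:
        raise RuntimeError("compareadornedfiles launcher not found on PATH")
    subprocess.run(build_compare_command(old_adorned, new_adorned, diffs), check=True)
```

Calling run_compare("oldadorned.xml", "newadorned.xml", "diffs.xml") would then correspond to the command line shown above.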

Change log file format

CompareAdornedFiles uses a simple XML format to hold a list of token-based changes. The file has the following format.

  <changeTime>The time the change file was created.</changeTime>
  <changeDescription>A description of the changes.</changeDescription>
  <change>
     <id>xml:id of token to be changed.</id>
     <changeType>addition, modification, or deletion.</changeType>
     <fieldType>Type of field to change: text or attribute.</fieldType>
     <oldValue>Old field value.</oldValue>
     <newValue>New field value.</newValue>
     <siblingID>xml:id of sibling word for a word being added.</siblingID>
     <blankPrecedes>true if blank precedes the token, else false.</blankPrecedes>
  </change>
  (more <change> entries)
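
To illustrate how the fields listed above might be read by a program, here is a minimal Python sketch using the standard xml.etree.ElementTree module. It is not MorphAdorner code. It assumes each entry is wrapped in a change element, and it searches for those elements at any depth so that it does not depend on the names of the enclosing elements. The root element name changeLog, the token id w1, and the values in SAMPLE are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Text fields of one change entry, as listed in the format description.
CHANGE_FIELDS = ("id", "changeType", "fieldType", "oldValue", "newValue", "siblingID")


def parse_changes(xml_text):
    """Return one dict per <change> element found anywhere in the document."""
    root = ET.fromstring(xml_text)
    changes = []
    for node in root.iter("change"):
        entry = {name: (node.findtext(name) or "").strip() for name in CHANGE_FIELDS}
        # blankPrecedes is documented as "true" or "false"; store it as a bool.
        entry["blankPrecedes"] = (node.findtext("blankPrecedes") or "").strip() == "true"
        changes.append(entry)
    return changes


# Hypothetical sample: the root name and all values are illustrative only.
SAMPLE = """<changeLog>
  <changeTime>time placeholder</changeTime>
  <changeDescription>description placeholder</changeDescription>
  <change>
    <id>w1</id>
    <changeType>modification</changeType>
    <fieldType>text</fieldType>
    <oldValue>\u017fo</oldValue>
    <newValue>so</newValue>
    <siblingID></siblingID>
    <blankPrecedes>true</blankPrecedes>
  </change>
</changeLog>"""

changes = parse_changes(SAMPLE)
```

Each resulting dict carries the seven fields of one entry, with empty elements read as empty strings.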

This simple XML change file allows an adorned file to be transformed into a corrected file using a utility in the MorphAdorner suite. The same change file can also be used to "untransform" the corrected version back to the uncorrected version. A likely use case for the change log is an edition that wants to use long 's' and other original spellings.
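
The reason one change file works in both directions is that each entry records both the old and the new value. The Python sketch below illustrates the idea under simplifying assumptions and is not how the MorphAdorner utility is implemented. Tokens are modeled as a plain mapping from xml:id to text, only text modifications are handled, and additions, deletions, and attribute changes are skipped. The function name apply_changes and the sample ids and values are hypothetical.

```python
def apply_changes(tokens, changes, reverse=False):
    """Return a new {xml:id: text} mapping with text modifications applied.

    With reverse=True the entries are undone in reverse order, swapping
    the roles of oldValue and newValue.
    """
    result = dict(tokens)
    ordered = list(reversed(changes)) if reverse else list(changes)
    for change in ordered:
        if change["changeType"] != "modification" or change["fieldType"] != "text":
            continue  # out of scope for this sketch
        source, target = change["oldValue"], change["newValue"]
        if reverse:
            source, target = target, source
        token_id = change["id"]
        # Refuse to apply a change to a token that is not in the expected state.
        if result.get(token_id) != source:
            raise ValueError(f"token {token_id!r} does not hold the expected value")
        result[token_id] = target
    return result


# Hypothetical tokens: w1 is spelled with a long s (U+017F) in the old file.
old_tokens = {"w1": "\u017fo", "w2": "far"}
sample_changes = [
    {"id": "w1", "changeType": "modification", "fieldType": "text",
     "oldValue": "\u017fo", "newValue": "so"},
]

corrected = apply_changes(old_tokens, sample_changes)
restored = apply_changes(corrected, sample_changes, reverse=True)
```

Here corrected holds the plain spelling for w1, and restored is equal to old_tokens again.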

Here is an example of a change log entry which records the replacement of a long "s" with a plain "s".

  <changeTime>2013-07-09 13:04:17.149 CDT</changeTime>
  <changeDescription>Changes from \tokenized\K000379.000.xml to \tokenized-no-word-breaks\K000379.000.xml as determined by CompareAdornedFiles.</changeDescription>

A change log may be used to transform one version of an adorned file into another using the UpdateAdornedFile utility.

