Image to Interpretation: An Intelligent System to Aid Historians in Reading the Vindolanda Texts

Melissa Terras

Print publication date: 2006

Print ISBN-13: 9780199204557

Published to Oxford Scholarship Online: September 2007

DOI: 10.1093/acprof:oso/9780199204557.001.0001

APPENDIX B Annotation

Source: Image to Interpretation
Publisher: Oxford University Press

B.1. Encoding Scheme

The manual encoding scheme of additional tags is as follows. The tags are added, where necessary, to the ‘comments’ field of each annotated region.

These shortened tags were chosen for speed in annotation: it will be possible to expand these through simple textual transformation to make the comments noted for each individual stroke and character more human readable.

Direction (up, down, left, right) was judged from the central point of the individual character, on the base line. Assigning these directions was dependent on stroke movement away from this central point, as judged by the annotator.

Each character should have:

  • Letter identification (*)
  • Overall letter size (S)

Each stroke should have (in this order, separated by commas):

  • Stroke direction (D)
  • Stroke length (L)
  • Stroke width (W)
  • Place on line (P)

Each stroke meeting should have:

  • Angle (A)

These individual fields are expanded below.

Letter identification (*)
  Letter (*a, *b, etc.)
  If unidentified (*?)
  If expected, but not present (*!)

Overall letter size (S)
  Height (SH)
    Large (SHl)
    Average (SHa)
    Small (SHs)
  Width (SW)
    Large (SWl)
    Average (SWa)
    Small (SWs)

Direction of stroke (D)
  Straight (DS)
    Down Left (DSdl)
    Down Right (DSdr)
    Up Left (DSul)
    Up Right (DSur)
    Horizontal (DSh)
    Vertical (DSv)
  Curved (DC)
    Simple Curve (DCS)
      Down to Left (DCSdl)
        Curve Left (DCSdlcl)
        Curve Right (DCSdlcr)
      Down to Right (DCSdr)
        Curve Left (DCSdrcl)
        Curve Right (DCSdrcr)
      Up to Left (DCSul)
        Curve Left (DCSulcl)
        Curve Right (DCSulcr)
      Up to Right (DCSur)
        Curve Left (DCSurcl)
        Curve Right (DCSurcr)
    Double Curve (Wave) (DCW)
      Down to Left (DCWdl)
        Curve Left (DCWdlcl)
          Up (DCWdlclu)
          Down (DCWdlcld)
        Curve Right (DCWdlcr)
          Up (DCWdlcru)
          Down (DCWdlcrd)
      Down to Right (DCWdr)
        Curve Left (DCWdrcl)
          Up (DCWdrclu)
          Down (DCWdrcld)
        Curve Right (DCWdrcr)
          Up (DCWdrcru)
          Down (DCWdrcrd)
      Up to Left (DCWul)
        Curve Left (DCWulcl)
          Up (DCWulclu)
          Down (DCWulcld)
        Curve Right (DCWulcr)
          Up (DCWulcru)
          Down (DCWulcrd)
      Up to Right (DCWur)
        Curve Left (DCWurcl)
          Up (DCWurclu)
          Down (DCWurcld)
        Curve Right (DCWurcr)
          Up (DCWurcru)
          Down (DCWurcrd)
      Horizontal (DCWh)
        Curve Left (DCWhcl)
          Up (DCWhclu)
          Down (DCWhcld)
        Curve Right (DCWhcr)
          Up (DCWhcru)
          Down (DCWhcrd)
      Vertical (DCWv)
        Curve Left (DCWvcl)
          Up (DCWvclu)
          Down (DCWvcld)
        Curve Right (DCWvcr)
          Up (DCWvcru)
          Down (DCWvcrd)
  Loop (DL)
    To Left (DLl)
      Open (DLlo)
      Closed (DLlc)
    To Right (DLr)
      Open (DLro)
      Closed (DLrc)

Stroke Length (L)
  Comparative
    Short (Ls)
    Average (La)
    Long (Ll)

Stroke Width (W)
  Comparative
    Thin (Wt)
    Average (Wa)
    Wide (Ww)

Place on Line (P)
  Within Line Average (Pw)
  Descender (PD)
    Below Left (PDl)
    Below Right (PDr)
  Ascender (PA)
    Above Left (PAl)
    Above Right (PAr)

Stroke Meeting Angle (A)
  Open to Top (AT)
    Obtuse (ATo)
    Right (ATr)
    Acute (ATa)
  Open to Bottom (AB)
    Obtuse (ABo)
    Right (ABr)
    Acute (ABa)
  Open to Left (AL)
    Obtuse (ALo)
    Right (ALr)
    Acute (ALa)
  Open to Right (AR)
    Obtuse (ARo)
    Right (ARr)
    Acute (ARa)
  Crossing (AC)
    Right Angle (ACr)
    Compressed Vertical (ACv)
    Compressed Horizontal (ACh)
  Perpendicular (AP)
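
The 'simple textual transformation' mentioned at the start of this section could take many forms. The Python sketch below is one hypothetical approach, not the transformation actually used for the corpus: it matches each abbreviated tag against a pattern for its family of codes and builds a readable phrase from the parts. The function names, the rule table, and the wording of the expanded phrases are all assumptions made for illustration.

```python
import re

SIZE = {"l": "large", "a": "average", "s": "small"}
LENGTH = {"s": "short", "a": "average", "l": "long"}
WIDTH = {"t": "thin", "a": "average", "w": "wide"}
HEADING = {"dl": "down left", "dr": "down right", "ul": "up left",
           "ur": "up right", "h": "horizontal", "v": "vertical"}
SIDE = {"l": "left", "r": "right"}
VERTICAL = {"u": "up", "d": "down"}
LOOP = {"o": "open", "c": "closed"}
OPEN_TO = {"T": "top", "B": "bottom", "L": "left", "R": "right"}
ANGLE = {"o": "obtuse", "r": "right", "a": "acute"}
CROSSING = {"r": "right angle", "v": "compressed vertical",
            "h": "compressed horizontal"}
DIMENSION = {"H": "height", "W": "width"}
LINE = {"D": ("descender", "below"), "A": ("ascender", "above")}


def join(*parts):
    """Join the parts of a description, skipping absent optional levels."""
    return ", ".join(p for p in parts if p)


# One (pattern, formatter) rule per family of codes in the scheme above.
# The phrases produced are illustrative wording, not part of the scheme.
RULES = [
    (r"\*\?", lambda m: "letter: unidentified"),
    (r"\*!", lambda m: "letter: expected but not present"),
    (r"\*(.)", lambda m: f"letter: {m[1]}"),
    (r"S([HW])([las])", lambda m: f"letter {DIMENSION[m[1]]}: {SIZE[m[2]]}"),
    (r"DS(dl|dr|ul|ur|h|v)", lambda m: f"straight stroke, {HEADING[m[1]]}"),
    (r"DCS(dl|dr|ul|ur)(?:c([lr]))?",
     lambda m: join(f"simple curve, {HEADING[m[1]]}",
                    m[2] and f"curve {SIDE[m[2]]}")),
    (r"DCW(dl|dr|ul|ur|h|v)(?:c([lr])([ud])?)?",
     lambda m: join(f"double curve (wave), {HEADING[m[1]]}",
                    m[2] and f"curve {SIDE[m[2]]}",
                    m[3] and VERTICAL[m[3]])),
    (r"DL([lr])([oc])?",
     lambda m: join(f"loop to {SIDE[m[1]]}", m[2] and LOOP[m[2]])),
    (r"L([sal])", lambda m: f"stroke length: {LENGTH[m[1]]}"),
    (r"W([taw])", lambda m: f"stroke width: {WIDTH[m[1]]}"),
    (r"Pw", lambda m: "within line average"),
    (r"P([DA])([lr])?",
     lambda m: join(LINE[m[1]][0], m[2] and f"{LINE[m[1]][1]} {SIDE[m[2]]}")),
    (r"A([TBLR])([ora])?",
     lambda m: join(f"angle open to {OPEN_TO[m[1]]}", m[2] and ANGLE[m[2]])),
    (r"AC([rvh])?", lambda m: join("crossing", m[1] and CROSSING[m[1]])),
    (r"AP", lambda m: "perpendicular"),
]


def expand_tag(tag):
    """Return a readable description of one abbreviated tag."""
    for pattern, describe in RULES:
        match = re.fullmatch(pattern, tag)
        if match:
            return describe(match)
    return f"unrecognised tag: {tag}"


def expand_comments(comments):
    """Expand every tag in a comma-separated comments field."""
    return [expand_tag(t.strip()) for t in comments.split(",") if t.strip()]


print(expand_comments("DSdl, Ll, Wa, PDl"))
```

Matching whole tags against patterns, rather than listing every code in a lookup table, keeps the sketch short and means that a mistyped tag is reported as unrecognised instead of being silently passed through.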

B.2. File Format

Each annotation is stored in an XML file. The annotation file for each image contains a single GTAnnotations tag, which encapsulates all of the regions in the file and has the following attributes:

  • imageName: The name of the source image file that this file annotates.

  • author: The name of the last person to update the annotation file.

  • creationDate: The date and time that the annotation file was initially created.

  • modificationDate: The date and time that the annotation file was last modified.

Each individual region that is annotated is represented by a GTRegion tag which has the following attributes:

  • author: The name of the person who created or last modified this region.

  • regionType: An identifier (such as ‘R26’) that identifies the labelling of the region. The mapping of region types to human readable names is specified in the region dictionary file: these labels are explained below (S. B.3).

  • regionUID: A unique region identifier (such as ‘RGN93’) that names the region.

  • regionDate: The date and time that the region was created or last modified.

  • coordinates: A list of pairs of numbers that represent the co‐ordinates of points along the boundary of the region. The region is defined to be the area enclosed by drawing straight lines between these points.

  • comments: Additional tags manually added (S. B.1) to describe each stroke/region further.

(This text is an updated version of that found in Robertson 2001: Appendix C.1.)
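
As a rough illustration of how the list of co‐ordinates might be consumed, the Python sketch below pairs up the flat list of numbers into (x, y) points and computes the area enclosed by the straight-line boundary, using the standard shoelace formula. The function names are hypothetical and are not part of the annotation software; the sample string is the SE4 region from the example file given later in this section.

```python
def parse_points(coordinates):
    """Turn 'x1, y1, x2, y2, ...' into a list of (x, y) integer pairs."""
    values = [int(v) for v in coordinates.split(",")]
    if len(values) % 2:
        raise ValueError("coordinate list must contain an even number of values")
    return list(zip(values[0::2], values[1::2]))


def enclosed_area(points):
    """Area (in square pixels) of the polygon joining the points in order."""
    twice_area = 0
    # Pair each point with its successor, wrapping round to the first point.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2


se4 = parse_points("62, 354, 94, 354, 94, 386, 62, 386, 62, 354")
print(se4[0], enclosed_area(se4))
```

A region given by only two points, such as a stroke drawn as a single straight line, encloses no area under this definition, so the function returns zero for it.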

An example of a full XML encoding of a character is given below. It describes the letter S, as shown in Figure 3.16. A character box (ADCHAR0) is defined first; its co‐ordinates are given in pixels relative to the image named in the imageName attribute. Comments are added to this character box to specify that it is the letter S, and of a large height and width (*s, SHl, SWl). The two individual strokes are then identified (SO1, SO2: this numbering reflects the order in which the scribe is expected to have made the strokes, although this cannot be certain), and each is given an abstracted description of its direction, length, width, and place on the line in the comments field. Stroke ends are then identified: SE4 (stroke ending with a hook up to the left) and SE1 (stroke ending bluntly). Finally, the junction is marked (SMJ3: stroke meeting, a cross meet where the strokes cross each other slightly at the ends), with an extra comment to indicate that the meeting is open to the right at an obtuse angle (ARo). A full table of the region codes used is given in section B.3. The comments included in the file below correspond to the list above (S. B.1).

<GTAnnotations imageName="C:\grava\311l.tif" author="Melissa Terras" creationDate="09/10/02 15:23:58" modificationDate="09/10/02 15:27:30">
  <GTRegion author="Melissa Terras" regionType="ADCHAR0" regionUID="RGN0" regiondate="09/10/02 15:24:36" coordinates="338, 22, 298, 4, 186, 30, 106, 62, 94, 196, 36, 356, 46, 408, 140, 420, 184, 288, 198, 106, 282, 68, 338, 22" comments="*s, SHl, SWl"></GTRegion>
  <GTRegion author="Melissa Terras" regionType="SO1" regionUID="RGN1" regiondate="09/10/02 15:25:22" coordinates="170, 76, 162, 184, 152, 276, 124, 350, 102, 382, 74, 372" comments="DSdl, Ll, Wa, PDl"></GTRegion>
  <GTRegion author="Melissa Terras" regionType="SO2" regionUID="RGN2" regiondate="09/10/02 15:26:11" coordinates="146, 94, 286, 18" comments="DSur, La, Wa, PAr"></GTRegion>
  <GTRegion author="Melissa Terras" regionType="SE4" regionUID="RGN3" regiondate="09/10/02 15:26:38" coordinates="62, 354, 94, 354, 94, 386, 62, 386, 62, 354"></GTRegion>
  <GTRegion author="Melissa Terras" regionType="SE1" regionUID="RGN4" regiondate="09/10/02 15:26:44" coordinates="162, 62, 178, 62, 178, 88, 162, 88, 162, 62"></GTRegion>
  <GTRegion author="Melissa Terras" regionType="SE1" regionUID="RGN5" regiondate="09/10/02 15:26:50" coordinates="134, 80, 154, 80, 154, 104, 134, 104, 134, 80"></GTRegion>
  <GTRegion author="Melissa Terras" regionType="SE1" regionUID="RGN6" regiondate="09/10/02 15:26:56" coordinates="270, 14, 294, 14, 294, 36, 270, 36, 270, 14"></GTRegion>
  <GTRegion author="Melissa Terras" regionType="SMJ3" regionUID="RGN7" regiondate="09/10/02 15:27:12" coordinates="154, 70, 184, 70, 184, 98, 154, 98, 154, 70" comments="ARo"></GTRegion>
</GTAnnotations>
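
A file of this kind can be read with any standard XML parser, provided the attribute values are delimited with straight quotation marks. The Python sketch below uses the standard library's xml.etree.ElementTree on an abridged copy of the example (four of the eight regions). The function name read_regions and the dictionary keys are hypothetical choices for illustration; the attribute names are taken as they appear in the example above.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the example file. A raw string keeps the backslashes
# in the Windows path from being read as Python escape sequences.
SAMPLE = r"""<GTAnnotations imageName="C:\grava\311l.tif" author="Melissa Terras" creationDate="09/10/02 15:23:58" modificationDate="09/10/02 15:27:30">
<GTRegion author="Melissa Terras" regionType="ADCHAR0" regionUID="RGN0" regiondate="09/10/02 15:24:36" coordinates="338, 22, 298, 4, 186, 30, 106, 62, 94, 196, 36, 356, 46, 408, 140, 420, 184, 288, 198, 106, 282, 68, 338, 22" comments="*s, SHl, SWl"></GTRegion>
<GTRegion author="Melissa Terras" regionType="SO2" regionUID="RGN2" regiondate="09/10/02 15:26:11" coordinates="146, 94, 286, 18" comments="DSur, La, Wa, PAr"></GTRegion>
<GTRegion author="Melissa Terras" regionType="SE4" regionUID="RGN3" regiondate="09/10/02 15:26:38" coordinates="62, 354, 94, 354, 94, 386, 62, 386, 62, 354"></GTRegion>
<GTRegion author="Melissa Terras" regionType="SMJ3" regionUID="RGN7" regiondate="09/10/02 15:27:12" coordinates="154, 70, 184, 70, 184, 98, 154, 98, 154, 70" comments="ARo"></GTRegion>
</GTAnnotations>"""


def read_regions(xml_text):
    """Return the image name and a list of region dictionaries."""
    root = ET.fromstring(xml_text)
    regions = []
    for region in root.iter("GTRegion"):
        numbers = [int(n) for n in region.get("coordinates", "").split(",")
                   if n.strip()]
        # The comments attribute is optional (stroke ends carry none here).
        comments = region.get("comments", "")
        regions.append({
            "uid": region.get("regionUID"),
            "type": region.get("regionType"),
            "points": list(zip(numbers[0::2], numbers[1::2])),
            "tags": [t.strip() for t in comments.split(",") if t.strip()],
        })
    return root.get("imageName"), regions


image_name, regions = read_regions(SAMPLE)
for r in regions:
    print(r["uid"], r["type"], len(r["points"]), r["tags"])
```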

B.3. Region Type Identifiers

Each region is given a region type identifier, to specify whether it is a type of character, stroke, end point, or junction. These identifiers are specified in Table B.1.

Table B.1. Region identifier codes.

Identifier  Definition
ADCHAR0     Character Box
ADCHAR1     Space Character
ADCHAR2     Paragraph Character
ADCHAR3     Interpunct
SO1         Stroke - First Stroke
SO2         Stroke - Second Stroke
SO3         Stroke - Third Stroke
SO4         Stroke - Fourth Stroke
SO5         Stroke - Fifth Stroke
SO6         Stroke - Sixth Stroke
SO7         Stroke - Seventh Stroke
SE1         Stroke End - Blunt
SE2         Stroke End - Hook - Down Left
SE3         Stroke End - Hook - Down Right
SE4         Stroke End - Hook - Up Left
SE5         Stroke End - Hook - Up Right
SE6         Stroke End - Ligature - To Left - Down
SE7         Stroke End - Ligature - To Left - Up
SE8         Stroke End - Ligature - To Right - Down
SE9         Stroke End - Ligature - To Right - Up
SE10        Stroke End - Serif - To Left - Down
SE11        Stroke End - Serif - To Left - Up
SE12        Stroke End - Serif - To Right - Down
SE13        Stroke End - Serif - To Right - Up
SMJ1        Stroke Meeting - Close Meet
SMJ2        Stroke Meeting - Exact Meet
SMJ3        Stroke Meeting - Cross Meet
SMJ4        Stroke Meeting - Midpoint - Close Meet
SMJ5        Stroke Meeting - Midpoint - Exact Meet
SMJ6        Stroke Meeting - Midpoint - Cross Meet
SMJ7        Stroke Meeting - Crossing
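
For programmatic use, Table B.1 can be held as a simple lookup. The Python sketch below is a hypothetical illustration rather than the region dictionary file itself: it stores the definitions from the table and derives the broad kind of region (character, stroke, end point, or junction) from the prefix of the identifier. The names REGION_TYPES, KINDS, and describe_region_type are assumptions.

```python
# Definitions transcribed from Table B.1.
REGION_TYPES = {
    "ADCHAR0": "Character Box",
    "ADCHAR1": "Space Character",
    "ADCHAR2": "Paragraph Character",
    "ADCHAR3": "Interpunct",
    "SO1": "Stroke - First Stroke",
    "SO2": "Stroke - Second Stroke",
    "SO3": "Stroke - Third Stroke",
    "SO4": "Stroke - Fourth Stroke",
    "SO5": "Stroke - Fifth Stroke",
    "SO6": "Stroke - Sixth Stroke",
    "SO7": "Stroke - Seventh Stroke",
    "SE1": "Stroke End - Blunt",
    "SE2": "Stroke End - Hook - Down Left",
    "SE3": "Stroke End - Hook - Down Right",
    "SE4": "Stroke End - Hook - Up Left",
    "SE5": "Stroke End - Hook - Up Right",
    "SE6": "Stroke End - Ligature - To Left - Down",
    "SE7": "Stroke End - Ligature - To Left - Up",
    "SE8": "Stroke End - Ligature - To Right - Down",
    "SE9": "Stroke End - Ligature - To Right - Up",
    "SE10": "Stroke End - Serif - To Left - Down",
    "SE11": "Stroke End - Serif - To Left - Up",
    "SE12": "Stroke End - Serif - To Right - Down",
    "SE13": "Stroke End - Serif - To Right - Up",
    "SMJ1": "Stroke Meeting - Close Meet",
    "SMJ2": "Stroke Meeting - Exact Meet",
    "SMJ3": "Stroke Meeting - Cross Meet",
    "SMJ4": "Stroke Meeting - Midpoint - Close Meet",
    "SMJ5": "Stroke Meeting - Midpoint - Exact Meet",
    "SMJ6": "Stroke Meeting - Midpoint - Cross Meet",
    "SMJ7": "Stroke Meeting - Crossing",
}

# Identifier prefix -> broad kind of region.
KINDS = {"ADCHAR": "character", "SO": "stroke",
         "SE": "end point", "SMJ": "junction"}


def describe_region_type(identifier):
    """Return (kind, definition); raises KeyError for unknown identifiers."""
    definition = REGION_TYPES[identifier]
    kind = next(k for prefix, k in KINDS.items()
                if identifier.startswith(prefix))
    return kind, definition


print(describe_region_type("SMJ3"))
```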

B.4. Viewing the Annotated Corpus

The annotated corpus can be viewed using a Java‐enabled web browser (preferably Netscape Version 6.0 or above). These annotated images will eventually be presented on Vindolanda Tablets Online (hosted by the Centre for the Study of Ancient Documents, St Giles, Oxford; see http://vindolanda.csad.ox.ac.uk/). More work must be done on the user interface to make this a useful tool for papyrologists and palaeographers. Additionally, transformations will be applied to the XML files to expand the abbreviated comments into a more human readable form. The comments could then be used for further palaeographic analysis of the letter forms contained within the Vindolanda texts (see Terras and Robertson 2004: 409 for a demonstration of how a comparison of the comments can add to our understanding of the letter forms). This system has the potential to be a rich training and research set for palaeographers and papyrologists.

The user is presented with a list of all 110 annotated images (Figure B.1). The images of the documents are split into line by line sections, to allow for easier annotation. The last number in the name of each file indicates the line it represents in that document. Double clicking on one of these files displays the annotated section (Figure B.2). The annotations can be toggled on and off to compare the underlying image with the annotations (Figure B.3).

Figure B.1. Opening Vindolanda Image Corpus Screen, as seen through the browser.

Figure B.2. Annotated section of stylus tablet 974, line 12, showing ‘NUTR’ and the related features highlighted. See Figure 2.4 for an image of the complete text.

Figure B.3. Section of tablet shown above, with annotations toggled off. This allows the user to switch between their interpretation of the text, and that implied by the markup.