Raw Content

This annotation type associates raw text content with an element; the content cannot carry any further annotation. It is used in the context of Gap Annotation.

Specification

Annotation Category:
 

Content Annotation

Declaration:

<rawcontent-annotation set="..."> (note: set is optional for this annotation type; if you declare this annotation type to be setless, you cannot assign classes)

Version History:
 

Since the beginning, but revised and made a proper annotation type in v2.0

Element:

<content>

API Class:

Content (FoLiApy API Reference)

Required Attributes:
 
Optional Attributes:
 
  • set – The set of the element, ideally a URI linking to a set definition (see Set Definitions (Vocabulary)) or otherwise a uniquely identifying string. The set must be referred to also in the Annotation Declarations for this annotation type.
  • class – The class of the annotation, i.e. the annotation tag in the vocabulary defined by set.
  • processor – This refers to the ID of a processor in the Provenance Data. The processor in turn defines exactly who or what was the annotator of the annotation.
  • annotator – This is an older alternative to the processor attribute, without support for full provenance. The annotator attribute simply refers to the name or ID of the system or human annotator that made the annotation.
  • annotatortype – This is an older alternative to the processor attribute, without support for full provenance. It is used together with annotator and specifies the type of the annotator, either manual for human annotators or auto for automated systems.
  • confidence – A floating point value between zero and one; expresses the confidence the annotator places in the annotation.
  • datetime – The date and time when this annotation was recorded, in the format YYYY-MM-DDThh:mm:ss (note the literal T in the middle to separate date from time), as per the XSD dateTime data type.
  • tag – Contains a space-separated list of processing tags associated with the element. A processing tag carries arbitrary user-defined information that may aid in processing a document. It may carry cues on how a specific tool should treat a specific element. The tag vocabulary is specific to the tool that processes the document. Tags carry no intrinsic meaning for the data representation and should not be used except to inform/aid processors in their task. Processors are encouraged to clean up the tags they use. Ideally, published FoLiA documents at the end of a processing pipeline carry no further tags. For encoding actual data, use class and optionally features instead.
Accepted Data:

<comment> (Comment Annotation), <desc> (Description Annotation)

Valid Context:

<gap> (Gap Annotation)

Explanation

The content element associates raw text content with an element; it is specifically used in the context of Gap Annotation. The content itself cannot carry any further annotations: apart from the raw text, only <comment> and <desc> are accepted inside it.
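The constraint on accepted data can be checked mechanically. The following is a minimal sketch using only Python's standard library, not the FoLiApy API; the helper names `local` and `check_content` are hypothetical, and the namespace is assumed to be the one used in the example in this section.

```python
import xml.etree.ElementTree as ET

# Assumption: the FoLiA namespace as it appears in this section's example.
FOLIA_NS = "http://ilk.uvt.nl/folia"
# Child elements listed under "Accepted Data" for <content>.
ALLOWED_CHILDREN = {"comment", "desc"}


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def check_content(gap):
    """Return a list of problems found in the <content> children of a <gap>."""
    problems = []
    for content in gap.findall(f"{{{FOLIA_NS}}}content"):
        for child in content:
            if local(child.tag) not in ALLOWED_CHILDREN:
                problems.append(
                    f"unexpected <{local(child.tag)}> inside <content>"
                )
    return problems


good = ET.fromstring(
    f'<gap xmlns="{FOLIA_NS}"><content>raw text</content></gap>'
)
bad = ET.fromstring(
    f'<gap xmlns="{FOLIA_NS}"><content><w>raw</w></content></gap>'
)

print(check_content(good))
print(check_content(bad))
```

This only inspects direct children of `<content>`; a full FoLiA validator checks considerably more (declarations, sets, valid context).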

Example

<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <metadata>
      <annotations>
          <text-annotation>
              <annotator processor="p1" />
          </text-annotation>
          <division-annotation set="https://raw.githubusercontent.com/LanguageMachines/uctodata/master/setdefinitions/divisions.foliaset.xml">
              <annotator processor="p1" />
          </division-annotation>
          <gap-annotation set="adhoc">
              <annotator processor="p1" />
          </gap-annotation>
          <rawcontent-annotation>
              <annotator processor="p1" />
          </rawcontent-annotation>
          <description-annotation>
              <annotator processor="p1" />
          </description-annotation>
          <paragraph-annotation>
              <annotator processor="p1" />
          </paragraph-annotation>
      </annotations>
      <provenance>
         <processor xml:id="p1" name="proycon" type="manual" />
      </provenance>
  </metadata>
  <text xml:id="example.text">
     <gap class="frontmatter">
        <desc>This is the cover of the book</desc>
        <content>
<![CDATA[

            SNOW WHITE AND THE SEVEN DWARFS


                by the Brothers Grimm

                    first edition


            Copyright(c) blah blah
]]>
        </content>
     </gap>
     <div xml:id="example.div.1" class="chapter" n="1">
         <t>In the <t-gap class="illegible" /> there was a princess...</t>
     </div>
  </text>
</FoLiA>
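
To read the raw content back out of such a document, a generic XML parser is enough. The sketch below uses Python's standard `xml.etree.ElementTree` rather than the FoLiApy API; the function name `gap_contents` and the shortened `sample` document are illustrative assumptions, not part of FoLiA.

```python
import xml.etree.ElementTree as ET

# Prefix map for the FoLiA namespace used in the example above.
NS = {"f": "http://ilk.uvt.nl/folia"}


def gap_contents(xml_text):
    """Yield (class, description, raw content) for each <gap> in the document."""
    root = ET.fromstring(xml_text)
    for gap in root.iterfind(".//f:gap", NS):
        desc = gap.findtext("f:desc", default="", namespaces=NS)
        # CDATA sections are returned as ordinary text by the parser.
        content = gap.findtext("f:content", default="", namespaces=NS)
        yield gap.get("class"), desc, content


# A shortened, hypothetical variant of the example document.
sample = """<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <text xml:id="example.text">
    <gap class="frontmatter">
      <desc>This is the cover of the book</desc>
      <content><![CDATA[
  by the Brothers Grimm
]]></content>
    </gap>
  </text>
</FoLiA>"""

for cls, desc, content in gap_contents(sample):
    print(cls, "|", desc, "|", content.strip())
```

The raw text keeps its original whitespace and line breaks; the call to `strip()` is only there to make the printed line compact.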