Description Annotation

This is a form of higher-order annotation that allows you to associate descriptions with almost all other annotation elements.

Specification

Annotation Category:
	Higher-order Annotation
Declaration:	`<description-annotation>` (note: there is never a set associated with this annotation type)
Version History:
	Since the beginning
Element:	`<desc>`
API Class:	`Description` (FoLiApy API Reference)
Required Attributes:

Optional Attributes:
	`xml:id` – The ID of the element; this has to be unique in the entire document or collection of documents (corpus). All identifiers in FoLiA are of the XML NCName datatype, which roughly means it is a unique string that has to start with a letter (not a number or symbol), may contain numbers, but may never contain colons or spaces. FoLiA does not define any naming convention for IDs.
	`processor` – This refers to the ID of a processor in the Provenance Data. The processor in turn defines exactly who or what was the annotator of the annotation.
	`annotator` – This is an older alternative to the `processor` attribute, without support for full provenance. The annotator attribute simply refers to the name or ID of the system or human annotator that made the annotation.
	`annotatortype` – This is an older alternative to the `processor` attribute, without support for full provenance. It is used together with `annotator` and specifies the type of the annotator, either `manual` for human annotators or `auto` for automated systems.
	`confidence` – A floating point value between zero and one; expresses the confidence the annotator places in the annotation.
	`datetime` – The date and time when this annotation was recorded. The format is `YYYY-MM-DDThh:mm:ss` (note the literal T in the middle to separate date from time), as per the XSD Datetime data type.
	`n` – A number in a sequence, corresponding to a number in the original document, for example chapter numbers, section numbers, list item numbers. This does not have to be an actual number; other sequence identifiers are also possible (think alphanumeric characters or roman numerals).
	`tag` – Contains a space-separated list of processing tags associated with the element. A processing tag carries arbitrary user-defined information that may aid in processing a document. It may carry cues on how a specific tool should treat a specific element. The tag vocabulary is specific to the tool that processes the document. Tags carry no intrinsic meaning for the data representation and should not be used except to inform/aid processors in their task. Processors are encouraged to clean up the tags they use. Ideally, published FoLiA documents at the end of a processing pipeline carry no further tags. For encoding actual data, use `class` and optionally features instead.
Accepted Data:	`<comment>` (Comment Annotation), `<desc>` (Description Annotation)
Valid Context:	`<alt>` (Alternative Annotation), `<altlayers>` (Alternative Annotation), `<chunk>` (Chunking), `<chunking>` (Chunking), `<comment>` (Comment Annotation), `<content>` (Raw Content), `<coreferencechain>` (Coreference Annotation), `<coreferences>` (Coreference Annotation), `<coreferencelink>` (Coreference Annotation), `<correction>` (Correction Annotation), `<current>` (Correction Annotation), `<def>` (Definition Annotation), `<dependencies>` (Dependency Annotation), `<dependency>` (Dependency Annotation), `<desc>` (Description Annotation), `<div>` (Division Annotation), `<domain>` (Domain/topic Annotation), `<entities>` (Entity Annotation), `<entity>` (Entity Annotation), `<entry>` (Entry Annotation), `<errordetection>` (Error Detection Annotation (DEPRECATED)), `<event>` (Event Annotation), `<ex>` (Example Annotation), `<external>` (External Annotation), `<figure>` (Figure Annotation), `<gap>` (Gap Annotation), `<head>` (Head Annotation), `<hiddenw>` (Hidden Token Annotation), `<t-hbr>` (Hyphenation), `<lang>` (Language Annotation), `<lemma>` (Lemmatisation), `<br>` (Linebreak), `<list>` (List Annotation), `<metric>` (Metric Annotation), `<modalities>` (Modality Annotation), `<modality>` (Modality Annotation), `<morpheme>` (Morphological Annotation), `<morphology>` (Morphological Annotation), `<new>` (Correction Annotation), `<note>` (Note Annotation), `<observation>` (Observation Annotation), `<observations>` (Observation Annotation), `<original>` (Correction Annotation), `<p>` (Paragraph Annotation), `<part>` (Part Annotation), `<ph>` (Phonetic Annotation/Content), `<phoneme>` (Phonological Annotation), `<phonology>` (Phonological Annotation), `<pos>` (Part-of-Speech Annotation), `<predicate>` (Predicate Annotation), `<quote>` (Quote Annotation), `<ref>` (Reference Annotation), `<relation>` (Relation Annotation), `<semrole>` (Semantic Role Annotation), `<semroles>` (Semantic Role Annotation), `<sense>` (Sense Annotation), `<s>` (Sentence 
Annotation), `<sentiment>` (Sentiment Annotation), `<sentiments>` (Sentiment Annotation), `<spanrelation>` (Span Relation Annotation), `<spanrelations>` (Span Relation Annotation), `<statement>` (Statement Annotation), `<statements>` (Statement Annotation), `<str>` (String Annotation), `<subjectivity>` (Subjectivity Annotation (DEPRECATED)), `<suggestion>` (Correction Annotation), `<su>` (Syntactic Annotation), `<syntax>` (Syntactic Annotation), `<table>` (Table Annotation), `<term>` (Term Annotation), `<t>` (Text Annotation), `<t-correction>` (Correction Annotation), `<t-error>` (Error Detection Annotation (DEPRECATED)), `<t-gap>` (Gap Annotation), `<t-hspace>` (Horizontal Whitespace), `<t-lang>` (Language Annotation), `<t-ref>` (Reference Annotation), `<t-str>` (String Annotation), `<t-style>` (Style Annotation), `<t-whitespace>` (Vertical Whitespace), `<timesegment>` (Time Segmentation), `<timing>` (Time Segmentation), `<utt>` (Utterance Annotation), `<whitespace>` (Vertical Whitespace), `<w>` (Token Annotation)

Explanation

This is one of the simplest forms of higher-order annotation. Any annotation element may hold a `<desc>` element containing in its body a human-readable description for the annotation. Only one description is allowed per annotation.
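The one-description-per-annotation rule can be checked mechanically. The following is a minimal sketch using only Python's standard library, not the FoLiApy API; the function name `elements_with_multiple_descriptions` and the `sample` fragment are hypothetical and exist only for illustration. It assumes the FoLiA namespace `http://ilk.uvt.nl/folia` as used in the example document.

```python
import xml.etree.ElementTree as ET

# Namespace as declared on the <FoLiA> root element of the example document.
FOLIA_NS = "http://ilk.uvt.nl/folia"
DESC = f"{{{FOLIA_NS}}}desc"


def elements_with_multiple_descriptions(xml_text):
    """Return the local tag names of elements with more than one direct <desc> child.

    Hypothetical helper for illustration; it is not part of any FoLiA library.
    """
    root = ET.fromstring(xml_text)
    offenders = []
    for element in root.iter():  # visits the root and all of its descendants
        # findall() with a plain tag matches direct children only
        if len(element.findall(DESC)) > 1:
            offenders.append(element.tag.split("}")[-1])
    return offenders


# A deliberately invalid fragment: two descriptions on a single annotation.
sample = """<sense xmlns="http://ilk.uvt.nl/folia" class="show%2:39:02::">
  <desc>first description</desc>
  <desc>second description</desc>
</sense>"""

print(elements_with_multiple_descriptions(sample))  # prints ['sense']
```

An empty list means no element in the fragment carries more than one description. This check covers only that single constraint and is not a substitute for full FoLiA validation.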

Example

Consider the following example in the context of Sense Annotation.

<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <metadata>
      <annotations>
          <token-annotation set="https://raw.githubusercontent.com/LanguageMachines/uctodata/master/setdefinitions/tokconfig-eng.foliaset.ttl">
              <annotator processor="p1" />
          </token-annotation>
          <text-annotation>
              <annotator processor="p1" />
          </text-annotation>
          <sentence-annotation>
              <annotator processor="p1" />
          </sentence-annotation>
          <paragraph-annotation>
              <annotator processor="p1" />
          </paragraph-annotation>
          <sense-annotation set="wordnet"> <!-- an ad-hoc set -->
              <annotator processor="p1" />
          </sense-annotation>
          <description-annotation>
              <annotator processor="p1" />
          </description-annotation>
      </annotations>
      <provenance>
         <processor xml:id="p1" name="proycon" type="manual" />
      </provenance>
  </metadata>
  <text xml:id="example.text">
    <p xml:id="example.p.1">
      <s xml:id="example.p.1.s.2">
         <t>I show an example.</t>
         <w xml:id="example.p.1.s.2.w.1" class="WORD">
            <t>I</t>
         </w>
         <w xml:id="example.p.1.s.2.w.2" class="WORD">
            <t>show</t>
            <sense class="show%2:39:02::">
                <desc>give an exhibition of to an interested audience</desc>
            </sense>
         </w>
         <w xml:id="example.p.1.s.2.w.3" class="WORD">
            <t>an</t>
         </w>
         <w xml:id="example.p.1.s.2.w.4" class="WORD" space="no">
            <t>example</t>
            <sense class="example%1:09:00::">
                <desc>an item of information that is typical of a class or group</desc>
            </sense>
         </w>
         <w xml:id="example.p.1.s.2.w.5" class="PUNCTUATION">
            <t>.</t>
         </w>
      </s>
    </p>
  </text>
</FoLiA>
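
The descriptions in the example above can be read back with any namespace-aware XML parser. The sketch below uses Python's standard library rather than the FoLiApy API, whose higher-level interface is not shown here. The function name `sense_descriptions` is hypothetical, and `fragment` is a trimmed copy of the sentence from the example so that the block is self-contained.

```python
import xml.etree.ElementTree as ET

# Prefix mapping for path expressions; the prefix "folia" is an arbitrary choice.
NS = {"folia": "http://ilk.uvt.nl/folia"}

# Trimmed copy of the example sentence (two tokens, one with a sense annotation).
fragment = """<s xmlns="http://ilk.uvt.nl/folia" xml:id="example.p.1.s.2">
  <w xml:id="example.p.1.s.2.w.2" class="WORD">
    <t>show</t>
    <sense class="show%2:39:02::">
      <desc>give an exhibition of to an interested audience</desc>
    </sense>
  </w>
  <w xml:id="example.p.1.s.2.w.3" class="WORD">
    <t>an</t>
  </w>
</s>"""


def sense_descriptions(xml_text):
    """Map token text to the <desc> of its sense annotation, if it has one.

    Hypothetical helper for illustration; tokens without a described
    sense annotation are skipped.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for word in root.iter("{http://ilk.uvt.nl/folia}w"):
        text = word.findtext("folia:t", namespaces=NS)
        desc = word.findtext("folia:sense/folia:desc", namespaces=NS)
        if desc is not None:
            result[text] = desc
    return result


print(sense_descriptions(fragment))
# prints {'show': 'give an exhibition of to an interested audience'}
```

Keying the result on token text is only convenient for this small fragment; for real documents, where the same word form can occur many times, keying on the `xml:id` of the `<w>` element would avoid collisions.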