Alternative Annotation
This form of higher-order annotation encapsulates alternative annotations, i.e. annotations that are posed as an alternative option rather than as the authoritative chosen annotation.
Specification
Annotation Category: | Higher-order Annotation |
---|---|
Declaration: | <alternative-annotation> |
Version History: | Since the beginning, may carry set and classes since v2.0 |
Element: | <alt> |
API Class: | |
Required Attributes: | |
Optional Attributes: | |
Accepted Data: | |
Valid Context: | |

Element: | <altlayers> |
---|---|
API Class: | |
Required Attributes: | |
Optional Attributes: | |
Accepted Data: | |
Valid Context: | |
Introduction
The FoLiA format does not just allow for a single authoritative annotation per token; it allows the representation of alternative annotations. There is a specific form for Inline Annotation and a form for Span Annotation; both share the same declaration <alternative-annotation> with which a set may be associated.
Alternative Inline Annotation
Alternative inline annotations are grouped within one or more <alt> elements. If multiple annotations are grouped together under the same <alt> element, then they are deemed dependent and form a single set of alternatives.
Each alternative should preferably be given a unique identifier. The following example shows the Dutch word “bank” annotated authoritatively in the sense of a sofa, followed by two alternative annotations, each with a different sense and domain.
<w xml:id="example.p.1.s.1.w.1">
<t>bank</t>
<domain class="furniture" />
<sense class="r_n-5918" confidence="1.0">
<desc>furniture</desc>
</sense>
<alt xml:id="example.p.1.s.1.w.1.alt.1">
<domain class="finance" />
<sense class="r_n-5919" confidence="0.6">
<desc>financial institution</desc>
</sense>
</alt>
<alt xml:id="example.p.1.s.1.w.1.alt.2">
<domain class="geology" />
<sense class="r_n-5920" confidence="0.1">
<desc>river bank</desc>
</sense>
</alt>
</w>
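To illustrate how such a structure might be read programmatically, the following minimal sketch uses Python's standard `xml.etree.ElementTree` module rather than any FoLiA-specific library. The function name `sense_readings` and the dictionary layout are assumptions made for this illustration only. Like the fragments in this section, the snippet omits the FoLiA namespace; in a complete document the tag names would need to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the built-in XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

WORD = """
<w xml:id="example.p.1.s.1.w.1">
  <t>bank</t>
  <domain class="furniture" />
  <sense class="r_n-5918" confidence="1.0"><desc>furniture</desc></sense>
  <alt xml:id="example.p.1.s.1.w.1.alt.1">
    <domain class="finance" />
    <sense class="r_n-5919" confidence="0.6"><desc>financial institution</desc></sense>
  </alt>
  <alt xml:id="example.p.1.s.1.w.1.alt.2">
    <domain class="geology" />
    <sense class="r_n-5920" confidence="0.1"><desc>river bank</desc></sense>
  </alt>
</w>
"""


def reading(parent):
    """Collect the domain and sense found directly under `parent`."""
    domain = parent.find("domain")  # a bare tag name matches direct children only
    sense = parent.find("sense")
    confidence = sense.get("confidence") if sense is not None else None
    return {
        "id": parent.get(XML_ID),
        "domain": domain.get("class") if domain is not None else None,
        "sense": sense.get("class") if sense is not None else None,
        "confidence": float(confidence) if confidence is not None else None,
    }


def sense_readings(word):
    """Return the authoritative reading and the list of alternative readings."""
    authoritative = reading(word)
    alternatives = [reading(alt) for alt in word.findall("alt")]
    return authoritative, alternatives


word = ET.fromstring(WORD)
authoritative, alternatives = sense_readings(word)

# Rank the alternatives by confidence, treating a missing value as 0.0.
ranked = sorted(alternatives, key=lambda r: r["confidence"] or 0.0, reverse=True)
for r in ranked:
    print(r["id"], r["domain"], r["sense"], r["confidence"])
```

Because the authoritative annotations are direct children of <w> and the alternatives sit one level deeper inside <alt>, a lookup restricted to direct children is enough to keep the two apart.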
Sometimes, an alternative is concerned only with a portion of the annotations. By default, annotations not mentioned are applicable to the alternative as well, unless the alternative is set as being exclusive. Consider the following expanded example in which we added a part-of-speech tag and a lemma.
<w xml:id="example.p.1.s.1.w.1">
<t>bank</t>
<domain class="furniture" />
<sense class="r_n-5918" confidence="1.0">
<desc>furniture</desc>
</sense>
<pos class="n" />
<lemma class="bank" />
<alt xml:id="example.p.1.s.1.w.1.alt.1">
<domain class="finance" />
<sense class="r_n-5919" confidence="0.6">
<desc>financial institution</desc>
</sense>
</alt>
<alt xml:id="example.p.1.s.1.w.1.alt.2">
<domain class="geology" />
<sense class="r_n-5920" confidence="0.1">
<desc>river bank</desc>
</sense>
</alt>
<alt xml:id="example.p.1.s.1.w.1.alt.3" exclusive="yes">
<t>bank</t>
<domain class="navigation" />
<sense class="r_n-1234">
<desc>to turn</desc>
</sense>
<pos class="v" />
<lemma class="bank" />
</alt>
</w>
The first two alternatives are inclusive, which is the default. This means that the pos tag n and the lemma bank apply to them as well. The last alternative is set as exclusive, using the exclusive attribute. It has been given a different pos tag, and the lemma and even the text content have necessarily been repeated, even though they are equal to the higher-level annotation; otherwise there would be no lemma or text associated with the exclusive alternative.
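The difference between inclusive and exclusive alternatives can be sketched as a small merge rule. The code below is an illustration using Python's standard `xml.etree.ElementTree`, not the behaviour of any particular FoLiA library; the helper names `annotations` and `effective_annotations` are hypothetical, and the fragment is shortened to one inclusive and one exclusive alternative without identifiers or the FoLiA namespace.

```python
import xml.etree.ElementTree as ET

WORD = """
<w xml:id="example.p.1.s.1.w.1">
  <t>bank</t>
  <domain class="furniture" />
  <sense class="r_n-5918" confidence="1.0"><desc>furniture</desc></sense>
  <pos class="n" />
  <lemma class="bank" />
  <alt>
    <domain class="finance" />
    <sense class="r_n-5919" confidence="0.6"><desc>financial institution</desc></sense>
  </alt>
  <alt exclusive="yes">
    <t>bank</t>
    <domain class="navigation" />
    <sense class="r_n-1234"><desc>to turn</desc></sense>
    <pos class="v" />
    <lemma class="bank" />
  </alt>
</w>
"""


def annotations(element):
    """Map each annotation type directly under `element` to its class.

    Text content (<t>) is mapped to its text; nested <alt> elements are skipped.
    """
    result = {}
    for child in element:
        if child.tag == "alt":
            continue
        result[child.tag] = child.text if child.tag == "t" else child.get("class")
    return result


def effective_annotations(word, alt):
    """Return the annotations that hold under one alternative.

    Inclusive (default): unmentioned annotation types are taken from the word.
    Exclusive: only what the alternative itself states applies.
    """
    own = annotations(alt)
    if alt.get("exclusive") == "yes":
        return own
    merged = annotations(word)
    merged.update(own)  # the alternative overrides the types it mentions
    return merged


word = ET.fromstring(WORD)
inclusive_alt, exclusive_alt = word.findall("alt")
inclusive = effective_annotations(word, inclusive_alt)
exclusive = effective_annotations(word, exclusive_alt)
print(inclusive)
print(exclusive)
```

Under this rule the inclusive alternative picks up the text, pos tag and lemma of the word, whereas the exclusive alternative carries only what it repeats itself, which is why its text and lemma had to be restated.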
Because alternatives are non-authoritative, they are a good way of postponing the actual annotation decision. When used in this way, they can be regarded as options. They can be used even when there is no authoritative annotation of that type. Consider the following example, in which domain and sense annotations are presented as alternatives and there is no authoritative annotation of these types whatsoever:
<w xml:id="example.p.1.s.1.w.1">
<t>bank</t>
<alt xml:id="example.p.1.s.1.w.1.alt.1">
<domain class="finance" />
<sense class="r_n-5919" confidence="0.6">
<desc>financial institution</desc>
</sense>
</alt>
<alt xml:id="example.p.1.s.1.w.1.alt.2">
<domain class="geology" />
<sense class="r_n-5920" confidence="0.1">
<desc>river bank</desc>
</sense>
</alt>
</w>
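A tool that treats alternatives as pending options may want to know which annotation types still lack an authoritative choice. The following sketch, again using only Python's standard `xml.etree.ElementTree` and a hypothetical helper name `pending_types`, reports the types that occur solely inside <alt> elements.

```python
import xml.etree.ElementTree as ET

WORD = """
<w xml:id="example.p.1.s.1.w.1">
  <t>bank</t>
  <alt xml:id="example.p.1.s.1.w.1.alt.1">
    <domain class="finance" />
    <sense class="r_n-5919" confidence="0.6"><desc>financial institution</desc></sense>
  </alt>
  <alt xml:id="example.p.1.s.1.w.1.alt.2">
    <domain class="geology" />
    <sense class="r_n-5920" confidence="0.1"><desc>river bank</desc></sense>
  </alt>
</w>
"""


def pending_types(word):
    """Return annotation types offered as alternatives but never chosen."""
    # Types annotated authoritatively, i.e. directly under the word.
    authoritative = {child.tag for child in word if child.tag != "alt"}
    # Types that appear inside any alternative.
    offered = {child.tag for alt in word.findall("alt") for child in alt}
    return offered - authoritative


word = ET.fromstring(WORD)
pending = pending_types(word)
print(sorted(pending))
```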
Alternative Span Annotation
With inline annotations one can specify an unbounded number of alternative annotations. This functionality is available for Span Annotation as well, but due to the different nature of span annotations this happens in a slightly different way.
Where we used <alt> for token annotations, we now use <altlayers> for span annotations. Under this element several alternative layers can be presented. Analogous to <alt>, any layers grouped together are assumed to be somehow dependent. Multiple <altlayers> can be added to introduce independent alternatives. Each alternative may be associated with a unique identifier.
Below is an example of a sentence that is chunked in two ways:
<s xml:id="example.p.1.s.1">
<t>The Dalai Lama greeted him.</t>
<w xml:id="example.p.1.s.1.w.1"><t>The</t></w>
<w xml:id="example.p.1.s.1.w.2"><t>Dalai</t></w>
<w xml:id="example.p.1.s.1.w.3"><t>Lama</t></w>
<w xml:id="example.p.1.s.1.w.4"><t>greeted</t></w>
<w xml:id="example.p.1.s.1.w.5"><t>him</t></w>
<w xml:id="example.p.1.s.1.w.6"><t>.</t></w>
<chunking>
<chunk xml:id="example.p.1.s.1.chunk.1">
<wref id="example.p.1.s.1.w.1" t="The" />
<wref id="example.p.1.s.1.w.2" t="Dalai" />
<wref id="example.p.1.s.1.w.3" t="Lama" />
</chunk>
<chunk xml:id="example.p.1.s.1.chunk.2">
<wref id="example.p.1.s.1.w.4" t="greeted" />
</chunk>
<chunk xml:id="example.p.1.s.1.chunk.3">
<wref id="example.p.1.s.1.w.5" t="him" />
<wref id="example.p.1.s.1.w.6" t="." />
</chunk>
</chunking>
<altlayers xml:id="example.p.1.s.1.alt.1">
<chunking>
<chunk xml:id="example.p.1.s.1.alt.1.chunk.1" confidence="0.001">
<wref id="example.p.1.s.1.w.1" t="The" />
<wref id="example.p.1.s.1.w.2" t="Dalai" />
</chunk>
<chunk xml:id="example.p.1.s.1.alt.1.chunk.2" confidence="0.001">
<wref id="example.p.1.s.1.w.3" t="Lama" />
<wref id="example.p.1.s.1.w.4" t="greeted" />
</chunk>
<chunk xml:id="example.p.1.s.1.alt.1.chunk.3" confidence="0.001">
<wref id="example.p.1.s.1.w.5" t="him" />
<wref id="example.p.1.s.1.w.6" t="." />
</chunk>
</chunking>
</altlayers>
</s>
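As a rough illustration of how the authoritative layer and the alternative layers can be read side by side, the sketch below resolves each <wref> to the text of the word it points at. It relies only on Python's standard `xml.etree.ElementTree`; the function name `chunkings` and the returned structure are assumptions for this example, the optional t attributes on <wref> are left out for brevity, and the FoLiA namespace is omitted as in the fragments here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the built-in XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SENTENCE = """
<s xml:id="example.p.1.s.1">
  <t>The Dalai Lama greeted him.</t>
  <w xml:id="example.p.1.s.1.w.1"><t>The</t></w>
  <w xml:id="example.p.1.s.1.w.2"><t>Dalai</t></w>
  <w xml:id="example.p.1.s.1.w.3"><t>Lama</t></w>
  <w xml:id="example.p.1.s.1.w.4"><t>greeted</t></w>
  <w xml:id="example.p.1.s.1.w.5"><t>him</t></w>
  <w xml:id="example.p.1.s.1.w.6"><t>.</t></w>
  <chunking>
    <chunk xml:id="example.p.1.s.1.chunk.1">
      <wref id="example.p.1.s.1.w.1" />
      <wref id="example.p.1.s.1.w.2" />
      <wref id="example.p.1.s.1.w.3" />
    </chunk>
    <chunk xml:id="example.p.1.s.1.chunk.2">
      <wref id="example.p.1.s.1.w.4" />
    </chunk>
    <chunk xml:id="example.p.1.s.1.chunk.3">
      <wref id="example.p.1.s.1.w.5" />
      <wref id="example.p.1.s.1.w.6" />
    </chunk>
  </chunking>
  <altlayers xml:id="example.p.1.s.1.alt.1">
    <chunking>
      <chunk xml:id="example.p.1.s.1.alt.1.chunk.1" confidence="0.001">
        <wref id="example.p.1.s.1.w.1" />
        <wref id="example.p.1.s.1.w.2" />
      </chunk>
      <chunk xml:id="example.p.1.s.1.alt.1.chunk.2" confidence="0.001">
        <wref id="example.p.1.s.1.w.3" />
        <wref id="example.p.1.s.1.w.4" />
      </chunk>
      <chunk xml:id="example.p.1.s.1.alt.1.chunk.3" confidence="0.001">
        <wref id="example.p.1.s.1.w.5" />
        <wref id="example.p.1.s.1.w.6" />
      </chunk>
    </chunking>
  </altlayers>
</s>
"""


def chunkings(sentence):
    """Return (authoritative, alternatives) chunkings as lists of word texts."""
    # Stand-off references are resolved through an id -> text lookup table.
    words = {w.get(XML_ID): w.findtext("t") for w in sentence.findall("w")}

    def read(layer):
        return [
            [words[ref.get("id")] for ref in chunk.findall("wref")]
            for chunk in layer.findall("chunk")
        ]

    # Layers directly under <s> are authoritative; those in <altlayers> are not.
    authoritative = [read(layer) for layer in sentence.findall("chunking")]
    alternatives = {
        group.get(XML_ID): [read(layer) for layer in group.findall("chunking")]
        for group in sentence.findall("altlayers")
    }
    return authoritative, alternatives


sentence = ET.fromstring(SENTENCE)
authoritative, alternatives = chunkings(sentence)
print(authoritative)
print(alternatives)
```

Because every chunk merely references words by identifier, both chunkings can coexist over the same tokens without duplicating or restructuring the <w> elements.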
Support for alternatives, together with the fact that multiple layers (including those of different types) cannot be nested in a single inline structure, should make clear why FoLiA uses a stand-off notation alongside an inline notation.