Relation Annotation
FoLiA provides a facility to relate arbitrary parts of your document to other parts of the same document, to parts of other FoLiA documents, or to external resources, including resources in other formats. It thus allows linking resources together. Within this context, the <xref> element is used to refer to the linked FoLiA elements.
Specification
Annotation Category: | |
---|---|
Declaration: | |
Version History: | Revised since v0.8, renamed from alignment in v2.0 |
Element: | |
API Class: | |
Required Attributes: | |
Optional Attributes: | |
Accepted Data: | |
Valid Context: | |
Explanation
Note
In versions of FoLiA prior to 2.0, this annotation type was called alignments.
FoLiA provides a facility to link parts of your document with other parts of your document, or even with parts of other FoLiA documents or external resources. These are called relations and are implemented using the <relation> element. Within this context, the <xref> element is used to cross-link to the related FoLiA elements.
Consider the following two aligned sentences, taken from two separate FoLiA documents in different languages:
<!-- Excerpt from doc-english.xml; the sentence ID matches the ID used by the French <xref> -->
<s xml:id="doc-english.p.1.s.1">
  <t>The Dalai Lama greeted him.</t>
  <relation class="french-translation" xlink:href="doc-french.xml" xlink:type="simple">
    <xref id="doc-french.p.1.s.1" t="Le Dalai Lama le saluait." type="s" />
  </relation>
</s>

<!-- Excerpt from doc-french.xml; the sentence ID matches the ID used by the English <xref> -->
<s xml:id="doc-french.p.1.s.1">
  <t>Le Dalai Lama le saluait.</t>
  <relation class="english-translation" xlink:href="doc-english.xml" xlink:type="simple">
    <xref id="doc-english.p.1.s.1" t="The Dalai Lama greeted him." type="s" />
  </relation>
  <relation class="dutch-translation" xlink:href="doc-dutch.xml" xlink:type="simple">
    <xref id="doc-dutch.p.1.s.1" t="De Dalai Lama begroette hem." type="s" />
  </relation>
</s>
It is the job of the <relation> element to point to the relevant resource, whereas the <xref> element points to a specific point inside the referenced resource. The xlink:href attribute is used to link to the target document, if any. If the relation is within the same document, the attribute should simply be omitted. The type attribute on <xref> specifies the type of element the relation points to, i.e. its value is equal to the tag name of the target element. The t attribute on the <xref> element is optional; this overhead is added simply to facilitate the job of limited FoLiA parsers, and it provides a quick reference to the target text for both parsers and human users.
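As a rough illustration of how these attributes fit together, the sketch below reads the relations of a sentence with Python's standard `xml.etree.ElementTree` module. It is generic XML traversal, not the API of any FoLiA library; the helper name `list_relations` and the trimmed sample fragment are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Trimmed sample for illustration; the namespaces are declared on the fragment itself.
SAMPLE = """
<s xmlns="http://ilk.uvt.nl/folia" xmlns:xlink="http://www.w3.org/1999/xlink">
  <t>The Dalai Lama greeted him.</t>
  <relation class="french-translation" xlink:href="doc-french.xml" xlink:type="simple">
    <xref id="doc-french.p.1.s.1" t="Le Dalai Lama le saluait." type="s" />
  </relation>
</s>
"""

def list_relations(element):
    """Return one dict per <xref> found in the direct <relation> children (hypothetical helper)."""
    results = []
    for relation in element.findall(f"{{{FOLIA_NS}}}relation"):
        # A missing xlink:href (None) means the target is in the same document.
        href = relation.get(f"{{{XLINK_NS}}}href")
        for xref in relation.findall(f"{{{FOLIA_NS}}}xref"):
            results.append({
                "class": relation.get("class"),
                "document": href,
                "target_id": xref.get("id"),
                "target_type": xref.get("type"),
                "text": xref.get("t"),  # optional attribute, may be None
            })
    return results

relations = list_relations(ET.fromstring(SAMPLE))
for rel in relations:
    print(rel)
```

The relation supplies the document (`xlink:href`) and each `<xref>` supplies the location inside it, so a consumer needs both to resolve a target.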
Although each relation in the above example has a single reference (<xref>), you may specify multiple references within one <relation> block, effectively referring to a span of multiple elements at the target.
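To make the multi-reference case concrete, here is a minimal sketch in which one relation refers to two target words. The fragment, the word IDs in the French document, and the helper `target_span` are hypothetical and only serve the illustration; parsing again uses the standard library rather than a FoLiA-specific API.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"

# Hypothetical fragment: one English word related to a span of two French words.
SAMPLE = """
<w xmlns="http://ilk.uvt.nl/folia" xmlns:xlink="http://www.w3.org/1999/xlink">
  <t>greeted</t>
  <relation class="french-translation" xlink:href="doc-french.xml" xlink:type="simple">
    <xref id="doc-french.p.1.s.1.w.4" t="le" type="w" />
    <xref id="doc-french.p.1.s.1.w.5" t="saluait" type="w" />
  </relation>
</w>
"""

def target_span(relation):
    """Return the ids of all <xref> children of a relation, in document order."""
    return [xref.get("id") for xref in relation.findall(f"{{{FOLIA_NS}}}xref")]

word = ET.fromstring(SAMPLE)
relation = word.find(f"{{{FOLIA_NS}}}relation")
span = target_span(relation)
print(span)
```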
By default, relations are between FoLiA documents. It is, however, also possible to point to resources in different formats. This has to be made explicit using the format attribute on the <relation> element.
The value of the format attribute is a MIME type and defaults to text/folia+xml (naming follows RFC 3023). In the following example, we align a section (<div>) with the original HTML document from which the FoLiA document is derived, and where the section is expressed with an HTML anchor (<a>) tag.
<div class="section">
  <t>lorum ipsum etc.</t>
  <relation class="original" xlink:href="http://somewhere/original.html"
            xlink:type="simple" format="text/html">
    <xref id="section2" type="a" />
  </relation>
</div>
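A consumer of such relations would typically need to apply the default format itself when the attribute is absent. The following sketch shows one way to do that with the standard library; the second `<relation>` in the sample (without a format attribute) and the helper `relation_format` are assumptions added for the illustration.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"
DEFAULT_FORMAT = "text/folia+xml"  # default named in the text above

# The second <relation> is a hypothetical addition that carries no format attribute.
SAMPLE = """
<div xmlns="http://ilk.uvt.nl/folia" xmlns:xlink="http://www.w3.org/1999/xlink" class="section">
  <t>lorum ipsum etc.</t>
  <relation class="original" xlink:href="http://somewhere/original.html"
            xlink:type="simple" format="text/html">
    <xref id="section2" type="a" />
  </relation>
  <relation class="french-translation" xlink:href="doc-french.xml" xlink:type="simple">
    <xref id="doc-french.div.1" type="div" />
  </relation>
</div>
"""

def relation_format(relation):
    """Return the MIME type of the relation target, falling back to the default."""
    return relation.get("format", DEFAULT_FORMAT)

div = ET.fromstring(SAMPLE)
formats = {
    rel.get("class"): relation_format(rel)
    for rel in div.findall(f"{{{FOLIA_NS}}}relation")
}
print(formats)
```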
See also
For more complex many-to-many relations, see Span Relation Annotation, an extension of the current annotation type.
Translations
Relation Annotation and Span Relation Annotation are excellent tools for specifying translations. For situations in which relations seem overkill, a simple multi-document mechanism is available. This mechanism is based purely on convention: it assumes that structural elements that are translations of each other simply share the same ID. This approach is quite feasible when used on higher-level structural elements, such as divisions, paragraphs, events or entries.
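The shared-ID convention can be sketched as follows. Both documents, their IDs, and the helper `text_by_id` are hypothetical; the code only demonstrates pairing elements that carry the same `xml:id` in two separately parsed documents, using the standard library.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Two hypothetical documents whose paragraphs share IDs by convention.
ENGLISH = """
<text xmlns="http://ilk.uvt.nl/folia" xml:id="doc.text">
  <p xml:id="doc.p.1"><t>The Dalai Lama greeted him.</t></p>
</text>
"""
FRENCH = """
<text xmlns="http://ilk.uvt.nl/folia" xml:id="doc.text">
  <p xml:id="doc.p.1"><t>Le Dalai Lama le saluait.</t></p>
</text>
"""

def text_by_id(root, tag="p"):
    """Map the xml:id of each element with the given tag to its text content."""
    mapping = {}
    for element in root.iter(f"{{{FOLIA_NS}}}{tag}"):
        text = element.find(f"{{{FOLIA_NS}}}t")
        if XML_ID in element.attrib and text is not None:
            mapping[element.get(XML_ID)] = text.text
    return mapping

english = text_by_id(ET.fromstring(ENGLISH))
french = text_by_id(ET.fromstring(FRENCH))

# Elements present in both documents under the same ID are treated as translations.
pairs = {key: (english[key], french[key]) for key in english.keys() & french.keys()}
print(pairs)
```

No `<relation>` elements are involved here; the pairing relies entirely on both documents keeping their IDs in sync.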
Example
The following example shows Entity Annotation with relations to Wikipedia.
<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" xmlns:xlink="http://www.w3.org/1999/xlink" version="2.0" xml:id="example">
  <metadata>
    <annotations>
      <token-annotation set="https://raw.githubusercontent.com/LanguageMachines/uctodata/master/setdefinitions/tokconfig-eng.foliaset.ttl" format="text/turtle">
        <annotator processor="p1" />
      </token-annotation>
      <text-annotation>
        <annotator processor="p1" />
      </text-annotation>
      <sentence-annotation>
        <annotator processor="p1" />
      </sentence-annotation>
      <paragraph-annotation>
        <annotator processor="p1" />
      </paragraph-annotation>
      <entity-annotation set="https://raw.githubusercontent.com/proycon/folia/master/setdefinitions/namedentities.foliaset.ttl" format="text/turtle">
        <annotator processor="p1" />
      </entity-annotation>
      <relation-annotation set="adhoc">
        <annotator processor="p1" />
      </relation-annotation>
    </annotations>
    <provenance>
      <processor xml:id="p1" name="proycon" type="manual" />
    </provenance>
  </metadata>
  <text xml:id="example.text">
    <p xml:id="example.p.1">
      <s xml:id="example.p.1.s.1">
        <t>The Dalai Lama currently lives in Dharamsala in India.</t>
        <w xml:id="example.p.1.s.1.w.1" class="WORD">
          <t>The</t>
        </w>
        <w xml:id="example.p.1.s.1.w.2" class="WORD">
          <t>Dalai</t>
        </w>
        <w xml:id="example.p.1.s.1.w.3" class="WORD">
          <t>Lama</t>
        </w>
        <w xml:id="example.p.1.s.1.w.4" class="WORD">
          <t>currently</t>
        </w>
        <w xml:id="example.p.1.s.1.w.5" class="WORD">
          <t>lives</t>
        </w>
        <w xml:id="example.p.1.s.1.w.6" class="WORD">
          <t>in</t>
        </w>
        <w xml:id="example.p.1.s.1.w.7" class="WORD">
          <t>Dharamsala</t>
        </w>
        <w xml:id="example.p.1.s.1.w.8" class="WORD">
          <t>in</t>
        </w>
        <w xml:id="example.p.1.s.1.w.9" class="WORD" space="no">
          <t>India</t>
        </w>
        <w xml:id="example.p.1.s.1.w.10" class="PUNCTUATION">
          <t>.</t>
        </w>
        <entities>
          <entity xml:id="example.p.1.s.1.entity.1" class="per">
            <relation class="wikipedia" xlink:href="https://en.wikipedia.org/wiki/Dalai_Lama" xlink:type="simple" format="text/html" />
            <wref id="example.p.1.s.1.w.2" t="Dalai" />
            <wref id="example.p.1.s.1.w.3" t="Lama" />
          </entity>
          <entity xml:id="example.p.1.s.1.entity.2" class="loc.city">
            <relation class="wikipedia" xlink:href="https://en.wikipedia.org/wiki/Dharamsala" xlink:type="simple" format="text/html" />
            <wref id="example.p.1.s.1.w.7" t="Dharamsala" />
          </entity>
          <entity xml:id="example.p.1.s.1.entity.3" class="loc.country">
            <relation class="wikipedia" xlink:href="https://en.wikipedia.org/wiki/India" xlink:type="simple" format="text/html" />
            <wref id="example.p.1.s.1.w.9" t="India" />
          </entity>
        </entities>
      </s>
    </p>
  </text>
</FoLiA>
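As a sketch of how a consumer might use such entity relations, the code below collects the Wikipedia link for each entity. The sample is trimmed to the entity layer of the example above, the helper `entity_links` is hypothetical, and the entity label is built from the optional `t` attributes of the `<wref>` elements, which assumes those attributes are present.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Trimmed to two entities from the example above.
SAMPLE = """
<entities xmlns="http://ilk.uvt.nl/folia" xmlns:xlink="http://www.w3.org/1999/xlink">
  <entity xml:id="example.p.1.s.1.entity.1" class="per">
    <relation class="wikipedia" xlink:href="https://en.wikipedia.org/wiki/Dalai_Lama" xlink:type="simple" format="text/html" />
    <wref id="example.p.1.s.1.w.2" t="Dalai" />
    <wref id="example.p.1.s.1.w.3" t="Lama" />
  </entity>
  <entity xml:id="example.p.1.s.1.entity.2" class="loc.city">
    <relation class="wikipedia" xlink:href="https://en.wikipedia.org/wiki/Dharamsala" xlink:type="simple" format="text/html" />
    <wref id="example.p.1.s.1.w.7" t="Dharamsala" />
  </entity>
</entities>
"""

def entity_links(root, relation_class="wikipedia"):
    """Map each entity's text (joined from its wref t attributes) to the related URL."""
    links = {}
    for entity in root.iter(f"{{{FOLIA_NS}}}entity"):
        label = " ".join(w.get("t", "") for w in entity.findall(f"{{{FOLIA_NS}}}wref"))
        for relation in entity.findall(f"{{{FOLIA_NS}}}relation"):
            if relation.get("class") == relation_class:
                links[label] = relation.get(f"{{{XLINK_NS}}}href")
    return links

links = entity_links(ET.fromstring(SAMPLE))
print(links)
```

These relations carry no `<xref>` children: the link targets the external page as a whole, so the `xlink:href` on `<relation>` is all there is to read.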
The following example shows relations within strings in a document (See also String Annotation):
<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <metadata>
    <annotations>
      <text-annotation>
        <annotator processor="p1" />
      </text-annotation>
      <paragraph-annotation>
        <annotator processor="p1" />
      </paragraph-annotation>
      <string-annotation>
        <annotator processor="p1" />
      </string-annotation>
      <relation-annotation>
        <annotator processor="p1" />
      </relation-annotation>
    </annotations>
    <provenance>
      <processor xml:id="p1" name="proycon" type="manual" />
    </provenance>
  </metadata>
  <text xml:id="example.text">
    <p xml:id="example.p.1">
      <t><t-str id="example.p.1.str.1">Hello.</t-str> This is a sentence. Bye!</t>
      <t class="ocroutput"><t-str id="example.p.1.str.2">Hell0</t-str> Th1s iz a sentence, Bye1</t>
      <str xml:id="example.p.1.str.1">
        <t offset="0">Hello.</t>
        <relation>
          <xref id="example.p.1.str.2" type="str" />
        </relation>
      </str>
      <str xml:id="example.p.1.str.2">
        <t class="ocroutput" offset="0">Hell0</t>
        <relation>
          <xref id="example.p.1.str.1" type="str" />
        </relation>
      </str>
    </p>
  </text>
</FoLiA>
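Because the relations in this example omit xlink:href, every `<xref>` must resolve to an ID inside the same document. A minimal consistency check along those lines might look like the sketch below; the helper `unresolved_xrefs` is hypothetical, the sample is trimmed to the two strings, and the check is a plain standard-library traversal rather than FoLiA validation.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"
XLINK_NS = "http://www.w3.org/1999/xlink"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Trimmed to the two related strings from the example above.
SAMPLE = """
<p xmlns="http://ilk.uvt.nl/folia" xml:id="example.p.1">
  <str xml:id="example.p.1.str.1">
    <t offset="0">Hello.</t>
    <relation>
      <xref id="example.p.1.str.2" type="str" />
    </relation>
  </str>
  <str xml:id="example.p.1.str.2">
    <t class="ocroutput" offset="0">Hell0</t>
    <relation>
      <xref id="example.p.1.str.1" type="str" />
    </relation>
  </str>
</p>
"""

def unresolved_xrefs(root):
    """Return ids referenced by same-document relations that match no xml:id."""
    known_ids = {el.get(XML_ID) for el in root.iter() if XML_ID in el.attrib}
    missing = []
    for relation in root.iter(f"{{{FOLIA_NS}}}relation"):
        if relation.get(f"{{{XLINK_NS}}}href") is not None:
            continue  # target lives in another resource and cannot be checked here
        for xref in relation.findall(f"{{{FOLIA_NS}}}xref"):
            if xref.get("id") not in known_ids:
                missing.append(xref.get("id"))
    return missing

missing = unresolved_xrefs(ET.fromstring(SAMPLE))
print(missing)
```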