Coreference Annotation
Relations between words that refer to the same referent (anaphora) are expressed in FoLiA using Coreference Annotation. The co-reference relations are expressed by specifying the entire chain in which all links are coreferent.
Specification
Annotation Category: | Span Annotation |
---|---|
Declaration: | <coreference-annotation set="..."> |
Version History: | since v0.9 |
Element: | <coreferencechain> |
API Class: | |
Layer Element: | coreferences |
Span Role Elements: | <coreferencelink>, <hd> |
Required Attributes: | |
Optional Attributes: | |
Accepted Data: | |
Valid Context: | |
Explanation
Note
Please first ensure you are familiar with the general principles of Span Annotation to make sense of this annotation type.
Relations between words that refer to the same referent are expressed in FoLiA using the <coreferencechain> span annotation element and the <coreferencelink> span role within it for each instance. The coreference relations are expressed by specifying the entire chain in which all links are coreferent.
The head of a coreferent may optionally be marked with the <hd> element, another span role.
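The structure described above (a chain containing links, each link optionally containing a head) can be walked with a few lines of code. The following is only a sketch using Python's standard `xml.etree.ElementTree` module rather than any dedicated FoLiA library; the helper name `read_chains` and the shape of the returned dictionaries are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"
NS = {"f": FOLIA_NS}

# A stand-alone coreference layer; in a real document it sits inside a
# structure element such as a paragraph or sentence.
SNIPPET = """
<coreferences xmlns="http://ilk.uvt.nl/folia">
  <coreferencechain class="dalailama">
    <coreferencelink>
      <wref id="example.p.1.s.1.w.1" t="The" />
      <hd>
        <wref id="example.p.1.s.1.w.2" t="Dalai" />
        <wref id="example.p.1.s.1.w.3" t="Lama" />
      </hd>
    </coreferencelink>
    <coreferencelink>
      <wref id="example.p.1.s.2.w.1" t="He" />
    </coreferencelink>
  </coreferencechain>
</coreferences>
"""


def read_chains(layer):
    """Hypothetical helper: list each chain's class and, per link, its word ids and head word ids."""
    chains = []
    for chain in layer.findall("f:coreferencechain", NS):
        links = []
        for link in chain.findall("f:coreferencelink", NS):
            # iter() also descends into <hd>, so head words count as part of the link.
            words = [w.get("id") for w in link.iter(f"{{{FOLIA_NS}}}wref")]
            # Only the words wrapped in <hd> form the head; the list is empty if no head is marked.
            head = [w.get("id") for w in link.findall("f:hd/f:wref", NS)]
            links.append({"words": words, "head": head})
        chains.append({"class": chain.get("class"), "links": links})
    return chains


chains = read_chains(ET.fromstring(SNIPPET))
for chain in chains:
    print(chain["class"], [link["words"] for link in chain["links"]])
```

Treating the words inside `<hd>` as members of the enclosing link is a choice made in this sketch, on the reading that the head is a sub-span of the coreferent.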
As always, this annotation layer itself may be embedded on whatever level is preferred. The example below uses paragraph level, but you can also embed it at sentence level or at a global text level.
The <coreferencelink> element may take three attributes, which are actually predefined feature subsets (see Features). Their values depend on the set used and are thus user-definable, never predefined:
modality
- A subset that can be used to indicate that there is modality or negation in this coreference link.
time
- A subset used to indicate a time dependency. An example of a time dependency is seen in the sentence: “Bert De Graeve, until recently CEO, will now take up a position as CFO”. Here “Bert De Graeve”, “CEO” and “CFO” would all be part of the same coreference chain, and the second coreferencelink (“CEO”) can be marked as being in the past using the “time” attribute.
level
- A subset that can indicate the level on which the coreference holds. A possible value could be sense, indicating that the coreference relation holds only on sense level, as opposed to an actual reference.
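To make the three attributes concrete, the sketch below encodes the “Bert De Graeve” sentence as a chain and reads the attributes back with Python's standard `xml.etree.ElementTree`. The word ids, the class `bertdegraeve` and the value `past` are invented for illustration, since actual values depend on the set in use; `link_features` is a hypothetical helper, not part of any FoLiA library.

```python
import xml.etree.ElementTree as ET

FOLIA = "{http://ilk.uvt.nl/folia}"

# Hypothetical fragment: ids, class and the "past" value are assumptions.
CHAIN = """
<coreferencechain xmlns="http://ilk.uvt.nl/folia" class="bertdegraeve">
  <coreferencelink>
    <wref id="s.1.w.1" t="Bert" />
    <wref id="s.1.w.2" t="De" />
    <wref id="s.1.w.3" t="Graeve" />
  </coreferencelink>
  <coreferencelink time="past">
    <wref id="s.1.w.7" t="CEO" />
  </coreferencelink>
  <coreferencelink>
    <wref id="s.1.w.16" t="CFO" />
  </coreferencelink>
</coreferencechain>
"""


def link_features(chain):
    """Collect the text hint and the three optional attributes of every link."""
    rows = []
    for link in chain.findall(FOLIA + "coreferencelink"):
        text = " ".join(w.get("t", "") for w in link.iter(FOLIA + "wref"))
        rows.append({
            "text": text,
            # Attributes that are not set on the link come back as None.
            "modality": link.get("modality"),
            "time": link.get("time"),
            "level": link.get("level"),
        })
    return rows


rows = link_features(ET.fromstring(CHAIN))
for row in rows:
    print(row)
```

Only the “CEO” link carries a `time` value here, mirroring the description above; the other links simply leave the attributes unset.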
Example
<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <metadata>
    <annotations>
      <token-annotation set="https://raw.githubusercontent.com/LanguageMachines/uctodata/master/setdefinitions/tokconfig-eng.foliaset.ttl">
        <annotator processor="p1" />
      </token-annotation>
      <text-annotation>
        <annotator processor="p1" />
      </text-annotation>
      <sentence-annotation>
        <annotator processor="p1" />
      </sentence-annotation>
      <paragraph-annotation>
        <annotator processor="p1" />
      </paragraph-annotation>
      <coreference-annotation set="adhoc"> <!-- an ad-hoc set -->
        <annotator processor="p1" />
      </coreference-annotation>
    </annotations>
    <provenance>
      <processor xml:id="p1" name="proycon" type="manual" />
    </provenance>
  </metadata>
  <text xml:id="example.text">
    <p xml:id="example.p.1">
      <s xml:id="example.p.1.s.1">
        <t>The Dalai Lama greeted him.</t>
        <w xml:id="example.p.1.s.1.w.1"><t>The</t></w>
        <w xml:id="example.p.1.s.1.w.2"><t>Dalai</t></w>
        <w xml:id="example.p.1.s.1.w.3"><t>Lama</t></w>
        <w xml:id="example.p.1.s.1.w.4"><t>greeted</t></w>
        <w xml:id="example.p.1.s.1.w.5" space="no"><t>him</t></w>
        <w xml:id="example.p.1.s.1.w.6"><t>.</t></w>
      </s>
      <s xml:id="example.p.1.s.2">
        <t>He was happy to see him.</t>
        <w xml:id="example.p.1.s.2.w.1"><t>He</t></w>
        <w xml:id="example.p.1.s.2.w.2"><t>was</t></w>
        <w xml:id="example.p.1.s.2.w.3"><t>happy</t></w>
        <w xml:id="example.p.1.s.2.w.4"><t>to</t></w>
        <w xml:id="example.p.1.s.2.w.5"><t>see</t></w>
        <w xml:id="example.p.1.s.2.w.6" space="no"><t>him</t></w>
        <w xml:id="example.p.1.s.2.w.7"><t>.</t></w>
      </s>
      <s xml:id="example.p.1.s.3">
        <t>He smiled.</t>
        <w xml:id="example.p.1.s.3.w.1"><t>He</t></w>
        <w xml:id="example.p.1.s.3.w.2" space="no"><t>smiled</t></w>
        <w xml:id="example.p.1.s.3.w.3"><t>.</t></w>
      </s>
      <coreferences>
        <coreferencechain class="dalailama">
          <coreferencelink>
            <wref id="example.p.1.s.1.w.1" t="The" />
            <hd> <!-- extra span role to mark the head -->
              <wref id="example.p.1.s.1.w.2" t="Dalai" />
              <wref id="example.p.1.s.1.w.3" t="Lama" />
            </hd>
          </coreferencelink>
          <coreferencelink>
            <wref id="example.p.1.s.2.w.1" t="He" />
          </coreferencelink>
        </coreferencechain>
        <coreferencechain class="dalailama">
          <coreferencelink>
            <wref id="example.p.1.s.1.w.5" t="him" />
          </coreferencelink>
          <coreferencelink>
            <wref id="example.p.1.s.2.w.6" t="him" />
          </coreferencelink>
          <coreferencelink>
            <wref id="example.p.1.s.3.w.1" t="He" />
          </coreferencelink>
        </coreferencechain>
      </coreferences>
    </p>
  </text>
</FoLiA>
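Because every `<wref>` points at a token by id and repeats its text in the `t` attribute, a hand-written layer like the one above is easy to get subtly wrong. The sketch below, again using only Python's standard library, checks that each reference resolves to a `<w>` element and that its `t` value equals the token text. It assumes that `t` is meant to be an exact copy of the token text; the function name `check_wrefs` and the reduced document, with its deliberate mistakes, are invented for illustration.

```python
import xml.etree.ElementTree as ET

FOLIA = "{http://ilk.uvt.nl/folia}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # how ElementTree exposes xml:id

# Reduced, hypothetical paragraph with two deliberate mistakes:
# a lower-cased text hint and a reference to a word that does not exist.
DOC = """
<p xmlns="http://ilk.uvt.nl/folia" xml:id="example.p.1">
  <s xml:id="example.p.1.s.1">
    <w xml:id="example.p.1.s.1.w.3"><t>Lama</t></w>
  </s>
  <s xml:id="example.p.1.s.2">
    <w xml:id="example.p.1.s.2.w.1"><t>He</t></w>
  </s>
  <coreferences>
    <coreferencechain class="dalailama">
      <coreferencelink><wref id="example.p.1.s.1.w.3" t="Lama" /></coreferencelink>
      <coreferencelink><wref id="example.p.1.s.2.w.1" t="he" /></coreferencelink>
      <coreferencelink><wref id="example.p.1.s.9.w.1" t="He" /></coreferencelink>
    </coreferencechain>
  </coreferences>
</p>
"""


def check_wrefs(root):
    """Return (word id, message) pairs for dangling references and mismatched text hints."""
    # Map every token id to its text content.
    words = {}
    for w in root.iter(FOLIA + "w"):
        t = w.find(FOLIA + "t")
        words[w.get(XML_ID)] = t.text if t is not None else None

    problems = []
    for wref in root.iter(FOLIA + "wref"):
        target, hint = wref.get("id"), wref.get("t")
        if target not in words:
            problems.append((target, "unknown id"))
        elif hint is not None and hint != words[target]:
            problems.append((target, f"t={hint!r} but token text is {words[target]!r}"))
    return problems


problems = check_wrefs(ET.fromstring(DOC))
for target, message in problems:
    print(target, message)
```

A check like this is no substitute for validating the document against the FoLiA schema; it only covers the two reference mistakes described here.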