Sentence Annotation
Structure annotation representing a sentence. Sentence detection is a common stage in NLP pipelines, usually performed alongside tokenisation.
Specification
Annotation Category: | Structure annotation |
---|---|
Declaration: | `<sentence-annotation>` |
Version History: | Since the beginning |
Element: | `<s>` |
API Class: | |
Required Attributes: | |
Optional Attributes: | |
Accepted Data: | |
Valid Context: | |
Explanation & Examples
The following example shows a paragraph containing two tokenised sentences:
<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
<metadata>
<annotations>
<token-annotation set="https://raw.githubusercontent.com/LanguageMachines/uctodata/master/setdefinitions/tokconfig-eng.foliaset.ttl">
<annotator processor="p1" />
</token-annotation>
<text-annotation>
<annotator processor="p1" />
</text-annotation>
<sentence-annotation>
<annotator processor="p1" />
</sentence-annotation>
<paragraph-annotation>
<annotator processor="p1" />
</paragraph-annotation>
</annotations>
<provenance>
<processor xml:id="p1" name="proycon" type="manual" />
</provenance>
</metadata>
<text xml:id="example.text">
<p xml:id="example.p.1">
<s xml:id="example.p.1.s.1">
<w xml:id="example.p.1.s.1.w.1" class="WORD">
<t>Hello</t>
</w>
<w xml:id="example.p.1.s.1.w.2" class="WORD" space="no">
<t>World</t>
</w>
<w xml:id="example.p.1.s.1.w.3" class="PUNCTUATION">
<t>!</t>
</w>
</s>
<s xml:id="example.p.1.s.2">
<w xml:id="example.p.1.s.2.w.1" class="WORD">
<t>This</t>
</w>
<w xml:id="example.p.1.s.2.w.2" class="WORD">
<t>is</t>
</w>
<w xml:id="example.p.1.s.2.w.3" class="WORD">
<t>an</t>
</w>
<w xml:id="example.p.1.s.2.w.4" class="WORD" space="no">
<t>example</t>
</w>
<w xml:id="example.p.1.s.2.w.5" class="PUNCTUATION">
<t>.</t>
</w>
</s>
</p>
</text>
</FoLiA>
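To illustrate how the `<s>` elements in the example can be consumed, the following minimal sketch rebuilds the text of each sentence from its tokens. It uses only Python's standard `xml.etree.ElementTree` module rather than a dedicated FoLiA library. The embedded document is a trimmed copy of the example (metadata omitted), and the function name `sentence_texts` is hypothetical. It assumes that `space="no"` on a `<w>` element means the token is not followed by whitespace.

```python
import xml.etree.ElementTree as ET

# FoLiA namespace as declared on the root element of the example.
FOLIA_NS = {"folia": "http://ilk.uvt.nl/folia"}
# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Trimmed copy of the example document (metadata left out for brevity).
DOC = """<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <text xml:id="example.text">
    <p xml:id="example.p.1">
      <s xml:id="example.p.1.s.1">
        <w xml:id="example.p.1.s.1.w.1" class="WORD"><t>Hello</t></w>
        <w xml:id="example.p.1.s.1.w.2" class="WORD" space="no"><t>World</t></w>
        <w xml:id="example.p.1.s.1.w.3" class="PUNCTUATION"><t>!</t></w>
      </s>
      <s xml:id="example.p.1.s.2">
        <w xml:id="example.p.1.s.2.w.1" class="WORD"><t>This</t></w>
        <w xml:id="example.p.1.s.2.w.2" class="WORD"><t>is</t></w>
        <w xml:id="example.p.1.s.2.w.3" class="WORD"><t>an</t></w>
        <w xml:id="example.p.1.s.2.w.4" class="WORD" space="no"><t>example</t></w>
        <w xml:id="example.p.1.s.2.w.5" class="PUNCTUATION"><t>.</t></w>
      </s>
    </p>
  </text>
</FoLiA>"""


def sentence_texts(root):
    """Map each sentence's xml:id to its text, rebuilt from its tokens."""
    result = {}
    for s in root.iterfind(".//folia:s", FOLIA_NS):
        parts = []
        for w in s.iterfind("folia:w", FOLIA_NS):
            t = w.find("folia:t", FOLIA_NS)
            parts.append((t.text or "") if t is not None else "")
            # Assumption: space="no" means no whitespace follows this token.
            if w.get("space") != "no":
                parts.append(" ")
        result[s.get(XML_ID)] = "".join(parts).strip()
    return result


root = ET.fromstring(DOC)
texts = sentence_texts(root)
for sentence_id, text in texts.items():
    print(f"{sentence_id}: {text}")
# example.p.1.s.1: Hello World!
# example.p.1.s.2: This is an example.
```

Because each sentence carries its own `xml:id`, the result can be keyed by sentence, which keeps the link between the reconstructed text and the structural element it came from.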