Time Segmentation
FoLiA supports time segmentation to allow for more fine-grained control of timing information by associating spans of words/tokens with exact timestamps. It can provide a more linguistic alternative to Event Annotation.
Specification
Annotation Category: | Span Annotation |
---|---|
Declaration: | <timesegment-annotation set="..."> |
Version History: | Since v0.8 but renamed since v0.9 |
Element: | <timesegment> |
API Class: | |
Layer Element: | <timing> |
Span Role Elements: | |
Required Attributes: | |
Optional Attributes: | |
Accepted Data: | |
Valid Context: | |
Feature subsets (extra attributes): | begindatetime, enddatetime |
Explanation
FoLiA supports time segmentation using the <timing> layer and the <timesegment> span annotation element. This element is useful for speech, but can also be used for event annotation. We already saw events as structure annotation in Event Annotation, but for more fine-grained control of timing information a span annotation element in an offset layer is better suited.
Time segments may also be nested. The predefined, optional feature subsets begindatetime and enddatetime can be used to express the exact moment at which an event started or ended. These too are set-defined, so the format shown on this page is just an example.
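As a rough illustration of working with these values, the following Python sketch computes how long a segment lasted from its begindatetime and enddatetime. It assumes the values are ISO 8601 strings like those used in the example on this page; because the format is set-defined, other sets may need a different parser. The function name `segment_duration` is hypothetical and not part of any FoLiA library.

```python
from datetime import datetime, timedelta


def segment_duration(begindatetime: str, enddatetime: str) -> timedelta:
    """Return the time elapsed between two timestamps.

    Assumption: both values are in an ISO 8601 form accepted by
    datetime.fromisoformat, e.g. "2011-12-15T19:01".
    """
    begin = datetime.fromisoformat(begindatetime)
    end = datetime.fromisoformat(enddatetime)
    if end < begin:
        raise ValueError("enddatetime precedes begindatetime")
    return end - begin


if __name__ == "__main__":
    # Values taken from the first time segment in the example on this page.
    print(segment_duration("2011-12-15T19:01", "2011-12-15T19:03"))
```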
If you are only interested in a structural annotation of events, and a coarser level of annotation suffices, then use Event Annotation instead.
If used in a speech context, all the generic speech attributes become available (see Speech). This introduces begintime and endtime, which are different from the begindatetime and enddatetime feature subsets introduced by this annotation type! The generic attributes begintime and endtime are not defined by a set, but specify a time location in HH:MM:SS.MMM format, which may refer to a location in an associated audio file. Audio files are associated using the src attribute, which is inherited by all lower elements, so in the speech example below it is placed on the utterance.
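To make the HH:MM:SS.MMM format concrete, here is a minimal Python sketch that converts such a value into a `timedelta`, which is convenient when seeking in an audio file. The helper name `parse_media_time` is hypothetical, and the strict two-digit/three-digit pattern is an assumption based on the format as written above, not a statement about what FoLiA tools accept.

```python
import re
from datetime import timedelta

# Assumption: exactly HH:MM:SS.MMM, as described in the text above.
_TIME_RE = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\.(\d{3})$")


def parse_media_time(value: str) -> timedelta:
    """Convert a begintime/endtime value such as "00:00:01.250" to a timedelta."""
    match = _TIME_RE.match(value)
    if match is None:
        raise ValueError(f"not in HH:MM:SS.MMM format: {value!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    return timedelta(
        hours=hours, minutes=minutes, seconds=seconds, milliseconds=millis
    )


if __name__ == "__main__":
    offset = parse_media_time("00:00:01.250")
    print(offset.total_seconds())
```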
Example
The following example illustrates the usage of time segmentation for event annotation:
<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <metadata>
    <annotations>
      <token-annotation set="https://raw.githubusercontent.com/LanguageMachines/uctodata/master/setdefinitions/tokconfig-eng.foliaset.ttl">
        <annotator processor="p1" />
      </token-annotation>
      <text-annotation>
        <annotator processor="p1" />
      </text-annotation>
      <sentence-annotation>
        <annotator processor="p1" />
      </sentence-annotation>
      <paragraph-annotation>
        <annotator processor="p1" />
      </paragraph-annotation>
      <timesegment-annotation set="events"> <!-- an ad-hoc set -->
        <annotator processor="p1" />
      </timesegment-annotation>
    </annotations>
    <provenance>
      <processor xml:id="p1" name="proycon" type="manual" />
    </provenance>
  </metadata>
  <text xml:id="example.text">
    <p xml:id="example.p.1">
      <s xml:id="example.p.1.s.1">
        <w xml:id="example.p.1.s.1.w.1"><t>I</t></w>
        <w xml:id="example.p.1.s.1.w.2"><t>think</t></w>
        <w xml:id="example.p.1.s.1.w.3"><t>I</t></w>
        <w xml:id="example.p.1.s.1.w.4"><t>have</t></w>
        <w xml:id="example.p.1.s.1.w.5"><t>to</t></w>
        <w xml:id="example.p.1.s.1.w.6"><t>go</t></w>
        <w xml:id="example.p.1.s.1.w.7"><t>.</t></w>
        <timing>
          <timesegment class="utterance" begindatetime="2011-12-15T19:01"
                       enddatetime="2011-12-15T19:03" actor="myself">
            <wref id="example.p.1.s.1.w.1" t="I" />
            <wref id="example.p.1.s.1.w.2" t="think" />
          </timesegment>
          <timesegment class="cough" begindatetime="2011-12-15T19:03"
                       enddatetime="2011-12-15T19:05" actor="myself">
          </timesegment>
          <timesegment class="utterance" begindatetime="2011-12-15T19:05"
                       enddatetime="2011-12-15T19:06" actor="myself">
            <wref id="example.p.1.s.1.w.3" t="I" />
            <wref id="example.p.1.s.1.w.4" t="have" />
            <wref id="example.p.1.s.1.w.5" t="to" />
            <wref id="example.p.1.s.1.w.6" t="go" />
          </timesegment>
        </timing>
      </s>
    </p>
  </text>
</FoLiA>
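For readers who want to consume such a document, the following Python sketch lists the time segments and resolves each `<wref>` to the text of the word it points at. It uses only the standard library's `xml.etree.ElementTree` rather than a FoLiA library; the function name `list_time_segments` and the shortened `SAMPLE` document (metadata omitted, fewer words) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

FOLIA_NS = "http://ilk.uvt.nl/folia"
NS = {"f": FOLIA_NS}
# ElementTree exposes the xml:id attribute under this qualified name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Shortened, hypothetical variant of the example above (metadata omitted).
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <text xml:id="example.text">
    <s xml:id="example.s.1">
      <w xml:id="example.s.1.w.1"><t>I</t></w>
      <w xml:id="example.s.1.w.2"><t>think</t></w>
      <timing>
        <timesegment class="utterance" begindatetime="2011-12-15T19:01"
                     enddatetime="2011-12-15T19:03">
          <wref id="example.s.1.w.1" t="I" />
          <wref id="example.s.1.w.2" t="think" />
        </timesegment>
        <timesegment class="cough" begindatetime="2011-12-15T19:03"
                     enddatetime="2011-12-15T19:05" />
      </timing>
    </s>
  </text>
</FoLiA>"""


def list_time_segments(xml_text: str) -> list:
    """Return one dict per <timesegment>, with its referenced word texts."""
    root = ET.fromstring(xml_text)
    # Index every word by its xml:id so <wref> elements can be resolved.
    tokens = {
        w.get(XML_ID): w.findtext("f:t", default="", namespaces=NS)
        for w in root.iter(f"{{{FOLIA_NS}}}w")
    }
    segments = []
    for seg in root.iter(f"{{{FOLIA_NS}}}timesegment"):
        # Only direct <wref> children, so nested segments keep their own words.
        words = [tokens.get(ref.get("id"), "") for ref in seg.findall("f:wref", NS)]
        segments.append({
            "class": seg.get("class"),
            "begin": seg.get("begindatetime"),
            "end": seg.get("enddatetime"),
            "words": words,
        })
    return segments


if __name__ == "__main__":
    for segment in list_time_segments(SAMPLE):
        print(segment)
```

Note that a segment without any `<wref>` children, such as the cough, simply yields an empty word list.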
Example in a speech context
The following example illustrates the usage of time segmentation in a speech context. Be aware, though, that the begintime and endtime attributes can also be set directly on any structure element in a speech context, which makes this annotation type redundant if it is used only for that purpose.
<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <metadata>
    <annotations>
      <token-annotation set="https://raw.githubusercontent.com/LanguageMachines/uctodata/master/setdefinitions/tokconfig-eng.foliaset.ttl">
        <annotator processor="p1" />
      </token-annotation>
      <text-annotation>
        <annotator processor="p1" />
      </text-annotation>
      <utterance-annotation>
        <annotator processor="p1" />
      </utterance-annotation>
      <timesegment-annotation set="events"> <!-- an ad-hoc set -->
        <annotator processor="p1" />
      </timesegment-annotation>
    </annotations>
    <provenance>
      <processor xml:id="p1" name="proycon" type="manual" />
    </provenance>
  </metadata>
  <speech xml:id="example.speech">
    <utt src="ithinkihavetogo.mp3">
      <w xml:id="example.utt.1.w.1"><t>I</t></w>
      <w xml:id="example.utt.1.w.2"><t>think</t></w>
      <w xml:id="example.utt.1.w.3"><t>I</t></w>
      <w xml:id="example.utt.1.w.4"><t>have</t></w>
      <w xml:id="example.utt.1.w.5"><t>to</t></w>
      <w xml:id="example.utt.1.w.6"><t>go</t></w>
      <w xml:id="example.utt.1.w.7"><t>.</t></w>
      <timing>
        <timesegment begintime="00:00:00.000" endtime="00:00:00.250">
          <wref id="example.utt.1.w.1" t="I" />
        </timesegment>
        <timesegment begintime="00:00:00.250" endtime="00:00:00.500">
          <wref id="example.utt.1.w.2" t="think" />
        </timesegment>
        <timesegment begintime="00:00:00.500" endtime="00:00:00.750">
          <wref id="example.utt.1.w.3" t="I" />
        </timesegment>
        <timesegment begintime="00:00:00.750" endtime="00:00:01.000">
          <wref id="example.utt.1.w.4" t="have" />
        </timesegment>
        <timesegment begintime="00:00:01.000" endtime="00:00:01.250">
          <wref id="example.utt.1.w.5" t="to" />
        </timesegment>
        <timesegment begintime="00:00:01.250" endtime="00:00:01.500">
          <wref id="example.utt.1.w.6" t="go" />
        </timesegment>
      </timing>
    </utt>
  </speech>
</FoLiA>
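A word-level alignment like the one above can be turned into a simple lookup table. The Python sketch below maps each referenced word id to its (begintime, endtime) pair and checks whether consecutive segments are contiguous, that is, whether each one starts exactly where the previous one ends. It relies only on the standard library; the names `word_alignment` and `is_contiguous` and the two-word `SAMPLE` document are hypothetical, and the contiguity check is an illustration, not a requirement of FoLiA.

```python
import xml.etree.ElementTree as ET

NS = {"f": "http://ilk.uvt.nl/folia"}

# Shortened, hypothetical variant of the speech example above (metadata omitted).
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <speech xml:id="example.speech">
    <utt src="ithinkihavetogo.mp3">
      <w xml:id="example.utt.1.w.1"><t>I</t></w>
      <w xml:id="example.utt.1.w.2"><t>think</t></w>
      <timing>
        <timesegment begintime="00:00:00.000" endtime="00:00:00.250">
          <wref id="example.utt.1.w.1" t="I" />
        </timesegment>
        <timesegment begintime="00:00:00.250" endtime="00:00:00.500">
          <wref id="example.utt.1.w.2" t="think" />
        </timesegment>
      </timing>
    </utt>
  </speech>
</FoLiA>"""


def word_alignment(xml_text: str) -> dict:
    """Map each referenced word id to its (begintime, endtime) strings."""
    root = ET.fromstring(xml_text)
    alignment = {}
    for seg in root.iterfind(".//f:timing/f:timesegment", NS):
        span = (seg.get("begintime"), seg.get("endtime"))
        for ref in seg.findall("f:wref", NS):
            alignment[ref.get("id")] = span
    return alignment


def is_contiguous(spans: list) -> bool:
    """True if every span begins exactly where the previous one ends."""
    return all(prev[1] == cur[0] for prev, cur in zip(spans, spans[1:]))


if __name__ == "__main__":
    alignment = word_alignment(SAMPLE)
    for word_id, (begin, end) in alignment.items():
        print(word_id, begin, end)
    print(is_contiguous(list(alignment.values())))
```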