Time Segmentation

FoLiA supports time segmentation to allow for more fine-grained control of timing information by associating spans of words/tokens with exact timestamps. It can provide a more linguistic alternative to Event Annotation.

Specification

Annotation Category:
 

Span Annotation

Declaration:

<timesegment-annotation set="..."> (note: set is optional for this annotation type)

Version History:
 

Since v0.8, renamed in v0.9

Element:

<timesegment>

API Class:

TimeSegment

Layer Element:

timing

Span Role Elements:
 
Required Attributes:
 
Optional Attributes:
 
  • xml:id – The ID of the element; this has to be unique in the entire document or collection of documents (corpus). All identifiers in FoLiA are of the XML NCName datatype, which roughly means it is a unique string that has to start with a letter (not a number or symbol), may contain numbers, but may never contain colons or spaces. FoLiA does not define any naming convention for IDs.
  • set – The set of the element, ideally a URI linking to a set definition (see Set Definitions (Vocabulary)) or otherwise a uniquely identifying string. The set must be referred to also in the Annotation Declarations for this annotation type.
  • class – The class of the annotation, i.e. the annotation tag in the vocabulary defined by set.
  • processor – This refers to the ID of a processor in the Provenance Data. The processor in turn defines exactly who or what was the annotator of the annotation.
  • annotator – This is an older alternative to the processor attribute, without support for full provenance. The annotator attribute simply refers to the name or ID of the system or human annotator that made the annotation.
  • annotatortype – This is an older alternative to the processor attribute, without support for full provenance. It is used together with annotator and specifies the type of the annotator, either manual for human annotators or auto for automated systems.
  • confidence – A floating point value between zero and one; expresses the confidence the annotator places in the annotation.
  • datetime – The date and time when this annotation was recorded. The format is YYYY-MM-DDThh:mm:ss (note the literal T in the middle to separate date from time), as per the XSD Datetime data type.
  • n – A number in a sequence, corresponding to a number in the original document, for example chapter numbers, section numbers, list item numbers. This does not have to be an actual number; other sequence identifiers are also possible (think alphanumeric characters or roman numerals).
  • textclass – Refers to the text class this annotation is based on. This is an advanced attribute; if not specified, it defaults to current. See Text class attribute (advanced).
  • src – Points to a file or full URL of a sound or video file. This attribute is inheritable.
  • begintime – A timestamp in HH:MM:SS.MMM format, indicating the begin time of the speech. If a sound clip is specified (src), the timestamp refers to a location in the sound clip.
  • endtime – A timestamp in HH:MM:SS.MMM format, indicating the end time of the speech. If a sound clip is specified (src), the timestamp refers to a location in the sound clip.
  • speaker – A string identifying the speaker. This attribute is inheritable. Multiple speakers are not allowed; simply do not specify a speaker on a certain level if you are unable to link the speech to a specific (single) speaker.
Accepted Data:

<comment> (Comment Annotation), <desc> (Description Annotation), <metric> (Metric Annotation), <relation> (Relation Annotation)

Valid Context:

<timing> (Time Segmentation)

Feature subsets (extra attributes):
 
  • actor
  • begindatetime
  • enddatetime

Explanation

FoLiA supports time segmentation using the <timing> layer and the <timesegment> span annotation element. This element is useful for speech, but can also be used for event annotation. We already saw events as structure annotation in Event Annotation, but for more fine-grained control of timing information a span annotation element in an offset layer is better suited.

Time segments may also be nested. The predefined and optional feature subsets begindatetime and enddatetime can be used to express the exact moment at which an event started or ended. These too are set-defined, so the format shown here is just an example.

If you are only interested in a structural annotation of events, and a coarser level of annotation suffices, then use Event Annotation.

If used in a speech context, all the generic speech attributes become available (see Speech). This introduces begintime and endtime, which are different from the begindatetime and enddatetime feature subsets introduced by this annotation type. The generic attributes begintime and endtime are not defined by a set; they specify a time offset in HH:MM:SS.MMM format, which may refer to a position in an associated audio file. Audio files are associated using the src attribute, which is inherited by all lower elements, so the speech example below places it on the utterance.
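As an illustration of the HH:MM:SS.MMM format, the following Python sketch converts begintime and endtime values into offsets and computes the length of a segment. The function names parse_timestamp and segment_duration are hypothetical helpers for this illustration and are not part of any FoLiA library. Treating the millisecond part as mandatory is an assumption of the sketch.

```python
import re
from datetime import timedelta

# Pattern for the HH:MM:SS.MMM form described above. Requiring exactly
# three millisecond digits is an assumption of this sketch.
_TIMESTAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\.(\d{3})$")


def parse_timestamp(value: str) -> timedelta:
    """Convert a begintime/endtime string into an offset from the start."""
    match = _TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"not in HH:MM:SS.MMM format: {value!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    if minutes > 59 or seconds > 59:
        raise ValueError(f"minutes or seconds out of range: {value!r}")
    return timedelta(hours=hours, minutes=minutes,
                     seconds=seconds, milliseconds=millis)


def segment_duration(begintime: str, endtime: str) -> timedelta:
    """Return the length of a segment, rejecting an end before the begin."""
    begin, end = parse_timestamp(begintime), parse_timestamp(endtime)
    if end < begin:
        raise ValueError("endtime precedes begintime")
    return end - begin


if __name__ == "__main__":
    print(segment_duration("00:00:00.250", "00:00:00.500").total_seconds())
```

Using timedelta rather than a float keeps the millisecond values exact when segments are added or compared.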

Example

The following example illustrates the usage of time segmentation for event annotation:

<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <metadata>
      <annotations>
          <token-annotation set="https://raw.githubusercontent.com/LanguageMachines/uctodata/master/setdefinitions/tokconfig-eng.foliaset.ttl">
            <annotator processor="p1" />
          </token-annotation>
          <text-annotation>
            <annotator processor="p1" />
          </text-annotation>
          <sentence-annotation>
            <annotator processor="p1" />
          </sentence-annotation>
          <paragraph-annotation>
            <annotator processor="p1" />
          </paragraph-annotation>
          <timesegment-annotation set="events"> <!-- an ad-hoc set -->
            <annotator processor="p1" />
          </timesegment-annotation>
      </annotations>
      <provenance>
         <processor xml:id="p1" name="proycon" type="manual" />
      </provenance>
  </metadata>
  <text xml:id="example.text">
    <p xml:id="example.p.1">
        <s xml:id="example.p.1.s.1">
         <w xml:id="example.p.1.s.1.w.1"><t>I</t></w>
         <w xml:id="example.p.1.s.1.w.2"><t>think</t></w>
         <w xml:id="example.p.1.s.1.w.3"><t>I</t></w>
         <w xml:id="example.p.1.s.1.w.4"><t>have</t></w>
         <w xml:id="example.p.1.s.1.w.5"><t>to</t></w>
         <w xml:id="example.p.1.s.1.w.6"><t>go</t></w>
         <w xml:id="example.p.1.s.1.w.7"><t>.</t></w>
         <timing>
          <timesegment class="utterance" begindatetime="2011-12-15T19:01"
           enddatetime="2011-12-15T19:03" actor="myself">
            <wref id="example.p.1.s.1.w.1" t="I" />
            <wref id="example.p.1.s.1.w.2" t="think" />
          </timesegment>
          <timesegment class="cough" begindatetime="2011-12-15T19:03"
           enddatetime="2011-12-15T19:05" actor="myself">
          </timesegment>
          <timesegment class="utterance" begindatetime="2011-12-15T19:05"
           enddatetime="2011-12-15T19:06" actor="myself">
            <wref id="example.p.1.s.1.w.3" t="I" />
            <wref id="example.p.1.s.1.w.4" t="have" />
            <wref id="example.p.1.s.1.w.5" t="to" />
            <wref id="example.p.1.s.1.w.6" t="go" />
          </timesegment>
         </timing>
        </s>
    </p>
  </text>
</FoLiA>
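To show how such a document might be consumed, here is a minimal Python sketch that reads the time segments with the standard library's xml.etree.ElementTree. It is not the FoLiA library API; the function read_time_segments and the shortened SAMPLE document are hypothetical. Because the format of begindatetime and enddatetime is set-defined, parsing them as ISO 8601 strings is an assumption that only holds for values like the ones in this example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"folia": "http://ilk.uvt.nl/folia"}

# Shortened variant of the event example (metadata omitted for brevity).
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <text xml:id="example.text">
    <s xml:id="example.p.1.s.1">
      <w xml:id="example.p.1.s.1.w.1"><t>I</t></w>
      <w xml:id="example.p.1.s.1.w.2"><t>think</t></w>
      <timing>
        <timesegment class="utterance" begindatetime="2011-12-15T19:01"
         enddatetime="2011-12-15T19:03" actor="myself">
          <wref id="example.p.1.s.1.w.1" t="I" />
          <wref id="example.p.1.s.1.w.2" t="think" />
        </timesegment>
        <timesegment class="cough" begindatetime="2011-12-15T19:03"
         enddatetime="2011-12-15T19:05" actor="myself">
        </timesegment>
      </timing>
    </s>
  </text>
</FoLiA>"""


def read_time_segments(xml_text: str) -> list:
    """Return one dict per <timesegment>, in document order."""
    root = ET.fromstring(xml_text)
    segments = []
    for seg in root.iterfind(".//folia:timesegment", NS):
        segments.append({
            "class": seg.get("class"),
            "actor": seg.get("actor"),
            # Assumption: the ad-hoc set uses ISO 8601 date-times.
            "begin": datetime.fromisoformat(seg.get("begindatetime")),
            "end": datetime.fromisoformat(seg.get("enddatetime")),
            # The t attribute on <wref> is optional, so fall back to the id.
            "words": [w.get("t") or w.get("id")
                      for w in seg.findall("folia:wref", NS)],
        })
    return segments


if __name__ == "__main__":
    for segment in read_time_segments(SAMPLE):
        print(segment["class"], segment["end"] - segment["begin"], segment["words"])
```

Note that a segment without any wref children, such as the cough, simply yields an empty word list.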

Example in a speech context

The following example illustrates the usage of time segmentation in a speech context. Be aware, though, that the begintime and endtime attributes can also be set directly on any structure element in a speech context, which makes this annotation type redundant if it is used only for that purpose.

<?xml version="1.0" encoding="utf-8"?>
<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <metadata>
      <annotations>
          <token-annotation set="https://raw.githubusercontent.com/LanguageMachines/uctodata/master/setdefinitions/tokconfig-eng.foliaset.ttl">
            <annotator processor="p1" />
          </token-annotation>
          <text-annotation>
            <annotator processor="p1" />
          </text-annotation>
          <utterance-annotation>
            <annotator processor="p1" />
          </utterance-annotation>
          <timesegment-annotation set="events"> <!-- an ad-hoc set -->
            <annotator processor="p1" />
          </timesegment-annotation>
      </annotations>
      <provenance>
         <processor xml:id="p1" name="proycon" type="manual" />
      </provenance>
  </metadata>
  <speech xml:id="example.speech">
    <utt src="ithinkihavetogo.mp3">
     <w xml:id="example.utt.1.w.1"><t>I</t></w>
     <w xml:id="example.utt.1.w.2"><t>think</t></w>
     <w xml:id="example.utt.1.w.3"><t>I</t></w>
     <w xml:id="example.utt.1.w.4"><t>have</t></w>
     <w xml:id="example.utt.1.w.5"><t>to</t></w>
     <w xml:id="example.utt.1.w.6"><t>go</t></w>
     <w xml:id="example.utt.1.w.7"><t>.</t></w>
     <timing>
      <timesegment begintime="00:00:00.000" endtime="00:00:00.250">
        <wref id="example.utt.1.w.1" t="I" />
      </timesegment>
      <timesegment begintime="00:00:00.250" endtime="00:00:00.500">
        <wref id="example.utt.1.w.2" t="think" />
      </timesegment>
      <timesegment begintime="00:00:00.500" endtime="00:00:00.750">
        <wref id="example.utt.1.w.3" t="I" />
      </timesegment>
      <timesegment begintime="00:00:00.750" endtime="00:00:01.000">
        <wref id="example.utt.1.w.4" t="have" />
      </timesegment>
      <timesegment begintime="00:00:01.000" endtime="00:00:01.250">
        <wref id="example.utt.1.w.5" t="to" />
      </timesegment>
      <timesegment begintime="00:00:01.250" endtime="00:00:01.500">
        <wref id="example.utt.1.w.6" t="go" />
      </timesegment>
     </timing>
    </utt>
  </speech>
</FoLiA>
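Since src is inherited by lower elements, a consumer of this document has to resolve it for each time segment by looking at the enclosing elements. The following Python sketch shows one way to do that with the standard library's xml.etree.ElementTree. The function collect_segments and the shortened SAMPLE document are hypothetical and not part of any FoLiA library. Treating the nearest enclosing src as the effective one is this sketch's reading of the inheritance rule.

```python
import xml.etree.ElementTree as ET

FOLIA = "{http://ilk.uvt.nl/folia}"

# Shortened variant of the speech example (metadata omitted for brevity).
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" version="2.0" xml:id="example">
  <speech xml:id="example.speech">
    <utt src="ithinkihavetogo.mp3">
      <w xml:id="example.utt.1.w.1"><t>I</t></w>
      <w xml:id="example.utt.1.w.2"><t>think</t></w>
      <timing>
        <timesegment begintime="00:00:00.000" endtime="00:00:00.250">
          <wref id="example.utt.1.w.1" t="I" />
        </timesegment>
        <timesegment begintime="00:00:00.250" endtime="00:00:00.500">
          <wref id="example.utt.1.w.2" t="think" />
        </timesegment>
      </timing>
    </utt>
  </speech>
</FoLiA>"""


def collect_segments(element, inherited_src=None):
    """Yield each <timesegment> together with its effective src."""
    # An element's own src overrides whatever it inherited from above.
    src = element.get("src", inherited_src)
    if element.tag == FOLIA + "timesegment":
        yield {
            "src": src,
            "begintime": element.get("begintime"),
            "endtime": element.get("endtime"),
            "words": [w.get("id") for w in element.findall(FOLIA + "wref")],
        }
    # Recurse so that nested time segments are found as well.
    for child in element:
        yield from collect_segments(child, src)


if __name__ == "__main__":
    for segment in collect_segments(ET.fromstring(SAMPLE)):
        print(segment)
```

Passing the inherited value down during the walk avoids the need for parent pointers, which ElementTree does not provide.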