
SAF follows the principles of standoff annotation. This means that:

  • the SAF standoff document exists separately from the primary data document;
  • standoff pointers link annotations in the standoff document to regions of the primary data.
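
As a minimal sketch of how a standoff pointer can be resolved against the primary data: the example below assumes that from/to are character offsets with an exclusive end offset (this matches the 'Andrew' token sample on this page, but is an assumption rather than something the page states). The primary text and the helper name resolve_span are hypothetical.

```python
def resolve_span(primary_text: str, start: int, end: int) -> str:
    """Return the region of the primary data addressed by a from/to pair.

    Assumption: offsets count characters and the end offset is exclusive.
    """
    if not 0 <= start <= end <= len(primary_text):
        raise ValueError(f"span {start}-{end} lies outside the primary data")
    return primary_text[start:end]


# Hypothetical primary data; the annotation itself lives in a separate document.
primary = "Andrew smiles."
surface = resolve_span(primary, 0, 6)
print(surface)
```

Because the annotation stores only offsets, the primary data document stays untouched no matter how many annotation layers point into it.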

Each annot element (an annotation edge) has the following properties:

  • an identifier
  • a type (e.g. token, pos, namedEntity, morphosyntax, …)
  • a source and a target node in the lattice
  • [optional] a span (defined by the standoff pointers from/to)
  • [optional] deps, the set of ids of the edges on which the current edge depends
  • plus the actual content of the annotation, consisting of a combination of the following elements:
    • a simple value attribute
    • simple slot elements: each consists of a name part (e.g. surface, weight, tagset, tag, …) and a value string
    • complex feature structure (fs) elements: these may be typed, and the format is compatible with the TEI/ISO standard (FSR)
    • complex rmrs elements: following the RmrsDtd
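
The properties above could be modelled in memory roughly as follows. This is only an illustrative sketch: the class name Annot and its field layout are assumptions, not part of SAF, and the complex fs and rmrs content is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass
class Annot:
    """One annotation edge; field names mirror the properties listed above."""
    id: str
    type: str
    source: str                                   # lattice node where the edge starts
    target: str                                   # lattice node where the edge ends
    span: Optional[Tuple[int, int]] = None        # optional from/to standoff pointers
    deps: FrozenSet[str] = frozenset()            # ids of edges this edge depends on
    value: Optional[str] = None                   # simple value attribute
    slots: Dict[str, str] = field(default_factory=dict)  # slot name -> value string


# The token and pos edges from the samples on this page.
token = Annot(id="t1", type="token", source="v0", target="v1",
              span=(0, 6), slots={"surface": "Andrew"})
pos = Annot(id="p1", type="pos", source="v0", target="v1",
            deps=frozenset({"t1"}),
            slots={"weight": "0.5", "tagset": "CLAWS", "tag": "NNP"})
```

Keeping span and deps optional reflects the list above: an edge such as pos can omit its own span and point at the token edge it depends on instead.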


SAMPLE token EDGE

   <annot type='token' id='t1' from='0' to='6' source='v0' target='v1'>
    <slot name='surface'>Andrew</slot>
   </annot>


SAMPLE pos EDGE

   <annot type='pos' id='p1' deps='t1' source='v0' target='v1'>
     <slot name='weight'>0.5</slot>
     <slot name='tagset'>CLAWS</slot>
     <slot name='tag'>NNP</slot>
   </annot>
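
A sketch of reading such an edge with Python's standard xml.etree.ElementTree module. The helper name read_annot is hypothetical, and splitting deps on whitespace is an assumption, since this page does not say how several ids are separated.

```python
import xml.etree.ElementTree as ET

# A well-formed copy of the pos edge shown above.
POS_EDGE = """
<annot type='pos' id='p1' deps='t1' source='v0' target='v1'>
  <slot name='weight'>0.5</slot>
  <slot name='tagset'>CLAWS</slot>
  <slot name='tag'>NNP</slot>
</annot>
"""


def read_annot(xml_text: str) -> dict:
    """Collect the attributes and slot values of a single annot element."""
    annot = ET.fromstring(xml_text.strip())
    return {
        "id": annot.get("id"),
        "type": annot.get("type"),
        "source": annot.get("source"),
        "target": annot.get("target"),
        # Assumption: several dependency ids would be separated by whitespace.
        "deps": set(annot.get("deps", "").split()),
        "slots": {s.get("name"): (s.text or "") for s in annot.findall("slot")},
    }


edge = read_annot(POS_EDGE)
print(edge["slots"]["tag"], float(edge["slots"]["weight"]))
```

Slot values are plain strings in the document, so numeric slots such as weight need an explicit conversion on the consumer side.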

SAMPLE named entity EDGE

   <annot type='namedEntity' id='n1' from='10' to='20' source='v0' target='v1'>
    <slot name='weight'>0.567</slot>
    <slot name='surface'>1987 to 1997</slot>
    <fs type='timespan'>
       <f name='from'>
          <fs type='point'>
            <f name='year'>
              <fs type='1987'/>
            </f>
          </fs>
       </f>
       <f name='to'>
          <fs type='point'>
            <f name='year'>
              <fs type='1997'/>
            </f>
          </fs>
       </f>
    </fs>
   </annot>
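
The nested fs content of such an edge can be walked recursively. The sketch below uses the standard xml.etree.ElementTree module on a well-formed copy of the sample; the helper fs_to_dict and the dictionary layout it returns are illustrative assumptions, not part of SAF or FSR.

```python
import xml.etree.ElementTree as ET

# A well-formed copy of the named entity edge shown above.
NE_EDGE = """
<annot type='namedEntity' id='n1' from='10' to='20' source='v0' target='v1'>
  <slot name='weight'>0.567</slot>
  <slot name='surface'>1987 to 1997</slot>
  <fs type='timespan'>
    <f name='from'><fs type='point'><f name='year'><fs type='1987'/></f></fs></f>
    <f name='to'><fs type='point'><f name='year'><fs type='1997'/></f></fs></f>
  </fs>
</annot>
"""


def fs_to_dict(fs):
    """Turn an fs element into {'type': ..., 'features': {name: nested fs}}."""
    features = {}
    for f in fs.findall("f"):
        child = f.find("fs")
        # Assumption: each f wraps at most one nested fs, as in the sample.
        features[f.get("name")] = fs_to_dict(child) if child is not None else f.text
    return {"type": fs.get("type"), "features": features}


annot = ET.fromstring(NE_EDGE.strip())
timespan = fs_to_dict(annot.find("fs"))
start_year = timespan["features"]["from"]["features"]["year"]["type"]
end_year = timespan["features"]["to"]["features"]["year"]["type"]
print(start_year, end_year)
```

Note that in the sample the year values are carried as the type of the innermost fs, so they are read from the type attribute rather than from element text.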

SAMPLE external morphosyntax EDGE

   <annot type='morph' deps='t1' source='v0' target='v1'>
    <slot name='weight'>0.5</slot>
    <slot name='tagset'>morph</slot>
    <slot name='reduced'>SMILE</slot>
   </annot>


See SafDtd.

Sample SAF Document

See SafSample.

Last update: 2021-09-02 by Alexandre Rademaker