Annotation scheme

An annotation scheme defines which labels can be assigned to a document and the values these labels can take. You can create a new annotation scheme by clicking the blue "Create new annotation scheme" button at the bottom of the page. To edit an existing scheme, click the icon next to its title.

Editing 'active' annotation schemes

In theory, you can edit an annotation scheme at any point, even when users have already used it for annotations. This does not affect existing annotations directly, but you will lose context and it will break the behaviour of the annotation interface.

If you know what you are doing, it is mostly safe to add labels (however, this will lead to inconsistencies with the status of assignments). You can safely edit the name and hints of labels at any point. Note, however, that changing the semantics midway through annotating documents will lead to different interpretations of labels in the database, so be mindful.

Background

An annotation scheme consists of a list of labels, where each label has a name, a key, a type, and, depending on the type, a value. Labels can be constrained in the number of repeats (repeats are ordered, which effectively indicates the relevance of a label, e.g. primary vs. secondary technology mentioned in a paper on CDR) and in whether they are required to complete an annotation. Labels of certain types can also have a list of choices, and each choice can in turn have a list of labels of its own.

In this way, you can construct almost any complex hierarchical annotation scheme. If you cannot create the logic you'd like to achieve, consider an alternative use of label types or nesting. For example, if you only want to collect labels on methodology and technology when a document is relevant, do not create a boolean relevance label plus optional labels for the other aspects. Instead, consider a single-choice top-level label with the choices "Relevant" and "Not relevant" and place the remaining annotation scheme as child elements of "Relevant".
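The nesting described above can be sketched as plain Python data. This is only an illustration: the helper functions `label` and `choice`, the field names (`kind`, `choices`, `children`, `max_repeat`), and the example keys and choice names are assumptions made for this sketch, not the platform's actual schema.

```python
# Hypothetical, simplified representation of a scheme. The field names
# ("kind", "choices", "children", ...) are illustrative only.

def label(name, key, kind, required=False, max_repeat=1, choices=()):
    return {"name": name, "key": key, "kind": kind, "required": required,
            "max_repeat": max_repeat, "choices": list(choices)}

def choice(name, value, children=()):
    return {"name": name, "value": value, "children": list(children)}

scheme = [
    label("Relevance", "rel", "single", required=True, choices=[
        choice("Not relevant", 0),
        # Everything else hangs below the "Relevant" choice.
        choice("Relevant", 1, children=[
            label("Methodology", "rel-meth", "multi", choices=[
                choice("Qualitative", 0),
                choice("Quantitative", 1),
            ]),
            # max_repeat=2 leaves room for a primary and a secondary technology.
            label("Technology", "rel-tech", "single", max_repeat=2, choices=[
                choice("Technology A", 0),
                choice("Technology B", 1),
            ]),
        ]),
    ]),
]

def iter_labels(labels, depth=0):
    """Yield (depth, label) for every label, walking choices and their children."""
    for lab in labels:
        yield depth, lab
        for ch in lab["choices"]:
            yield from iter_labels(ch["children"], depth + 1)

for depth, lab in iter_labels(scheme):
    print("  " * depth + f"{lab['key']} ({lab['kind']})")
```

The keys in this sketch are short and mirror the hierarchy (`rel`, `rel-meth`, `rel-tech`), so the structure stays readable even in a flat export.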

Be careful!

The platform does not prevent you from creating an annotation scheme that does not make sense or is erroneous. Make sure you consult this guide, know what you are doing, and ask when in doubt.

In the database, the annotations for a document are not actually stored in this hierarchy. Instead, they are "flattened" into a long list of key-value pairs.¹ The implicit hierarchy is reflected by references to the parent label where needed. This flattened format simplifies later processing: for example, you can export the table of annotations as is, retain the concept of hierarchy, and navigate it only when needed.
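As a rough sketch of this flattening, the following Python turns a nested annotation into flat rows that point to their parent. The class `AnnotationRow`, the function `flatten`, the nested input format, and the choice to start `repeat` at 1 are assumptions for illustration. Only a subset of the stored fields is modelled, and the platform's own implementation may differ.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AnnotationRow:
    """Hypothetical flat row; models only some of the stored fields."""
    key: str
    repeat: int = 1                    # assumption: repeats are counted from 1
    parent: Optional[str] = None       # id of the parent row, if any
    value_bool: Optional[bool] = None
    value_int: Optional[int] = None
    multi_int: Optional[list] = None
    id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def filled_values(self):
        """Names of the value fields that are set (expected: exactly one)."""
        candidates = {"value_bool": self.value_bool,
                      "value_int": self.value_int,
                      "multi_int": self.multi_int}
        return [name for name, value in candidates.items() if value is not None]

def flatten(nested, parent=None):
    """Turn nested annotations into a flat list of rows with parent references."""
    rows = []
    for item in nested:
        children = item.get("children", [])
        values = {k: v for k, v in item.items() if k != "children"}
        row = AnnotationRow(parent=parent, **values)
        rows.append(row)
        rows.extend(flatten(children, parent=row.id))
    return rows

nested = [
    {"key": "rel", "value_int": 1, "children": [
        {"key": "rel-meth", "multi_int": [0, 1]},
        {"key": "rel-tech", "repeat": 1, "value_int": 0},
        {"key": "rel-tech", "repeat": 2, "value_int": 1},
    ]},
]

rows = flatten(nested)
for row in rows:
    print(row.key, row.repeat, row.parent, row.filled_values())
```

Each child row carries the id of the `rel` row in `parent`, so the hierarchy can be rebuilt from the flat table when it is needed and ignored otherwise.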

Label types

The boolean label can be used for simple yes/no annotations and is represented as a toggle switch to the users.
The single-choice label offers a list of options to choose from, of which exactly one can be selected.
The multi-choice label works the same way, but you can select more than one choice at the same time.

Other types are already implemented but are still missing an interface or methods for label consolidation. For example: free text (for comments), tagging (e.g. when there are a lot of options), predictive typing tags (suggest open-domain tags that other users already assigned, or add a new one), integer (e.g. extracted effect value), float (e.g. extracted effect value with floating point), ... Later on, we also plan to include in-text annotations, meaning that labels can be assigned to a selection in the text.

Interface fields and buttons

(A) Move a choice for this multi/single choice label up or down
(B) Drop this choice of this multi/single choice label
(C) Add a list of children (aka sub-annotations) to this choice
(D) Value of this choice (will be stored in value_int for the label of this key in (E)). The value has to be unique for all choices with the same key.
(E) Key for this label, which will be used to track values. The key has to be unique across the entire annotation scheme. It is recommended to use very short keys and have them implicitly reflect the hierarchy in some way.
(F) This is the name of this label that is shown to the users.
(G) Click the pen to edit the title or description. You can use markdown in the description for formatting.
(H) Move a label up or down.
Required: If set to true (blue), annotators have to provide a response for this label. "Required" sub-annotations are not enforced if the parent is not selected.
Type: See above.
Max. repeat: See above.
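Because the platform does not validate a scheme for you, a quick check of the two uniqueness rules from (D) and (E) can help before annotation starts. The sketch below assumes a scheme is available as nested dictionaries with the hypothetical fields `key`, `choices`, `value`, and `children`. It is not part of the platform.

```python
from collections import Counter

def find_problems(labels):
    """Report duplicate keys (scheme-wide) and duplicate choice values (per label)."""
    problems = []
    key_counts = Counter()

    def walk(current):
        for lab in current:
            key_counts[lab["key"]] += 1
            value_counts = Counter(ch["value"] for ch in lab.get("choices", []))
            for value, n in value_counts.items():
                if n > 1:
                    problems.append(
                        f"label '{lab['key']}': choice value {value} used {n} times")
            for ch in lab.get("choices", []):
                walk(ch.get("children", []))

    walk(labels)
    for key, n in key_counts.items():
        if n > 1:
            problems.append(f"key '{key}' used {n} times")
    return problems

# Deliberately broken example: both choices use value 0, and a child reuses "rel".
broken = [
    {"key": "rel", "choices": [
        {"value": 0, "children": []},
        {"value": 0, "children": [{"key": "rel", "choices": []}]},
    ]},
]

for problem in find_problems(broken):
    print(problem)
```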

Label colours

We use a common colour scheme across the platform that maps true/false of boolean labels to green/red and the integer values of other labels to the following:

(Colour swatches for the integer values 0 to 37.)

The colour scheme is designed around an intuitive setup (no (0) / yes (1) / maybe (2)), and you can pick specific values to get specific colours. If an integer value is above that range, it wraps around and starts again at 0 (modulo). Note that you should not rely on this colour scheme too much; optimising for intuitive values in your scheme might make more sense.
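The wrap-around can be illustrated with a short sketch. It assumes the palette has 38 entries, matching the values 0 to 37 listed above. The names `PALETTE_SIZE`, `palette_slot`, and `bool_colour` are hypothetical and not taken from the platform.

```python
PALETTE_SIZE = 38  # assumption: one colour per listed value, 0 to 37

def palette_slot(value: int) -> int:
    """Return the palette position used for an integer label value."""
    return value % PALETTE_SIZE

def bool_colour(value: bool) -> str:
    """Boolean labels map true/false to green/red."""
    return "green" if value else "red"

for value in (0, 1, 2, 37, 38, 40):
    print(value, "->", palette_slot(value))
```

Under this assumption, a choice with value 38 would share its colour with value 0, which is one more reason to keep choice values small.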

¹ id (uuid), user (uuid), time created (and updated; datetime), item (uuid), scheme (uuid), key (str), repeat (int), parent (self-referencing uuid), value_bool (bool), value_int (int), value_float (float), value_str (str), multi_int (list[int]). Note that exactly one of the value_* fields is not None, depending on the type of label. An empty label (e.g. if a user does not assign a value for that label to the document) does not exist.