Diagnostic keys are data structures which help to identify biological
samples, i.e. give them (scientific) names. They are old but still very
popular because they are simple and efficient, sometimes even for
inexperienced users.
The second goal of these keys is the compact representation of biological
diversity. Diagnostic keys are not very far from classification lists
(see 'classifs'), phylogeny trees (like 'phylo' objects in 'ape'
package), from core R 'dendrogram' and 'hclust' objects, and especially
from recursive partitioning objects (e.g., from 'tree' or 'rpart'
packages).
In biology, diagnostic keys exist in many flavors, which can be reduced
to two main types:
I. Branched keys, where alternatives are separated.
You compare your sample with the first description. If the sample agrees
with the first description, you go to the second description (these keys
are usually fully dichotomous), then to the third, and so on until you
reach the terminal (the name of the organism). If it does not agree, you
find the alternative description of the _same level_ (same depth). The
main difficulty here is how to find it.
To help the user find descriptions of the same depth, branched keys are
usually presented as _indented_ keys, where each line starts with an
indent. A bigger indent means a bigger depth.
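As a minimal sketch of the indentation idea, the R code below prints a toy branched key with an indent proportional to depth, and looks up the alternative of the same depth. The taxa, the column names ('depth', 'description', 'terminal') and the helper 'find_alternative' are hypothetical and serve only as an illustration; they are not a format defined by any package.

```r
# Toy branched key; all names and taxa here are invented for illustration.
key <- data.frame(
  depth = c(1, 2, 2, 1),
  description = c("Leaves compound", "Flowers white",
                  "Flowers yellow", "Leaves simple"),
  terminal = c("", "Species A", "Species B", "Species C"),
  stringsAsFactors = FALSE
)

# Two spaces of indent per level of depth below the first.
indent <- strrep("  ", key$depth - 1)
suffix <- ifelse(key$terminal == "", "", paste0(" ... ", key$terminal))
lines <- paste0(indent, key$description, suffix)
cat(lines, sep = "\n")

# Hypothetical helper: row of the next description with the same depth as
# row i, or NA if a shallower row (or the end of the key) comes first.
find_alternative <- function(depth, i) {
  later <- seq_along(depth) > i
  same <- which(later & depth == depth[i])
  shallower <- which(later & depth < depth[i])
  if (length(same) == 0) return(NA_integer_)
  if (length(shallower) > 0 && shallower[1] < same[1]) return(NA_integer_)
  same[1]
}

find_alternative(key$depth, 1)  # 4: "Leaves simple" is the alternative
find_alternative(key$depth, 2)  # 3: "Flowers yellow" is the alternative
```

In a printed key the reader performs the same search by eye, which is why the indent matters.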
Branched or indented keys can be traced at least to 1668, to one of
John Wilkins' books (and maybe to much earlier scholastic works).
Indented keys are widely used, especially in English-language
publications.
Another modification can be traced to 1892, when A.
Semenow-Tjan-Shanskij published his serial key.
Serial keys are similar to other branched keys, but the numbering style
is different. All steps are numbered sequentially, and each has a
back-reference to its alternative, so the user does not have to search
for the description of the same depth: its number is already given.
Serial keys are strictly dichotomous. They are probably the most
space-saving keys and are still in use, especially in entomology.
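The walk through a serial key can be sketched in R as follows. This is an illustration under stated assumptions, not an implementation from any package: the toy key, the column names ('id', 'ref', 'description', 'terminal') and the function 'identify_serial' are hypothetical, step ids are assumed to equal row numbers, and the logical vector 'agrees' stands in for the user's judgement of each description.

```r
# Toy serial key; 'ref' holds the number of the alternative step.
key <- data.frame(
  id = 1:4,
  ref = c(4, 3, 2, 1),
  description = c("Leaves compound", "Flowers white",
                  "Flowers yellow", "Leaves simple"),
  terminal = c("", "Species A", "Species B", "Species C"),
  stringsAsFactors = FALSE
)

# agrees[i] is TRUE when the sample matches the description of step i.
identify_serial <- function(key, agrees) {
  i <- 1
  while (i <= nrow(key)) {
    # On disagreement, jump straight to the back-referenced alternative.
    if (!agrees[i]) i <- key$ref[i]
    # A non-empty terminal ends the identification.
    if (key$terminal[i] != "") return(key$terminal[i])
    # Otherwise continue with the next step in sequence.
    i <- i + 1
  }
  NA_character_
}

# Sample with compound leaves and yellow flowers.
identify_serial(key, c(TRUE, FALSE, TRUE, FALSE))   # "Species B"
# Sample with simple leaves.
identify_serial(key, c(FALSE, FALSE, FALSE, TRUE))  # "Species C"
```

Because the key is strictly dichotomous, the sketch assumes that the alternative agrees whenever the current step does not.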
II. Bracket keys, where alternatives are together, and the user is
required to use 'goto' references to take the next step.
They can be traced to the famous "Flora Francoise" (1778), where J.-B.
Lamarck likely used them for the first time.
You compare your sample with the first description and, if it agrees, go
to where the 'goto' reference says. If not, go to the second
(alternative) description, and then again use its 'goto'. On the last
steps, the 'goto' is just the terminal, the name you want. Sometimes,
bracket keys have more than one alternative (i.e., they are not fully
dichotomous).
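The 'goto' logic above can be sketched in R like this. It is only a hedged illustration: the toy key, the column names ('id', 'description', 'goto') and the function 'identify_bracket' are hypothetical, and a 'goto' value is treated as a terminal whenever it does not match any step id.

```r
# Toy bracket key; alternatives of one step share the same id.
key <- data.frame(
  id = c("1", "1", "2", "2"),
  description = c("Leaves compound", "Leaves simple",
                  "Flowers white", "Flowers yellow"),
  goto = c("2", "Species C", "Species A", "Species B"),
  stringsAsFactors = FALSE
)

# agrees[i] is TRUE when the sample matches the description in row i.
identify_bracket <- function(key, agrees) {
  step <- key$id[1]
  repeat {
    # Alternatives of the current step that match the sample.
    rows <- which(key$id == step & agrees)
    if (length(rows) == 0) return(NA_character_)
    target <- key$goto[rows[1]]
    # A 'goto' that is not a step id is the terminal (the name).
    if (!(target %in% key$id)) return(target)
    step <- target
  }
}

# Sample with compound leaves and yellow flowers.
identify_bracket(key, c(TRUE, FALSE, FALSE, TRUE))   # "Species B"
# Sample with simple leaves.
identify_bracket(key, c(FALSE, TRUE, FALSE, FALSE))  # "Species C"
```

Since the sketch collects all rows that share an id, it also works for steps with more than two alternatives.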
Bracket keys pose another difficulty: it is not easy to go back (up) if
you went in the wrong direction by mistake. Williamson (1922) proposed
backreferenced keys, where each step is supplied with a back-reference.
Sometimes, back-references are given only when the referenced step is
not immediately before the current one.
Bracket keys (backreferenced or not) are probably the most popular in
biology, and the most international as well.
Here bracket, branched and serial keys are standardized as rectangular
tables (data frames). Each feature (id, backreference, description,
terminal, 'goto') is just one column. In bracket keys, terminal and
'goto' are combined. For example, if you need a bracket key without
backreferences, use three columns: id, description and terminal+'goto'.
The order of columns is important; column names are not. Please see the
examples to understand this better.
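To illustrate why column order matters and column names do not, here is a minimal R sketch of a bracket key without backreferences, stored in three columns (id, description, terminal+'goto') and read purely by position. The toy data and the deliberately arbitrary column names 'a', 'b', 'c' are assumptions made for this illustration only.

```r
# Three columns in the order: id, description, terminal+'goto'.
key <- data.frame(
  a = c(1, 1, 2, 2),
  b = c("Leaves compound", "Leaves simple",
        "Flowers white", "Flowers yellow"),
  c = c("2", "Species C", "Species A", "Species B"),
  stringsAsFactors = FALSE
)

# Columns are picked by position, so their names are irrelevant.
id <- key[[1]]
description <- key[[2]]
target <- key[[3]]

# Show the step number only on the first alternative of each step.
shown_id <- ifelse(duplicated(id), "", paste0(id, "."))
lines <- sprintf("%-3s%s ... %s", shown_id, description, target)
cat(lines, sep = "\n")
```

Renaming the columns leaves the output unchanged, whereas swapping two of them would change the meaning of the key.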
Note that while this format is human-readable, it is not typographic. To
make keys more typographic, the user might want to convert them into
LaTeX, where several packages allow for typesetting diagnostic keys (for
example, my 'biokey' package).