Create annotation items programmatically on a single level.
You have to pass in a data frame, called itemsToCreate, describing the new items. The required columns depend on the type of the level (ITEM, EVENT, or SEGMENT).
This function belongs to emuR’s CRUD family of functions, which let the user manipulate items programmatically:
Create items (create_itemsInLevel)
Read items (query)
Update items (update_itemsInLevel)
Delete items (delete_itemsInLevel)
create_itemsInLevel(
emuDBhandle,
itemsToCreate,
calculateEndTimeForSegments = TRUE,
allowGapsAndOverlaps = FALSE,
rewriteAllAnnots = TRUE,
verbose = TRUE
)
emuDBhandle: emuDB handle as returned by load_emuDB
itemsToCreate: A data frame with the columns:
session (character)
bundle (character)
level (character)
attribute (character)
labels (character)
start_item_seq_idx (numeric; only when level refers to an ITEM-typed level)
start (numeric, milliseconds; only when level refers to an EVENT-typed or SEGMENT-typed level)
end (numeric, milliseconds; only when level refers to a SEGMENT-typed level and calculateEndTimeForSegments is FALSE)
calculateEndTimeForSegments: Only applicable if the level type is SEGMENT. If set to TRUE, then each segment’s end time is automatically aligned with the start time of the following segment. In that case, user-provided end times are ignored. The last segment’s end time is the end time of the annotated media file. If set to FALSE, then the user has to provide an end time for each segment.
allowGapsAndOverlaps: Only applicable if the level type is SEGMENT and calculateEndTimeForSegments is FALSE. If set to FALSE, this function fails when itemsToCreate contains gaps or overlaps between segments. The offending segments are returned invisibly. You can inspect them by assigning the return value to a variable. The return value will include a new column gap_samples that indicates the size of the gap (positive values) or overlap (negative values) with the previous segment. It is measured in audio samples, not in milliseconds. Setting this to TRUE allows the function to complete even with gaps and/or overlaps, but this is not recommended as it can cause bugs in the EMU-webApp.
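The sign convention of gap_samples can be illustrated with a small base-R sketch. It is not emuR's own computation: the segment times, the 16 kHz sample rate, and the rounding are assumptions made only for this illustration.

```r
# Hypothetical segments in milliseconds and an assumed 16 kHz sample rate
segments <- data.frame(
  start = c(100, 250, 400),
  end   = c(200, 420, 500)
)
sample_rate <- 16000

# Distance from the previous segment's end to this segment's start, in samples.
# The first segment has no predecessor, hence NA.
previous_end <- c(NA, head(segments$end, -1))
segments$gap_samples <- round((segments$start - previous_end) / 1000 * sample_rate)

print(segments)
```

In this sketch the second segment starts 50 ms after the first one ends (a gap, positive value), while the third starts 20 ms before the second one ends (an overlap, negative value).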
rewriteAllAnnots: Should changes be written to the file system (_annot.json files)? Intended for expert use only.
verbose: If set to TRUE, more status messages are printed.
This function creates new annotation items on an existing level, in existing bundles.
Regardless of the type of level you are creating items on, your input data frame itemsToCreate must describe your new items by specifying the columns session, bundle, level, attribute and labels. level must have the same value for all rows, as we can only create items on one level at a time. attribute must also have the same value for all rows, and it must be an existing attribute that belongs to the level.
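As a minimal sketch, the always-required columns could be assembled like this. The session, bundle, level and attribute names are hypothetical placeholders, and the data frame still lacks the type-specific column (start_item_seq_idx, start, or end) that the level type calls for.

```r
# Hypothetical session, bundle, level and attribute names; replace them with
# names that exist in your own emuDB.
itemsToCreate <- data.frame(
  session   = c("0000", "0000"),
  bundle    = c("msajc003", "msajc010"),
  level     = "Word",
  attribute = "Word",
  labels    = c("hello", "world"),
  stringsAsFactors = FALSE
)

# level and attribute must each hold a single value across all rows
stopifnot(
  length(unique(itemsToCreate$level)) == 1,
  length(unique(itemsToCreate$attribute)) == 1
)
```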
A major use case for this function is to obtain a segment list using query, modify the segment list and feed it to this function. That is why the column labels has a plural name: segment lists also have a column labels and not label. The same is true for the sequence index columns introduced below.
Creating new items works differently depending on the level type. The three types are explained in the following sections.
In addition to the columns that are always required, ITEM-typed levels require a column with a sequence index to be present in the itemsToCreate data frame. Its name must be start_item_seq_idx. This name was chosen instead of sequence_index because it is present as a column name in segment lists obtained with query. That makes it easier to use a segment list as input to create_itemsInLevel().
Along the time axis, there can be multiple annotation items on every level. Their order within the level is given by their sequence index. All existing items have a natural-valued sequence index and there are no gaps in the sequences (i.e. if a level contains N annotation items, they are indexed 1..N).
Any newly created item must be given a sequence index. The sequence index may be real-valued (it will automatically be replaced with a natural value). To prepend the new item to the existing ones, pass a value lower than one. To append it to the existing items, you can either pass NA or any value that you know is greater than N (the number of existing items in that level). It does not need to be exactly N+1. To place the new item between two existing ones, use any real value between the sequence indexes of the existing neighbors.
If you are appending multiple items at the same time, every sequence index (including NA) can only be used once per session/bundle/level combination (because session/bundle/level/sequence index are the unique identifier of an item).
After creating the items, all sequence indexes (which may now be real-valued, natural-valued or NA) are sorted in ascending order and then replaced with the values 1..N, where N is the number of items on that level. While sorting, NA values are placed at the end.
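The renumbering described above can be imitated in base R for a single bundle. This is only a sketch of the described behavior, not emuR's internal code; the labels and the helper column name seq_idx are made up for the illustration.

```r
# Existing items on a level carry the indexes 1..3; the labels are hypothetical
existing <- data.frame(
  labels  = c("a", "b", "c"),
  seq_idx = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# New items: prepend (below 1), insert between 2 and 3, append (NA)
new_items <- data.frame(
  labels  = c("first", "middle", "last"),
  seq_idx = c(0, 2.5, NA),
  stringsAsFactors = FALSE
)

combined <- rbind(existing, new_items)

# Sort ascending with NA at the end, then renumber 1..N
combined <- combined[order(combined$seq_idx, na.last = TRUE), ]
combined$seq_idx <- seq_len(nrow(combined))
rownames(combined) <- NULL

print(combined)
```

The resulting order is first, a, b, middle, c, last, with sequence indexes 1 to 6.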
In addition to the columns that are always required, EVENT-typed levels require a column with the time of the event to be present in the itemsToCreate data frame. Its name must be start. This name was chosen because it is present as a column name in segment lists obtained with query. That makes it easier to use a segment list as input to create_itemsInLevel(). The end column in segment lists is 0 for EVENT-typed levels.
The start column must be given in milliseconds.
You cannot create an EVENT item at a point on the time axis where another item already exists on the same level. If you specify such an event, the entire function will fail.
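A sketch of an input data frame for an EVENT-typed level might look as follows. The session, bundle, level, attribute, labels and times are hypothetical. The duplicate check is only a partial safeguard added for this illustration: it covers the new events among themselves, not events already stored in the emuDB.

```r
# Hypothetical names and event times; replace them with values from your emuDB
itemsToCreate <- data.frame(
  session   = "0000",
  bundle    = "msajc003",
  level     = "Tone",
  attribute = "Tone",
  labels    = c("H*", "L-", "L%"),
  start     = c(412.5, 980.0, 1510.25),  # milliseconds
  stringsAsFactors = FALSE
)

# Partial safeguard: no two new events may share a time within one bundle.
# This does not detect clashes with events that already exist on the level.
key <- itemsToCreate[, c("session", "bundle", "start")]
stopifnot(anyDuplicated(key) == 0)
```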
You can only create SEGMENT-typed items in bundles where the respective level is empty.
In addition to the columns that are always required, SEGMENT-typed levels require the column start to be present in the itemsToCreate data frame, representing the start time of the segment. It must be given in milliseconds.
Segments also need to have an end, and there are two strategies to determine the end. Either you explicitly provide an end column in the itemsToCreate data frame. It must be given in milliseconds. If you do that, you have to specify the calculateEndTimeForSegments parameter as FALSE.
Alternatively, you can leave calculateEndTimeForSegments at TRUE (which is the default) and provide your itemsToCreate data frame without an end column. In that case, the end time will be aligned to the next neighbor’s start time. The end time of the last segment will be aligned with the end of the annotated media file.
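The automatic alignment can be pictured with a short base-R sketch for one bundle. It is an illustration of the rule stated above, not emuR's implementation; the start times and the media duration are assumed values, and the starts are taken to be sorted in ascending order.

```r
# Hypothetical segment starts in milliseconds and an assumed media duration
starts    <- c(0, 350, 900)
media_end <- 1500

# Each end is the next segment's start; the last end is the end of the media file
ends <- c(starts[-1], media_end)
segments <- data.frame(start = starts, end = ends)

print(segments)
```

If you would rather supply such end values yourself, include them as the end column and call the function with calculateEndTimeForSegments = FALSE.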