The text below was copied from the original SIC2004
event, which is no longer available online.
The variable used in the SIC 2004 exercise is natural ambient
radioactivity measured in Germany. The data, provided kindly by the
German Federal Office for Radiation Protection (BfS), are gamma dose rates
reported by means of the national automatic monitoring network (IMIS).
For SIC2004, a rectangular area was used to select 1008
monitoring stations (from a total of around 2000 stations). For these
1008 stations, 11 days of measurements were randomly selected during
the last 12 months and the average daily dose rates were calculated for
each day. This yields 11 data sets.
Prior information (sic.train): 10 data sets of 200 points each have been
prepared; all 10 share the same monitoring station locations.
These locations have been randomly selected (see
Figure 1). These data sets differ only by their Z values since each set
corresponds to 1 day of measurement made during the last 14 months. No
information will be provided on the date of measurement. These 10 data
sets (10 days of measurements) can be used as prior information to tune
the parameters of the mapping algorithms. No other information will be
provided about these sets. Participants are free of course to gather
more information about the variable in the literature and so on.
The 200 monitoring stations above were randomly taken from a larger set
of 1008 stations. The locations of the remaining 808 monitoring stations
are given in sic.pred. Participants in SIC2004 must estimate the
values of the variable at these 808 locations.
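The split described above (200 stations drawn at random from 1008, leaving 808 to be predicted) can be sketched as follows. This is only an illustration: the station identifiers 1..1008, the function name split_stations and the seed are assumptions, and the code does not reproduce the actual selection made for SIC2004.

```python
import random


def split_stations(station_ids, n_train=200, seed=0):
    """Randomly split station ids into a training subset and the remainder."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    chosen = set(rng.sample(station_ids, n_train))
    # Preserve the original ordering of the ids in both subsets.
    train_ids = [s for s in station_ids if s in chosen]
    pred_ids = [s for s in station_ids if s not in chosen]
    return train_ids, pred_ids


# Hypothetical station ids; the real data sets use their own identifiers.
station_ids = list(range(1, 1009))
train_ids, pred_ids = split_stations(station_ids)
print(len(train_ids), len(pred_ids))  # 200 808
```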
The SIC2004 data (sic.val, variable dayx):
The exercise consists of using 200 measurements made on an 11th day (THE
data of the exercise) to estimate the values observed at the remaining
808 locations (hence the question marks as symbols in the maps shown
in Figure 3). These measurements will be available for two weeks only
(15 September to 1 October 2004) on a web page restricted
to the participants. The true values observed at these 808 locations
will be released only at the end of the exercise to allow participants
to write their manuscripts (sic.test, variables dayx and joker).
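To make the task concrete, the sketch below estimates values at 808 locations from 200 known points and scores the result once the true values are known. It rests on several assumptions: the coordinates and values are synthetic (not the sic.val or sic.test data), inverse distance weighting is just one simple baseline rather than a method prescribed by the exercise, and the names idw_predict, mean_absolute_error and field are hypothetical.

```python
import math
import random


def idw_predict(known, targets, power=2.0):
    """Inverse distance weighting.

    known   -- list of (x, y, z) observations
    targets -- list of (x, y) locations to estimate
    """
    preds = []
    for tx, ty in targets:
        num = den = 0.0
        exact = None
        for x, y, z in known:
            d = math.hypot(tx - x, ty - y)
            if d == 0.0:
                exact = z  # target coincides with an observation
                break
            w = 1.0 / d ** power
            num += w * z
            den += w
        preds.append(exact if exact is not None else num / den)
    return preds


def mean_absolute_error(pred, obs):
    """Average absolute difference between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def field(x, y):
    """Synthetic smooth surface standing in for the measured variable."""
    return 100.0 + 0.05 * x + 0.02 * y


rng = random.Random(42)
points = [(rng.uniform(0, 500), rng.uniform(0, 500)) for _ in range(1008)]
known = [(x, y, field(x, y)) for x, y in points[:200]]  # the 200 measurements
targets = points[200:]                                  # the 808 locations
truth = [field(x, y) for x, y in targets]               # released afterwards

preds = idw_predict(known, targets)
mae = mean_absolute_error(preds, truth)
print(f"MAE over {len(targets)} locations: {mae:.2f}")
```

Because each inverse distance weighting estimate is a weighted average of the observed values, it can never fall outside their range; this is one reason such a baseline would be expected to struggle with a localized anomaly.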
In addition, a joker data set was released (sic.val, variable joker),
which contains an anomaly. The anomaly was generated by a simulation
model, and does not represent measured levels.