Detects functional outliers of the first three orders, based on the order extended integrated depth for functional data.
shape.fd.outliers(dataf, range = NULL, d = 101, q = 0.05,
method = c("halfspace", "simplicial"), approx = 100, print = FALSE,
plotpairs = FALSE, max.order = 3, exclude.out = TRUE,
output = c("matrix", "list"), identifiers = NULL)
A matrix of logical values of size n*4, where n is the sample size. The first three rows give indicators of outlyingness of the corresponding functions for orders 1, 2 and 3; the fourth row gives the indicator of outlyingness with respect to the comparison of the first and third order depths. That is, the first row corresponds to the first order outliers, the second row to the second order outliers, and the last two rows formally to the third order outliers. Please consult Nagy et al. (2017) to interpret the notion of shape outlyingness.
Functional dataset, represented by a dataf object of their arguments and functional values. n stands for the number of functions.
The common range of the domain where the functions dataf are observed. Vector of length 2 with the left and the right end of the interval. Must contain all arguments given in dataf.
Grid size to which all the functional data are transformed. For depth computation, all functional observations are first transformed into vectors of their functional values of length d corresponding to equi-spaced points in the domain given by the interval range. Functional values at these points are reconstructed using linear interpolation and extrapolation.
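The regridding step can be pictured with a short base-R sketch. It is not the package's internal code: the helper name `to_grid`, the example curve, and the use of `approx()` are assumptions made for illustration, and the sketch covers only interpolation inside the observed arguments, not extrapolation.

```r
# Hypothetical helper (not part of ddalpha): evaluate one curve on an
# equi-spaced grid of d points over `range` by linear interpolation.
to_grid <- function(args, vals, range, d = 101) {
  grid <- seq(range[1], range[2], length.out = d)
  # approx() interpolates linearly; grid points outside the observed
  # arguments would yield NA, since this sketch does not extrapolate
  list(grid = grid, vals = approx(args, vals, xout = grid)$y)
}

# Illustrative curve observed at four irregularly spaced arguments
curve1 <- list(args = c(0, 0.2, 0.7, 1), vals = c(1, 3, 2, 4))
res <- to_grid(curve1$args, curve1$vals, range = c(0, 1), d = 5)
res$grid
res$vals
```

With a small d the grid is easy to inspect by hand; the documented default d = 101 simply makes the grid finer.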
The quantile representing the threshold for the first order outlier detection. Functions with first order integrated depth smaller than the q quantile of this sample of depths are flagged as potential outliers. If set to NULL, the outliers are detected from the first order integrated depth after the log-transformation, as for higher order outliers.
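The quantile rule for first order outliers can be sketched in base R. The depth values below are made up for illustration and are not package output; the strict comparison and the default quantile type of `quantile()` are assumptions, as the page does not specify how the sample quantile is computed.

```r
# Illustrative first order depth values (made-up numbers, not package output)
depths <- c(0.42, 0.38, 0.05, 0.40, 0.35, 0.44, 0.39, 0.41, 0.37, 0.43)
q <- 0.2

# Threshold: the q sample quantile of the depths
threshold <- quantile(depths, probs = q)

# Functions whose depth is smaller than the threshold are flagged
flagged <- depths < threshold
which(flagged)
```

A larger q flags more functions as potential first order outliers, so it acts as a sensitivity setting for this step.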
The depth that is used in the diagnostic plot. Possible values are halfspace for the halfspace depth, or simplicial for the simplicial depth.
For the computation of the third order integrated depth, the number of approximations used in the computation of the order extended depth. By default this is set to 100, meaning that 100 trivariate points are randomly sampled in the unit cube, and at these points the trivariate depths of the corresponding functional values are computed. May be set to 0 to compute the depth at all possible d^3 combinations of the points in the domain. This choice may result in very slow computation; see also depthf.fd1.
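To give a feeling for the cost difference, the following base-R sketch compares the number of exhaustive grid combinations with the default number of random points. The variable names and the mapping from the unit cube to grid indices are illustrative assumptions, not the package's internal sampling scheme.

```r
d <- 101         # grid size (the documented default)
n_approx <- 100  # number of random trivariate points (the documented default)

# Exhaustive evaluation would visit every triple of grid points
n_exhaustive <- d^3
n_exhaustive

# Random approximation: draw n_approx points uniformly in the unit cube
set.seed(1)
pts <- matrix(runif(3 * n_approx), ncol = 3)

# Map each coordinate to a grid index in 1..d (illustrative mapping only)
idx <- pmin(floor(pts * d) + 1, d)
range(idx)
```

With the defaults, the random scheme evaluates 100 trivariate depths instead of more than a million, which is why setting the argument to 0 can be very slow.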
If the rows of X are named, print=TRUE enables a graphical output in which the names of the outlying curves are displayed.
If set to TRUE, the scatter plot of the computed depths for orders 1, 2 and 3 is displayed. Here, the depths corresponding to the flagged outliers are plotted in colour.
Maximal order of shape outlyingness to be computed, can be set to 1, 2, or 3.
Logical variable; exclude the detected lower order outliers in the flagging process? By default TRUE.
Output method, can be set to matrix for a matrix with logical entries (TRUE for outliers), or list for a list of outliers.
A vector of names for the data observations. Facilitates identification of outlying functions.
Stanislav Nagy, nagy@karlin.mff.cuni.cz
Using the procedure described in Nagy et al. (2017), the function uses the order extended integrated depths for functions, see depthf.fd1 and shape.fd.analysis, to perform informal functional shape outlier detection. Outliers of the first order (horizontal shift outliers) are found as the functions whose (first order) integrated depth values fall below the q quantile of the sample of depths. Second and third order outliers (shape outliers) are found using the extension of the boxplot method for depths as described in Nagy et al. (2017).
Nagy, S., Gijbels, I. and Hlubinka, D. (2017). Depth-based recognition of shape outlying functions. Journal of Computational and Graphical Statistics, 26 (4), 883--893.
depthf.fd1, shape.fd.analysis
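library(ddalpha)
# Use the first n functions of the population dataset
n = 30
dataf = dataf.population()$dataf[1:n]
# Print the names of the flagged curves and show the pairwise depth plots
shape.fd.outliers(dataf, print = TRUE, plotpairs = TRUE,
                  identifiers = unlist(dataf.population()$identifier)[1:n])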