
mrfDepth (version 1.0.17)

fOutl: Functional outlyingness measures for functional data

Description

Computes several measures of functional outlyingness for multivariate functional data.

Usage

fOutl(x, z = NULL, type = "fAO", alpha = 0, time = NULL, 
        diagnostic = FALSE, distOptions = NULL)

Value

A list with the following components:

fOutlyingnessX

Vector of length \(n\) containing the functional outlyingness of every curve from x.

fOutlyingnessZ

Vector of length \(m\) containing the functional outlyingness of every curve from z.

weights

Vector of weights according to the input parameter alpha.

crossDistsX

An \(n\) by \(t\) matrix containing the multivariate outlyingness of each observation of x at each time point. Only provided if the input parameter diagnostic is set to TRUE.

crossDistsZ

An \(m\) by \(t\) matrix containing the multivariate outlyingness of each observation of z at each time point. Only provided if the input parameter diagnostic is set to TRUE.

locOutlX

An \(n\) by \(t\) matrix flagging local outlyingness for x. Only provided if the input parameter diagnostic is set to TRUE.
The \((i,j)\)th element takes value 1 if curve \(x_i\) is outlying at time point \(j\).

locOutlZ

An \(m\) by \(t\) matrix flagging local outlyingness for z. Only provided if the input parameter diagnostic is set to TRUE.
The \((i,j)\)th element takes value 1 if curve \(z_i\) is outlying at time point \(j\).

IndFlagExactFit

Vector containing the indices of the time points for which an exact fit is detected.

Arguments

x

A three dimensional \(t\) by \(n\) by \(p\) array, with \(t\) the number of observed time points, \(n\) the number of functional observations and \(p\) the number of measurements for every functional observation at every time point.

z

An optional three-dimensional \(t\) by \(m\) by \(p\) array, containing the observations for which to compute the functional outlyingness with respect to x. If z is not specified, it is set equal to x. The time points of z should correspond to those of x.

type

The outlyingness measure used in the computations. One of the following options: "fAO", "fSDO", "fDO" or "fbd".
Defaults to "fAO".

alpha

Specifies the weights at every cross-section. When alpha = 0, uniform weights are used. Otherwise alpha should be a weight vector of length \(t\).
Defaults to 0.

time

If the measurements are not equidistant, a sorted numeric vector containing a set of time points.
Defaults to 1:t.

diagnostic

If set to TRUE, the output contains some additional components:
crossDistsX and crossDistsZ: matrices containing the multivariate outlyingness of each observation of x and z at each time point
locOutlX and locOutlZ: matrices containing flags for local outlyingness (see "Value" for more details)
Defaults to FALSE.

distOptions

A list of options to pass to the function computing the cross-sectional distances.
See adjOutl, outlyingness, dirOutl, or bagdistance.
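
To make the expected input layout concrete, the following base-R sketch builds a \(t\) by \(n\) by \(p\) array from simulated curves, together with a sorted vector of non-equidistant time points and a weight vector of length \(t\). The simulated data and the object names (n_time, n_curves, n_var, time_pts, x_arr, w) are illustrative assumptions and are not part of the package.

```r
set.seed(42)
n_time   <- 30   # t: number of observed time points
n_curves <- 25   # n: number of functional observations
n_var    <- 2    # p: number of measurements per time point

# Non-equidistant but sorted time points, as required for the time argument
time_pts <- sort(runif(n_time))

# Fill a t by n by p array: x_arr[j, i, k] is variable k of curve i at time j
x_arr <- array(NA_real_, dim = c(n_time, n_curves, n_var))
for (i in seq_len(n_curves)) {
  x_arr[, i, 1] <- sin(2 * pi * time_pts) + rnorm(n_time, sd = 0.1)
  x_arr[, i, 2] <- cos(2 * pi * time_pts) + rnorm(n_time, sd = 0.1)
}

# A weight vector of length t (here: double weight on the second half)
w <- rep(1, n_time)
w[time_pts > 0.5] <- 2

dim(x_arr)
```

With mrfDepth loaded, objects of this shape would be passed as fOutl(x = x_arr, alpha = w, time = time_pts), following the usage shown above.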

Author

P. Segaert

Details

The functional outlyingness of a multivariate curve with respect to a given set of multivariate curves is defined as the weighted average of its multivariate outlyingness at each time point (Hubert et al., 2015). The functional outlyingness can be computed in all dimensions \(p\) using the adjusted outlyingness (fAO), the directional outlyingness (fDO), the Stahel-Donoho outlyingness (fSDO) or the bagdistance (fbd).
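
As a rough illustration of this definition, the following base-R sketch computes a functional outlyingness for univariate curves (\(p = 1\)) by taking a weighted average of a cross-sectional outlyingness over time. The helpers sdo_univ and f_outl_sketch are hypothetical names, and the cross-sectional measure |z - median| / MAD is a simplified stand-in for the routines used by fOutl, not the package's implementation.

```r
# Hypothetical helper: univariate Stahel-Donoho-type outlyingness of the
# values in z with respect to the sample x, i.e. |z - median(x)| / mad(x).
sdo_univ <- function(x, z = x) {
  abs(z - median(x)) / mad(x)
}

# Hypothetical helper: weighted average of the cross-sectional outlyingness.
# curves is a t by n matrix (one column per curve); w holds t weights.
f_outl_sketch <- function(curves, w = rep(1, nrow(curves))) {
  w <- w / sum(w)                        # normalise the weights to sum to 1
  cross <- apply(curves, 1, sdo_univ)    # n by t matrix of outlyingness values
  as.vector(cross %*% w)                 # one weighted average per curve
}

set.seed(1)
n_time <- 50
n_curves <- 20
grid <- seq(0, 1, length.out = n_time)
curves <- sapply(seq_len(n_curves), function(i) {
  sin(2 * pi * grid) + rnorm(n_time, sd = 0.1)
})
curves[, 1] <- curves[, 1] + 2           # shift curve 1 away from the others

fo <- f_outl_sketch(curves)
which.max(fo)                            # index of the most outlying curve
```

In this toy setting the shifted curve receives by far the largest value, because it lies far from the cross-sectional median at every time point.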

When the data array z is specified, the functional outlyingness and diagnostic information for the data array x are also returned whenever the underlying outlyingness routine allows it. For more information see the specific routines listed in the section "See Also".

In some situations, additional diagnostics are available to flag outlying time points. At each time point, an observation from the data array x is flagged as an outlier if its scaled outlyingness is larger than a prescribed cutoff value from the chi-square distribution. For more details see the respective outlyingness routines.
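
The flagging rule at a single time point can be sketched as follows. This is only one plausible reading of the description above: the scaling by the median outlyingness, the 0.99 quantile, the helper name flag_local and the Euclidean distance used as a stand-in outlyingness are all assumptions made for illustration. The cutoffs actually applied are documented in the respective outlyingness routines.

```r
# Hypothetical helper: flag observations at one time point.
# outl: cross-sectional outlyingness values; p: dimension of the measurements.
# ASSUMPTION: the median-based scaling and prob = 0.99 are illustrative choices.
flag_local <- function(outl, p, prob = 0.99) {
  scaled <- outl * sqrt(qchisq(0.5, df = p)) / median(outl)
  as.integer(scaled > sqrt(qchisq(prob, df = p)))
}

set.seed(7)
p <- 2
pts <- matrix(rnorm(40 * p), ncol = p)   # 40 bivariate observations
pts[1, ] <- c(8, 8)                      # plant one clear outlier

# Crude stand-in for a multivariate outlyingness: distance from the origin
outl <- sqrt(rowSums(pts^2))

flags <- flag_local(outl, p)
which(flags == 1)                        # indices of the flagged observations
```

Applying such a rule at every time point yields a matrix of 0/1 flags with the same layout as locOutlX.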

At certain time points, part of the algorithm may not be executable, for example because of an exact fit. In that case the weight of that particular time point is set to zero, and a warning is issued at the end of the algorithm to signal these time points. Furthermore, the output contains an extra component, IndFlagExactFit, giving the indices of the time points where problems occurred.

References

Hubert M., Rousseeuw P.J., Segaert P. (2015). Multivariate functional outlier detection (with rejoinder). Statistical Methods and Applications, 24, 177-202.

Hubert M., Rousseeuw P.J., Segaert P. (2017). Multivariate and functional classification using depth and distance. Advances in Data Analysis and Classification, 11, 445-466.

See Also

bagdistance, outlyingness, adjOutl, dirOutl, fom

Examples

library(mrfDepth)

# The octane data ship with mrfDepth as a three-dimensional array.
data(octane)
Data <- octane

# When the option diagnostic is set to TRUE, a crude diagnostic
# to detect outliers can be extracted from the local outlyingness
# indicators. 
Result <- fOutl(x = Data, type = "fAO", diagnostic = TRUE)
matplot(Data[,,1], type = "l", col = "black", lty = 1)
for (i in seq_len(dim(Data)[2])) {
  if (sum(Result$locOutlZ[i, ]) > 0) {
    obsData <- matrix(Data[, i, 1], nrow = 1)
    # Keep only the time points at which curve i is flagged as outlying.
    obsData[Result$locOutlZ[i, ] == 0] <- NA
    obsData <- rbind(obsData, obsData)
    matpoints(t(obsData), col = "red", pch = 15)
  }
}
# For more advanced outlier detection techniques, see the 
# fom routine.
