plotModelDistance.simcam: Model distance plot for SIMCAM model

Description

Shows a plot of the distance from one SIMCA model to each of the others.

Usage

# S3 method for simcam
plotModelDistance(
  obj,
  nc = 1,
  type = "h",
  xticks = seq_len(obj$nclasses),
  xticklabels = obj$classnames,
  main = paste0("Model distance (", obj$classnames[nc], ")"),
  xlab = "Models",
  ylab = "",
  ...
)

Arguments

obj: a SIMCAM model (object of class simcam)
nc: a single value, the number (index) of the class (SIMCA model) to show the plot for
type: type of the plot ("h", "l" or "b")
xticks: vector with tick values for x-axis
xticklabels: vector with tick labels for x-axis
main: main plot title
xlab: label for x axis
ylab: label for y axis
...: other plot parameters (see mdaplotg for details)

Details

The plot shows the similarity between a selected model and each of the others as a ratio of residual variances, computed with the following algorithm. Take two SIMCA/PCA models, m1 and m2, with optimal numbers of components A1 and A2. The models have been calibrated on calibration sets X1 and X2 with n1 and n2 rows, respectively. Then we do the following:

1. Project X2 onto model m1 and compute the residuals, E12
2. Compute the variance of these residuals as s12 = sum(E12^2) / n1
3. Project X1 onto model m2 and compute the residuals, E21
4. Compute the variance of these residuals as s21 = sum(E21^2) / n2
5. Compute the variance of the residuals for m1 as s1 = sum(E1^2) / (n1 - A1 - 1), where E1 are the residuals of X1 for its own model m1
6. Compute the variance of the residuals for m2 as s2 = sum(E2^2) / (n2 - A2 - 1), where E2 are the residuals of X2 for its own model m2

The model distance can then be computed as: d = sqrt((s12 + s21) / (s1 + s2))
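
The steps above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helpers fit_pca(), pca_residuals() and model_distance() are hypothetical names, prcomp() with mean centering and no scaling stands in for a SIMCA/PCA model, and the denominators follow the formulas exactly as written above.

```r
# Hypothetical helper: fit a mean-centered PCA with A components
# (a base R stand-in for a SIMCA/PCA model, assumed for illustration)
fit_pca <- function(X, A) {
  X <- as.matrix(X)
  p <- prcomp(X, center = TRUE, scale. = FALSE)
  list(center = p$center,
       P = p$rotation[, seq_len(A), drop = FALSE],  # loadings
       A = A,
       n = nrow(X))
}

# Hypothetical helper: residuals left after projecting X onto a model
pca_residuals <- function(model, X) {
  Xc <- sweep(as.matrix(X), 2, model$center)  # center with the model's means
  Xc - Xc %*% model$P %*% t(model$P)
}

# Hypothetical helper: distance d = sqrt((s12 + s21) / (s1 + s2))
model_distance <- function(X1, X2, A1, A2) {
  m1 <- fit_pca(X1, A1)
  m2 <- fit_pca(X2, A2)
  s12 <- sum(pca_residuals(m1, X2)^2) / m1$n
  s21 <- sum(pca_residuals(m2, X1)^2) / m2$n
  s1 <- sum(pca_residuals(m1, X1)^2) / (m1$n - A1 - 1)
  s2 <- sum(pca_residuals(m2, X2)^2) / (m2$n - A2 - 1)
  sqrt((s12 + s21) / (s1 + s2))
}

x1 <- iris[1:25, 1:4]    # setosa
x2 <- iris[51:75, 1:4]   # versicolor

d_self <- model_distance(x1, x1, 2, 2)  # a model compared with itself
d_12 <- model_distance(x1, x2, 2, 2)    # setosa vs. versicolor

# for identical models the ratio reduces to sqrt((n - A - 1) / n)
all.equal(d_self, sqrt(22 / 25))
```

Because the preprocessing in this sketch is an assumption, the value of d_12 need not match what plotModelDistance reports for the same data; the self-distance check holds regardless, since it follows from the formulas alone.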

As one can see, if the two models and the corresponding calibration sets are identical, then s12 = s21 = sum(E^2) / n and s1 = s2 = sum(E^2) / (n - A - 1), so the distance is sqrt((n - A - 1) / n). For example, if n = 25 and A = 2, the distance between the model and itself is sqrt(22/25) = sqrt(0.88) = 0.938. This case is demonstrated in the example section.

In general, if the distance between two models is below 1, the classes overlap. If it is above 3, the classes are well separated.

Examples

# load the package that provides simca(), simcam() and plotModelDistance()
library(mdatools)

# create two calibration sets with n = 25 objects in each
data(iris)
x1 <- iris[1:25, 1:4]
x2 <- iris[51:75, 1:4]

# create two SIMCA models with A = 2
m1 <- simca(x1, 'setosa', ncomp = 2)
m2 <- simca(x2, 'versicolor', ncomp = 2)

# combine the models into SIMCAM class
m <- simcam(list(m1, m2))

# show the model distance plot with distance values as labels
# note that the distance between setosa and setosa is 0.938
plotModelDistance(m, show.labels = TRUE, labels = "values")