yaImpute (version 1.0-34.1)

errorStats: Compute error components of k-NN imputations

Description

Error properties of estimates derived from imputation differ from those of regression-based estimates because the two methods include a different mix of error components. This function computes a partitioning of error statistics as proposed by Stage and Crookston (2007).

Usage

errorStats(mahal,...,scale=FALSE,pzero=0.1,plg=0.5,seeMethod="lm")

Value

A list that contains several data frames. The column names of each are a combination of the name of the object used to compute the statistics and the name of the statistic. The row names correspond to the Y-variables from the first argument. The data frame names are as follows:

common

statistics used to compute other statistics.

name of first argument

error statistics for the first yai object.

names of ... arguments

error statistics for each of the remaining yai objects, if any.

The statistics reported in these data frames are as follows:

see

standard error of estimate for individual regressions fit for corresponding Y-variables.

rmmsd0

root mean square difference for imputations based on method="mahalanobis" (always based on the first argument to the function).

mlf

square root of the model lack of fit: \(\sqrt{\mathrm{see}^2 - \mathrm{rmmsd0}^2/2}\).

rmsd

root mean square error.

rmsdlg

root mean square error of the observations with larger distances.

sei

standard error of imputation: \(\sqrt{\mathrm{rmsd}^2 - \mathrm{rmmsd0}^2/2}\).

dstc

distance component: \(\sqrt{\mathrm{rmsd}^2 - \mathrm{rmmsd0}^2}\).

Note that unlike Stage and Crookston (2007), all statistics reported here are in the natural units, not squared units.
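
The three derived statistics can be reproduced by hand from see, rmmsd0, and rmsd. The following is a minimal base R sketch of the formulas listed above, not the package's internal code. The input values are hypothetical and chosen only for illustration.

```r
# Hypothetical inputs in natural (unsquared) units; not taken from any data set.
see    <- 4.0   # standard error of estimate from a regression
rmmsd0 <- 3.0   # root mean square difference at (near) zero distance
rmsd   <- 5.0   # root mean square error of the imputation

# Each statistic is the square root of a difference of squared terms.
# In base R, sqrt() of a negative difference returns NaN with a warning.
mlf  <- sqrt(see^2  - rmmsd0^2 / 2)   # square root of the model lack of fit
sei  <- sqrt(rmsd^2 - rmmsd0^2 / 2)   # standard error of imputation
dstc <- sqrt(rmsd^2 - rmmsd0^2)       # distance component

round(c(mlf = mlf, sei = sei, dstc = dstc), 3)
```

Because sei and dstc subtract different multiples of rmmsd0^2 from the same rmsd^2, sei is never smaller than dstc when both are defined.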

Arguments

mahal

An object of class yai computed with method="mahalanobis".

...

Other objects of class yai for which statistics are desired. All objects should be built from the same data and variables as the first argument.

scale

When TRUE, the errors are scaled by their respective standard deviations.

pzero

The lower tail p-value used to pick reference observations that are zero distance from each other (used to compute rmmsd0).

plg

The upper tail p-value used to pick reference observations that are substantially distant from each other (used to compute rmsdlg).

seeMethod

Method used to compute SEE: seeMethod="lm" uses lm and seeMethod="gam" uses gam. In both cases, the model formula is a simple linear combination of the X-variables.
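
As a sketch of how these arguments are passed, the call below spells out every argument from the Usage line and changes only scale from its default. It assumes the yaImpute package is installed and reuses the TallyLake data from the Examples section. The object names mal and msn are arbitrary.

```r
library(yaImpute)
data(TallyLake)

# The first argument must come from method="mahalanobis"
mal <- yai(x = TallyLake[, 9:29], y = TallyLake[, 1:8],
           noTrgs = TRUE, method = "mahalanobis")
# Additional yai objects built on the same data and variables
msn <- yai(x = TallyLake[, 9:29], y = TallyLake[, 1:8],
           noTrgs = TRUE, method = "msn")

# Same as the defaults except scale=TRUE, which scales the errors
# by their respective standard deviations
scaled <- errorStats(mal, msn, scale = TRUE,
                     pzero = 0.1, plg = 0.5, seeMethod = "lm")
```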

Author

Nicholas L. Crookston ncrookston.fs@gmail.com
Albert R. Stage

Details

See https://academic.oup.com/forestscience/article/53/1/62/4604364

References

Stage, A.R.; Crookston, N.L. (2007). Partitioning error components for accuracy-assessment of near neighbor methods of imputation. For. Sci. 53(1):62-72. https://academic.oup.com/forestscience/article/53/1/62/4604364

See Also

yai, TallyLake

Examples

library(yaImpute)

data(TallyLake)

diag(cov(TallyLake[,1:8])) # see col A in Table 3 in Stage and Crookston

# Y-variables are columns 1:8 and X-variables are columns 9:29
mal <- yai(x=TallyLake[,9:29],y=TallyLake[,1:8],
           noTrgs=TRUE,method="mahalanobis")

msn <- yai(x=TallyLake[,9:29],y=TallyLake[,1:8],
           noTrgs=TRUE,method="msn")


# variable "see" for "mal" matches col B (when squared and scaled)
# other columns don't match exactly as Stage and Crookston used different
# software to compute values

errorStats(mal,msn)
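
The returned list can also be stored and inspected piece by piece. The sketch below assumes the yaImpute package is installed and relies only on the structure described in the Value section: a data frame named common plus one data frame per yai object. The variable name es is arbitrary, and the per-object elements are selected without assuming their exact names.

```r
library(yaImpute)
data(TallyLake)

mal <- yai(x = TallyLake[, 9:29], y = TallyLake[, 1:8],
           noTrgs = TRUE, method = "mahalanobis")
msn <- yai(x = TallyLake[, 9:29], y = TallyLake[, 1:8],
           noTrgs = TRUE, method = "msn")

es <- errorStats(mal, msn)

# List the data frames in the result
names(es)

# Statistics shared by all objects; rows are the Y-variables
es$common

# Everything except "common": one data frame per yai object
perObject <- es[setdiff(names(es), "common")]
str(perObject, max.level = 1)
```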