Treats each observed value as missing and imputes it from the imputation model of the amelia output.
overimpute(output, var, draws = 20, subset, legend = TRUE, xlab, ylab, main,
frontend = FALSE, …)
output from the function amelia.
column number or variable name of the variable to overimpute.
the number of draws per imputed dataset used to generate overimputations. The total number of simulations will be m * draws, where m is the number of imputations.
an optional vector specifying a subset of observations to be used in the overimputation.
a logical value indicating if a legend should be plotted.
the label for the x-axis. The default is "Observed Values."
the label for the y-axis. The default is "Imputed Values."
main title of the plot. The default is to smartly title the plot using the variable name.
a logical value used internally for the Amelia GUI.
further graphical parameters for the plot.
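A minimal usage sketch of the arguments above follows. It assumes the Amelia package is installed and uses the africa example data that ships with it. The model settings passed to amelia (m = 5, ts = "year", cs = "country") and the choice of "trade" as the variable are illustrative assumptions, not recommendations.

```r
# Illustrative sketch only; the block is skipped if Amelia is not installed.
if (requireNamespace("Amelia", quietly = TRUE)) {
  library(Amelia)
  data(africa)  # example panel data distributed with Amelia

  m <- 5        # number of imputed datasets (assumed for this example)
  a.out <- amelia(africa, m = m, ts = "year", cs = "country", p2s = 0)

  # With draws = 20, each observed value of "trade" is overimputed
  # m * draws = 5 * 20 = 100 times in total.
  res <- overimpute(a.out, var = "trade", draws = 20)

  # Top-level components of the returned list.
  str(res, max.level = 1)
}
```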
A list that contains (1) the row in the original data (row),
(2) the observed value of that observation (orig),
(3) the mean of the overimputations (mean.overimputed),
(4) the lower bound of the 95% confidence interval of the overimputations (lower.overimputed),
(5) the upper bound of the 95% confidence interval of the overimputations (upper.overimputed),
(6) the fraction of the variables that were missing for that observation in the original data (prcntmiss),
and (7) a matrix of the raw overimputations, with observations in rows and the different draws in columns (overimps).
This function temporarily treats each observed value in var as missing and
imputes that value based on the imputation model of output. The dots are the
mean imputations and the vertical lines are the 90% confidence intervals for
the imputations of each observed value. The diagonal line is the \(y=x\)
line. If all of the imputations were perfect, the points would all fall on
this line. A good imputation model would have about 90% of the confidence
intervals containing the truth; that is, about 90% of the vertical lines
should cross the diagonal.
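The 90% rule of thumb can also be checked numerically instead of by eye. The sketch below uses only base R and builds a simulated stand-in for the returned list: the field names follow the return value documented above, but the data are made up, and the 5% and 95% quantiles are an assumption chosen to produce 90% intervals.

```r
# Simulated stand-in for the list returned by overimpute(); hypothetical data.
set.seed(1)
n     <- 500   # number of observed values being overimputed
draws <- 200   # overimputations per observed value

pred <- rnorm(n)          # what a well-specified model would predict
orig <- pred + rnorm(n)   # observed value = prediction + noise

# Row i holds the draws for observation i (matrix is filled column-wise,
# so the mean vector recycles once per column).
overimps <- matrix(rnorm(n * draws, mean = pred, sd = 1), nrow = n)

res <- list(
  row               = seq_len(n),
  orig              = orig,
  mean.overimputed  = rowMeans(overimps),
  lower.overimputed = apply(overimps, 1, quantile, probs = 0.05),
  upper.overimputed = apply(overimps, 1, quantile, probs = 0.95),
  overimps          = overimps
)

# Share of intervals containing the observed value, i.e. the share of
# vertical lines that would cross the y = x diagonal in the plot.
coverage <- mean(res$orig >= res$lower.overimputed &
                 res$orig <= res$upper.overimputed)
print(coverage)
```

A coverage far below 0.9 would suggest intervals that are too narrow or imputations that are systematically off; the same calculation applies unchanged to a real return value.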
The color of each vertical line shows the fraction of variables that are missing in the pattern of missingness for that observation. The legend codes this information. Imputations will be much tighter when more observed covariates are available to impute that observation.
The subset argument is evaluated in the environment of the data. That is, it
can, but is not required to, refer to variables in the data frame as if it
were attached.
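As a hedged illustration of this evaluation rule, the sketch below restricts the overimputation to later years by naming a data column directly in subset. It assumes the Amelia package is installed and that the africa example data contains a numeric year column; the cutoff of 1980 is arbitrary.

```r
# Illustrative sketch only; the block is skipped if Amelia is not installed.
if (requireNamespace("Amelia", quietly = TRUE)) {
  library(Amelia)
  data(africa)
  a.out <- amelia(africa, m = 5, ts = "year", cs = "country", p2s = 0)

  # "year" is looked up in the data itself, as if the data frame
  # were attached; no africa$year prefix is needed.
  res_sub <- overimpute(a.out, var = "trade", subset = year > 1980)
}
```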
Other imputation diagnostics are compare.density, disperse, and tscsPlot.