This function computes (1) a Pearson product-moment correlation matrix to identify variables related to the incomplete variables (i.e., correlates of incomplete variables), (2) a Cohen's d matrix comparing cases with and without missing values to identify variables related to the probability of missingness (i.e., correlates of missingness), and (3) semi-partial correlations between an outcome variable, conditional on the predictor variables of a substantive model, and a set of candidate auxiliary variables to identify correlates of an incomplete outcome variable, as suggested by Raykov and West (2016).
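The first two ideas can be illustrated with a minimal base R sketch on the airquality data set that ships with R. It does not call misty and is not the package's internal code; the use of pairwise-complete observations and the choice of Wind and Temp as comparison variables are assumptions made only for this illustration.

```r
# Illustration in base R only; this is not misty's internal code.
dat <- airquality

# Correlates of an incomplete variable: correlations with Ozone,
# using pairwise-complete observations (an assumption of this sketch).
r <- cor(dat, use = "pairwise.complete.obs")
round(r["Ozone", ], 2)

# Correlates of missingness: compare other variables between rows where
# Ozone is missing and rows where it is observed.
ozone_missing <- is.na(dat$Ozone)
group_means <- aggregate(dat[, c("Wind", "Temp")],
                         by = list(ozone_missing = ozone_missing),
                         FUN = mean)
group_means
```

Large group differences in the second table suggest that a variable is related to the probability of missingness, which is what the Cohen's d matrix expresses in standardized form.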
na.auxiliary(..., data = NULL, model = NULL, estimator = c("ML", "MLR"),
missing = c("fiml", "two.stage", "robust.two.stage", "doubly.robust"),
tri = c("both", "lower", "upper"), weighted = FALSE, correct = FALSE,
digits = 2, p.digits = 3, as.na = NULL, write = NULL, append = TRUE,
check = TRUE, output = TRUE)
Returns an object of class misty.object, which is a list with the following entries:
call | function call |
type | type of analysis |
data | data frame used for the current analysis |
model | lavaan model syntax for estimating the semi-partial correlations |
model.fit | fitted lavaan model for estimating the semi-partial correlations |
args | specification of function arguments |
result | list with result tables |
a matrix or data frame with incomplete data, where missing values are coded as NA. Alternatively, an expression indicating the variable names in data, e.g., na.auxiliary(x1, x2, x3, data = dat). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.
a data frame when specifying one or more variables in the argument '...'. Note that the argument is NULL when specifying a matrix or data frame for the argument '...'.
a character string specifying the substantive model predicting a continuous outcome variable using a set of predictor variables to estimate semi-partial correlations between the outcome variable and a set of candidate auxiliary variables. The default setting is model = NULL, i.e., the function computes only the Pearson product-moment correlation matrix and the Cohen's d matrix.
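As a rough illustration of what a semi-partial correlation with a candidate auxiliary variable means, the following base R sketch residualizes the outcome on the predictors and correlates the residuals with each auxiliary variable. It is a complete-case approximation written for this page, not the package's procedure: the function itself fits a lavaan model with a missing-data method, so its estimates will generally differ. The model and the auxiliary variables are taken from the airquality examples.

```r
# Complete-case sketch in base R; misty instead fits the model with lavaan
# and a missing-data method such as FIML, so results will generally differ.
dat <- airquality

# Substantive model: Ozone predicted by Solar.R and Wind.
# na.exclude keeps the residuals aligned with the rows of 'dat'.
fit <- lm(Ozone ~ Solar.R + Wind, data = dat, na.action = na.exclude)
ozone_resid <- residuals(fit)

# Semi-partial correlation: residualized outcome with each raw
# candidate auxiliary variable.
aux <- c("Temp", "Month", "Day")
sp_cor <- sapply(dat[aux], function(a) cor(ozone_resid, a, use = "complete.obs"))
round(sp_cor, 2)
```

An auxiliary variable that still correlates with the outcome after the predictors are partialled out carries information about the outcome that the substantive model does not.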
a character string indicating the estimator to be used when estimating semi-partial correlation coefficients, i.e., "ML" for maximum likelihood parameter estimates with conventional standard errors or "MLR" (default) for maximum likelihood parameter estimates with Huber-White robust standard errors.
a character string indicating how to deal with missing data when estimating semi-partial correlation coefficients, i.e., "fiml" for the full information maximum likelihood method, "two.stage" for the two-stage maximum likelihood method, "robust.two.stage" for the robust two-stage maximum likelihood method, and "doubly.robust" for the doubly-robust method (see 'Details' in the item.cfa function). The default setting is missing = "fiml".
a character string indicating which triangle of the correlation matrix to show on the console, i.e., "both" for the upper and lower triangles, "lower" (default) for the lower triangle, and "upper" for the upper triangle.
logical: if TRUE, the weighted pooled standard deviation is used when computing Cohen's d. The default setting shown in the usage is weighted = FALSE.
logical: if TRUE, a correction factor for Cohen's d is used to remove positive bias in small samples.
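The effect of these two options on Cohen's d can be sketched with a small base R helper. The function cohens_d_sketch is hypothetical and not part of misty; it uses common textbook forms (sample-size-weighted versus simple average of the group variances, and the small-sample factor 1 - 3 / (4N - 9)), and whether the package uses exactly these expressions is an assumption.

```r
# Hypothetical helper, not part of misty: textbook Cohen's d with two
# switches that mimic the weighted and correct options described above.
cohens_d_sketch <- function(x1, x0, weighted = FALSE, correct = FALSE) {
  n1 <- length(x1)
  n0 <- length(x0)
  sd_pooled <- if (weighted) {
    # pooled SD weighted by the group degrees of freedom
    sqrt(((n1 - 1) * var(x1) + (n0 - 1) * var(x0)) / (n1 + n0 - 2))
  } else {
    # unweighted average of the two group variances
    sqrt((var(x1) + var(x0)) / 2)
  }
  d <- (mean(x1) - mean(x0)) / sd_pooled
  # assumed small-sample correction factor
  if (correct) d <- d * (1 - 3 / (4 * (n1 + n0) - 9))
  d
}

# Wind compared between rows with missing and observed Ozone
miss <- is.na(airquality$Ozone)
d_plain <- cohens_d_sketch(airquality$Wind[miss], airquality$Wind[!miss])
d_corr  <- cohens_d_sketch(airquality$Wind[miss], airquality$Wind[!miss],
                           correct = TRUE)
c(d_plain = d_plain, d_corr = d_corr)
```

With equal group sizes the weighted and unweighted pooled standard deviations coincide, so the weighting only matters when the missing and observed groups differ in size, as they typically do.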
an integer value indicating the number of decimal places to be used for displaying correlation coefficients and Cohen's d estimates.
an integer value indicating the number of decimal places to be used for displaying the p-value.
a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis.
a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or an Excel file with file extension ".xlsx" (e.g., "Output.xlsx"). If the file name does not contain any file extension, an Excel file will be written.
logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write; if FALSE, the existing text file will be overwritten.
logical: if TRUE (default), argument specification is checked.
logical: if TRUE (default), output is shown on the console.
Takuya Yanagida takuya.yanagida@univie.ac.at
Note that non-numeric variables (i.e., factors, character vectors, and logical vectors) are excluded from the analysis.
Enders, C. K. (2010). Applied missing data analysis. Guilford Press.
Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549-576. https://doi.org/10.1146/annurev.psych.58.110405.085530
Raykov, T., & West, B. T. (2016). On enhancing plausibility of the missing at random assumption in incomplete data analyses via evaluation of response-auxiliary variable correlations. Structural Equation Modeling, 23(1), 45–53. https://doi.org/10.1080/10705511.2014.937848
van Buuren, S. (2018). Flexible imputation of missing data (2nd ed.). Chapman & Hall.
as.na, na.as, na.coverage, na.descript, na.indicator, na.pattern, na.prop, na.test
# Example 1a: Auxiliary variables
na.auxiliary(airquality)
# Example 1b: Alternative specification using the 'data' argument
na.auxiliary(., data = airquality)
# Example 2a: Semi-partial correlation coefficients
na.auxiliary(airquality, model = "Ozone ~ Solar.R + Wind")
# Example 2b: Alternative specification using the 'data' argument
na.auxiliary(Temp, Month, Day, data = airquality, model = "Ozone ~ Solar.R + Wind")
if (FALSE) {
# Example 3: Write Results into a text file
na.auxiliary(airquality, write = "NA_Auxiliary.txt")
}
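Because the function returns a list, the results can also be stored and inspected instead of only printed. The following sketch assumes the misty package is installed and relies only on the arguments and list entries documented above; the exact structure of the result entry is not described here, so it is merely displayed.

```r
# Assumes the misty package is installed; skipped otherwise.
if (requireNamespace("misty", quietly = TRUE)) {
  # Suppress console output and keep the returned object
  res <- misty::na.auxiliary(airquality, output = FALSE)

  class(res)                      # documented as "misty.object"
  names(res)                      # documented entries include call, args, result
  str(res$result, max.level = 1)  # overview of the result tables
}
```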