An algorithm to identify whether data were generated from a factor or network model using factor and network loadings. The algorithm uses heuristics based on theory and simulation. These heuristics were then submitted to several deep learning neural networks with 240,000 samples per model with varying parameters.
LCT(
  data,
  n = NULL,
  corr = c("auto", "cor_auto", "pearson", "spearman"),
  na.data = c("pairwise", "listwise"),
  model = c("BGGM", "glasso", "TMFG"),
  algorithm = c("leiden", "louvain", "walktrap"),
  uni.method = c("expand", "LE", "louvain"),
  iter = 100,
  seed = NULL,
  verbose = TRUE,
  ...
)
Returns a list containing:
Prediction of model based on empirical dataset only
Prediction of model based on means of the loadings across the bootstrap replicate samples
Proportions of models suggested across bootstraps
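The third element can be read as a simple tally over replicates. A minimal sketch of that idea in base R, assuming a made-up vector of per-replicate predictions; the labels and values below are hypothetical and are not real LCT output:

```r
# Hypothetical per-replicate predictions (illustrative only, not LCT output)
predictions <- c("Factor", "Factor", "Network", "Factor", "Network",
                 "Factor", "Factor", "Factor", "Network", "Factor")

# Share of replicates that suggested each model
proportions <- prop.table(table(predictions))
proportions
```

How LCT builds this element internally is not shown here; the sketch only illustrates what "proportions of models suggested across bootstraps" means.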
data: Matrix or data frame. Should consist only of variables to be used in the analysis. Can be raw data or a correlation matrix.
n: Numeric (length = 1). Sample size if the data provided is a correlation matrix.
corr: Character (length = 1). Method to compute correlations. Defaults to "auto". Available options:
"auto" --- Automatically computes appropriate correlations for the data using Pearson's for continuous, polychoric for ordinal, tetrachoric for binary, and polyserial/biserial for ordinal/binary with continuous. To change the number of categories that are considered ordinal, use ordinal.categories (see polychoric.matrix for more details)
"cor_auto" --- Uses cor_auto to compute correlations. Arguments can be passed along to the function
"pearson" --- Pearson's correlation is computed for all variables regardless of categories
"spearman" --- Spearman's rank-order correlation is computed for all variables regardless of categories
For other similarity measures, compute them first and input them into data with the sample size (n).
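The last point, supplying a precomputed similarity matrix together with n, might look like the sketch below. It assumes simulated data and uses Kendall's tau as the example measure; the object names are illustrative, and the LCT() call is wrapped in if (FALSE) so it is not run:

```r
# Simulated five-category responses, purely illustrative
set.seed(42)
x <- matrix(sample(1:5, 300 * 5, replace = TRUE), nrow = 300, ncol = 5)
colnames(x) <- paste0("item", 1:5)

# A similarity measure that is not among the listed corr options
r_kendall <- cor(x, method = "kendall")

# A correlation matrix no longer carries the sample size, so keep it
n_cases <- nrow(x)

# Not run: pass the matrix as data and the sample size as n
if (FALSE) {
  EGAnet::LCT(data = r_kendall, n = n_cases)
}
```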
na.data: Character (length = 1). How should missing data be handled? Defaults to "pairwise". Available options:
"pairwise" --- Computes correlation for all available cases between two variables
"listwise" --- Computes correlation for all complete cases in the dataset
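The difference between the two options can be seen with base R's cor(). This is a sketch on a small made-up data set, not LCT's internal code:

```r
# Small illustrative data set with missing values
x <- data.frame(
  a = c(1, 2, 3, 4, 5, NA),
  b = c(2, 1, 4, 3, NA, 6),
  c = c(1, 3, 2, 5, 4, 6)
)

# Pairwise: each correlation uses every case observed on both variables
r_pairwise <- cor(x, use = "pairwise.complete.obs")

# Listwise: drop any case with a missing value, then correlate
r_listwise <- cor(x, use = "complete.obs")

# Number of cases behind each pairwise correlation
observed <- !is.na(as.matrix(x))
storage.mode(observed) <- "numeric"
n_pairwise <- crossprod(observed)

# Number of cases behind every listwise correlation
n_listwise <- sum(complete.cases(x))
```

Pairwise deletion keeps more cases per correlation, but each entry of the matrix can rest on a different subset of rows; listwise deletion uses one common subset for all entries.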
model: Character (length = 1). Defaults to "glasso". Available options:
"BGGM" --- Computes the Bayesian Gaussian Graphical Model. Set argument ordinal.categories to determine levels allowed for a variable to be considered ordinal. See ?BGGM::estimate for more details
"glasso" --- Computes the GLASSO with EBIC model selection. See EBICglasso.qgraph for more details
"TMFG" --- Computes the TMFG method. See TMFG for more details
algorithm: Character or igraph cluster_* function (length = 1). Defaults to "walktrap". Three options are listed below but all are available (see community.detection for other options):
"leiden" --- See cluster_leiden for more details
"louvain" --- By default, "louvain" will implement the Louvain algorithm using the consensus clustering method (see community.consensus for more information). This function will implement consensus.method = "most_common" and consensus.iter = 1000 unless specified otherwise
"walktrap" --- See cluster_walktrap for more details
uni.method: Character (length = 1). What unidimensionality method should be used? Defaults to "louvain". Available options:
"expand" --- Expands the correlation matrix with four variables correlated 0.50. If the number of dimensions returned in this check is 2 or fewer, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This method was used in Golino et al.'s (2020) Psychological Methods simulation
"LE" --- Applies the Leading Eigenvector algorithm (cluster_leading_eigen) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvector solution is used; otherwise, regular EGA is used. This method was used in Christensen et al.'s (2023) Behavior Research Methods simulation
"louvain" --- Applies the Louvain algorithm (cluster_louvain) on the empirical correlation matrix. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in Christensen's (2022) PsyArXiv simulation. Consensus clustering can be used by specifying either "consensus.method" or "consensus.iter"
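The matrix expansion described under "expand" can be sketched in base R. The helper name expand_corr is hypothetical and not part of EGAnet, and the sketch assumes the four added variables are uncorrelated with the original variables, which the text above does not state:

```r
# Hypothetical helper (not part of EGAnet): append variables that correlate
# r_new with one another and, by assumption, 0 with the original variables
expand_corr <- function(R, n_new = 4, r_new = 0.50) {
  p <- ncol(R)
  expanded <- diag(p + n_new)               # start from an identity matrix
  expanded[seq_len(p), seq_len(p)] <- R     # original correlations, top-left
  new_idx <- p + seq_len(n_new)
  expanded[new_idx, new_idx] <- r_new       # new block, all entries r_new
  diag(expanded) <- 1                       # restore unit diagonal
  expanded
}

# Illustrative 3-variable correlation matrix
R <- matrix(c(1.0, 0.4, 0.3,
              0.4, 1.0, 0.5,
              0.3, 0.5, 1.0), nrow = 3, byrow = TRUE)

R_expanded <- expand_corr(R)
dim(R_expanded)
```

The added block forms one known dimension of its own, which is why a result of 2 or fewer dimensions on the expanded matrix is read as the original variables being unidimensional.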
iter: Numeric (length = 1). Number of replicate samples to be drawn from a multivariate normal distribution (uses MASS::mvrnorm). Defaults to 100 (recommended).
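The sampling step named here can be illustrated on its own. A minimal sketch, assuming a made-up correlation matrix and sample size; it shows only how replicate data sets could be drawn with MASS::mvrnorm, not what LCT does with them afterwards:

```r
library(MASS)  # provides mvrnorm()

# Illustrative correlation matrix and sample size (assumed values)
R <- matrix(c(1.0, 0.5, 0.2,
              0.5, 1.0, 0.3,
              0.2, 0.3, 1.0), nrow = 3, byrow = TRUE)
n_cases <- 250
iter <- 5  # kept small for the example; the documented default is 100

set.seed(123)
# Draw iter replicate data sets from a multivariate normal with covariance R
replicates <- lapply(seq_len(iter), function(i) {
  mvrnorm(n = n_cases, mu = rep(0, ncol(R)), Sigma = R)
})

# Rows and columns of each replicate
sapply(replicates, dim)
```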
seed: Numeric (length = 1). Defaults to NULL or random results. Set for reproducible results. See Reproducibility and PRNG for more details on random number generation in EGAnet.
verbose: Boolean (length = 1). Should progress be displayed? Defaults to TRUE. Set to FALSE to not display progress.
...: Additional arguments that can be passed on to auto.correlate, network.estimation, community.detection, community.consensus, and EGA.
Hudson F. Golino <hfg9s at virginia.edu> and Alexander P. Christensen <alexpaulchristensen at gmail.com>
Model training and validation
Christensen, A. P., & Golino, H. (2021). Factor or network model? Predictions from neural networks. Journal of Behavioral Data Science, 1(1), 85-126.
# Get data
data <- psych::bfi[, 1:25]

# Compute LCT (not run)
if (FALSE) {
  ## Factor model
  LCT(data)
}