hierEGA: Hierarchical `EGA`

Description

Estimates EGA using the lower-order solution of cluster_louvain to identify the lower-order dimensions. Factor or network loadings are then used to compute factor or network scores, and these scores are used to estimate the higher-order dimensions.

Usage

hierEGA(
  data,
  scores = c("factor", "network"),
  consensus.iter = 1000,
  consensus.method = c("highest_modularity", "most_common", "iterative", "lowest_tefi"),
  uni.method = c("expand", "LE", "louvain"),
  corr = c("cor_auto", "pearson", "spearman"),
  model = c("glasso", "TMFG"),
  model.args = list(),
  algorithm = c("walktrap", "leiden", "louvain"),
  algorithm.args = list(),
  plot.EGA = TRUE,
  plot.args = list()
)

Value

Returns a list of lists containing:

Main Results

hierarchical

The main results list containing:

lower_order Lower order EGA results for the selected methods
higher_order Higher order EGA results for the selected methods

If plot.EGA = TRUE, then:
lower_plot Plot of the lower order results
higher_plot Plot of the higher order results
hier_plot Plot of the lower and higher order results together, side-by-side

Secondary Results

lower_ega

A list containing the lower order EGA results. The $wc element does not contain valid results; do not use its output.

lower_wc

A list containing consensus clustering results:

highest_modularity Community memberships based on the highest modularity across the cluster_louvain applications
most_common Community memberships based on the most commonly found memberships across the cluster_louvain applications
iterative Community memberships based on consensus clustering described by Lancichinetti & Fortunato (2012)
lowest_tefi Community memberships based on the lowest tefi across the cluster_louvain applications
summary_table A data frame summarizing the unique community solutions across the iterations. Each row is a unique community solution. The columns give the number of dimensions (N_Dimensions), the proportion of times each community solution was identified (Proportion), the modularity of each community solution (Modularity), the total entropy fit index of each community solution (tefi), and the memberships for each item
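
As a rough illustration of how three of the rules above relate to summary_table, the sketch below applies them to a small made-up table. The values are hypothetical and the table omits the per-item membership columns; it is not output from hierEGA.

```r
# Hypothetical summary table: one row per unique community solution
summary_table <- data.frame(
  N_Dimensions = c(3, 4, 3),
  Proportion   = c(0.62, 0.30, 0.08),
  Modularity   = c(0.41, 0.44, 0.39),
  tefi         = c(-12.1, -11.4, -12.6)
)

# Row index of the solution that each rule would pick
selected <- c(
  highest_modularity = which.max(summary_table$Modularity),
  most_common        = which.max(summary_table$Proportion),
  lowest_tefi        = which.min(summary_table$tefi)
)

print(selected)
```

In this made-up table each rule picks a different row, which is why the choice of consensus.method can matter.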

factor_results

A list containing higher order results based on factor scores. A list for each consensus.method is provided with their EGA results

network_results

A list containing higher order results based on network scores. A list for each consensus.method is provided with their EGA results

Arguments

data

Matrix or data frame. Variables (down columns) only. Does not accept correlation matrices

scores

Character. How should scores for the higher-order structure be estimated? Defaults to "network" for network scores computed using the net.scores function. Set to "factor" for factor scores computed using fa. Factors are assumed to be correlated using the "oblimin" rotation. NOTE: Factor scores use the number of communities from EGA. The estimated factors may not align with these communities. Plots using factor scores will have higher order factors that may not completely map onto the lower order communities. Look at $hierarchical$higher_order$lower_loadings to determine the composition of the lower order factors.

By default, both factor and network scores are computed and stored in the output. Only the selected option appears in the main output ($hierarchical)

consensus.iter

Numeric. Number of iterations to perform in consensus clustering (see Lancichinetti & Fortunato, 2012). Defaults to 1000

consensus.method

Character. What consensus clustering method should be used? Defaults to "highest_modularity". Current options are:

highest_modularity Uses the community solution that achieves the highest modularity across iterations
most_common Uses the community solution that is found the most across iterations
iterative Identifies the most common community solutions across iterations and determines how often nodes appear in the same community together. A threshold of 0.30 is used to set low proportions to zero. This process repeats iteratively until all nodes have a proportion of 1 in the community solution.
lowest_tefi Uses the community solution that achieves the lowest tefi across iterations

By default, all consensus.method options are computed and stored in the output. The selected method will be used to plot and appear in the main output ($hierarchical)
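
The co-occurrence step of the "iterative" option can be sketched in base R. This is a minimal illustration on hypothetical memberships (five iterations, four nodes). It shows only one pass of counting and thresholding at 0.30; the repeated re-clustering is omitted, and this is not the package's internal code.

```r
# Hypothetical memberships: rows are iterations, columns are nodes
memberships <- rbind(
  c(1, 1, 2, 2),
  c(1, 1, 2, 2),
  c(1, 1, 1, 2),
  c(1, 2, 2, 2),
  c(1, 1, 2, 2)
)
colnames(memberships) <- paste0("V", 1:4)

n_nodes <- ncol(memberships)
co_occurrence <- matrix(
  0, n_nodes, n_nodes,
  dimnames = list(colnames(memberships), colnames(memberships))
)

# Count how often each pair of nodes lands in the same community
for (i in seq_len(nrow(memberships))) {
  co_occurrence <- co_occurrence +
    outer(memberships[i, ], memberships[i, ], "==")
}

# Convert counts to proportions across iterations
co_occurrence <- co_occurrence / nrow(memberships)

# Set low proportions (below 0.30) to zero
co_occurrence[co_occurrence < 0.30] <- 0

print(co_occurrence)
```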

uni.method

Character. What unidimensionality method should be used? Defaults to "LE". Current options are:

expand Expands the correlation matrix with four variables correlated .50. If the number of dimensions returned in this check is 2 or fewer, then the data are unidimensional; otherwise, regular EGA with no matrix expansion is used. This is the method used in the Golino et al. (2020) Psychological Methods simulation.
LE Applies the Leading Eigenvalue algorithm (cluster_leading_eigen) on the empirical correlation matrix. If the number of dimensions is 1, then the Leading Eigenvalue solution is used; otherwise, regular EGA is used. This is the final method used in the Christensen, Garrido, and Golino (2021) simulation.
louvain Applies the Louvain algorithm (cluster_louvain) on the empirical correlation matrix using a resolution parameter = 0.95. If the number of dimensions is 1, then the Louvain solution is used; otherwise, regular EGA is used. This method was validated in the Christensen (2022) simulation.
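
To make the "expand" idea concrete, the sketch below builds an expanded correlation matrix in base R. It assumes the four added variables correlate .50 with each other and 0 with the original variables; the second part is an assumption of this sketch and is not stated above. The 3-variable matrix is hypothetical.

```r
# Hypothetical 3-variable correlation matrix
R <- matrix(
  c(1.0, 0.4, 0.3,
    0.4, 1.0, 0.5,
    0.3, 0.5, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("V", 1:3), paste0("V", 1:3))
)

n_orig <- ncol(R)
n_new <- 4
new_idx <- n_orig + seq_len(n_new)

# Start from zeros so added and original variables are uncorrelated (assumption)
expanded <- matrix(0, n_orig + n_new, n_orig + n_new)

# Original correlations go in the upper-left block
expanded[seq_len(n_orig), seq_len(n_orig)] <- R

# Added variables correlate .50 with each other
expanded[new_idx, new_idx] <- 0.50
diag(expanded) <- 1

var_names <- c(colnames(R), paste0("E", seq_len(n_new)))
dimnames(expanded) <- list(var_names, var_names)

print(expanded)
```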

corr

Type of correlation matrix to compute. The default uses cor_auto. Current options are:

cor_auto Computes the correlation matrix using the cor_auto function from qgraph.
pearson Computes Pearson's correlation coefficient using the pairwise complete observations via the cor function.
spearman Computes Spearman's correlation coefficient using the pairwise complete observations via the cor function.
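
For reference, the "pearson" and "spearman" options correspond to base R calls along these lines. The data are simulated for illustration, with two values set to missing so that pairwise deletion has an effect.

```r
# Simulated data with two missing values (illustration only)
set.seed(1)
items <- data.frame(x = rnorm(50), y = rnorm(50), z = rnorm(50))
items$y[c(3, 10)] <- NA

# Pearson correlations using pairwise complete observations
pearson <- cor(items, use = "pairwise.complete.obs", method = "pearson")

# Spearman correlations using pairwise complete observations
spearman <- cor(items, use = "pairwise.complete.obs", method = "spearman")

print(round(pearson, 2))
print(round(spearman, 2))
```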

model

Character. A string indicating the method to use. Defaults to "glasso". Current options are:

glasso Estimates the Gaussian graphical model using graphical LASSO with extended Bayesian information criterion to select optimal regularization parameter
TMFG Estimates a Triangulated Maximally Filtered Graph

model.args

List. A list of additional arguments for EBICglasso.qgraph or TMFG

algorithm

A string indicating the algorithm to use or a function from igraph. Defaults to "walktrap". Current options are:

walktrap Computes the Walktrap algorithm using cluster_walktrap
leiden Computes the Leiden algorithm using cluster_leiden. Defaults to objective_function = "modularity"
louvain Computes the Louvain algorithm using cluster_louvain

algorithm.args

List. A list of additional arguments for cluster_walktrap, cluster_louvain, or some other community detection algorithm function (see examples)

plot.EGA

Boolean. If TRUE, returns a plot of the network and its estimated dimensions. Defaults to TRUE

plot.args

List. A list of additional arguments for the network plot. See ggnet2 for full list of arguments:

vsize Size of the nodes. Defaults to 6.
label.size Size of the labels. Defaults to 5.
alpha The level of transparency of the nodes, which might be a single value or a vector of values. Defaults to 0.7.
edge.alpha The level of transparency of the edges, which might be a single value or a vector of values. Defaults to 0.4.
legend.names A vector with names for each dimension
color.palette The color palette for the nodes. For custom colors, enter HEX codes for each dimension in a vector. See color_palette_EGA for more details and examples
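
A plot.args list might be assembled as in the sketch below. The numeric values are the defaults listed above; the legend.names labels are hypothetical placeholders.

```r
# Plot arguments using the documented default values
plot_args <- list(
  vsize = 6,          # node size
  label.size = 5,     # label size
  alpha = 0.7,        # node transparency
  edge.alpha = 0.4,   # edge transparency
  legend.names = c("Dimension A", "Dimension B")  # hypothetical labels
)

str(plot_args)
```

The list would then be supplied as plot.args = plot_args in the hierEGA call.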

Author

Marcos Jimenez <marcosjnezhquez@gmail.com>, Francisco J. Abad <fjose.abad@uam.es>, Eduardo Garcia-Garzon <egarcia@ucjc.edu>, Hudson Golino <hfg9s@virginia.edu>, Alexander P. Christensen <alexpaulchristensen@gmail.com>, and Luis Eduardo Garrido <luisgarrido@pucmm.edu.do>

References

Lancichinetti, A., & Fortunato, S. (2012). Consensus clustering in complex networks. Scientific Reports, 2(1), 1-7.

Examples


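# Load EGAnet, which provides hierEGA and the optimism data
library(EGAnet)

# Obtain example data
data <- optimism

# Not run by default
if (FALSE) {
# hierEGA example: Louvain for the higher-order community detection
opt.hier <- hierEGA(
  data = data,
  algorithm = "louvain"
)
}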
