mmas: Subtest construction using the Max-Min-Ant-System

Description

Construct subtests from a given pool of items using the classical Max-Min Ant-System (Stützle, 1998). Allows for multiple constructs, occasions, and groups.

Usage

mmas(
  data,
  factor.structure,
  capacity = NULL,
  item.weights = NULL,
  item.invariance = "congeneric",
  repeated.measures = NULL,
  long.invariance = "strict",
  mtmm = NULL,
  mtmm.invariance = "configural",
  grouping = NULL,
  group.invariance = "strict",
  comparisons = NULL,
  auxiliary = NULL,
  use.order = FALSE,
  software = "lavaan",
  cores = NULL,
  objective = NULL,
  ignore.errors = FALSE,
  burnin = 5,
  ants = 16,
  colonies = 256,
  evaporation = 0.95,
  alpha = 1,
  beta = 1,
  pheromones = NULL,
  heuristics = NULL,
  deposit = "ib",
  localization = "nodes",
  pbest = 0.005,
  tolerance = 0.5,
  schedule = "run",
  analysis.options = NULL,
  suppress.model = FALSE,
  seed = NULL,
  filename = NULL
)

Value

Returns an object of class stuartOutput, for which specific summary and plot methods are available. The object is a list with the following elements:

call: The called function.
software: The software used to fit the CFA models.
parameters: A list of the ACO parameters used.
analysis.options: A list of the additional arguments passed to the estimation software.
timer: An object of the class proc_time which contains the time used for the analysis.
log: A data.frame containing the optimization history.
log_mat: A list of matrices (e.g. lvcor) relevant to the estimation history, if any.
solution: A list of matrices with the choices made in the global-best solution.
pheromones: A list of matrices with the pheromones of each choice.
subtests: A list containing the names of the selected items and their respective subtests.
final: The results of the estimation of the global-best solution.

Arguments

data: A data.frame containing all relevant data.
factor.structure: A list linking factors to items. The names of the list elements correspond to the factor names. Each list element must contain a character-vector of item names that are indicators of this factor.
capacity: A list containing the number of items per subtest. This must be in the same order as the factor.structure provided. If a single number, it is applied to all subtests. If NULL all items are evenly distributed among the subtests.
item.weights: A placeholder. Currently all weights are assumed to be one.
item.invariance: A character vector of length 1 or the same length as factor.structure containing the desired invariance levels between items pertaining to the same subtest. Currently there are five options: 'congeneric', 'ess.equivalent', 'ess.parallel', 'equivalent', and 'parallel', the first being the default.
repeated.measures: A list linking factors that are repeated measures of each other. Factors that are repeated measures of one another must be in the same element of the list; other sets of repeated factors go in separate elements. When this is NULL (the default) a cross-sectional model is estimated.
long.invariance: A character vector of length 1 or the same length as repeated.measures containing the longitudinal invariance level of repeated items. Currently there are four options: 'configural', 'weak', 'strong', and 'strict'. Defaults to 'strict'. When repeated.measures=NULL this argument is ignored.
mtmm: A list linking factors that are measurements of the same construct with different methods. Measurements of the same construct must be in the same element of the list; other sets of methods go in separate elements. When this is NULL (the default) a single-method model is estimated.
mtmm.invariance: A character vector of length 1 or the same length as mtmm containing the invariance level of MTMM items. Currently there are five options: 'none', 'configural', 'weak', 'strong', and 'strict'. Defaults to 'configural'. With 'none' differing items are allowed for different methods. When mtmm=NULL this argument is ignored.
grouping: The name of the grouping variable. The grouping variable must be part of the data provided and must be a numeric variable.
group.invariance: A single value describing the assumed invariance of items across groups. Currently there are four options: 'configural', 'weak', 'strong', and 'strict'. Defaults to 'strict'. When grouping=NULL this argument is ignored.
comparisons: A character vector containing any combination of 'item', 'long', 'mtmm', and 'group' indicating which invariance should be assessed via model comparisons. The order of the vector dictates the sequence in which model comparisons are performed. Defaults to NULL meaning that no model comparisons are performed.
auxiliary: The names of auxiliary variables in data. These can be used in additional modeling steps that may be provided in analysis.options$model.
use.order: A logical indicating whether or not to take the selection order of the items into account. Defaults to FALSE.
software: The name of the estimation software. Can currently be 'lavaan' (the default) or 'Mplus'. Each option requires the software to be installed.
cores: The number of cores to be used in parallel processing. If NULL (the default) the result of detectCores will be used. On Unix-like machines parallel processing is implemented via mclapply; on Windows machines it is realized via parLapply.
objective: A function that converts the results of model estimation into a pheromone. See 'details'.
ignore.errors: A logical indicating whether or not to ignore estimation problems (such as non positive-definite latent covariance matrices). Defaults to FALSE.
burnin: The number of colonies for which to use a fixed objective function before switching to the empirical objective. Ignored if objective is not of class stuartEmpiricalObjective. Defaults to 5.
ants: The number of ants per colony to be estimated. Can either be a single value or an array with two columns for parameter scheduling. See 'details'.
colonies: The maximum number of colonies estimated since finding the latest global-best solution before aborting the process. Can either be a single value or an array with two columns for parameter scheduling. See 'details'.
evaporation: The evaporation coefficient. Can either be a single value or an array with two columns for parameter scheduling. See 'details'.
alpha: The nonlinearity coefficient of the pheromone-trail's contribution to determining selection probabilities. Defaults to 1 (linear). Can either be a single value or an array with two columns for parameter scheduling. See 'details'.
beta: The nonlinearity coefficient of the heuristics' contribution to determining selection probabilities. Defaults to 1 (linear). Can either be a single value or an array with two columns for parameter scheduling. See 'details'.
pheromones: A list of pheromones as created by mmas. This can be used to continue previous runs of this function.
heuristics: An object of the class stuartHeuristic as provided by heuristics which contains heuristic information to be used in determining selection probabilities. If NULL (the default) selection probabilities are determined solely by the pheromones.
deposit: Which deposit rule to use. Can be either 'ib' (the default) for an iteration-best deposit rule, or 'gb' for a global-best deposit rule.
localization: Which localization to use when depositing pheromones. Can be either 'nodes' (the default) for depositing pheromones on selected nodes or 'arcs' for depositing on selection arcs.
pbest: The desired overall probability of constructing the global-best solution when the algorithm converges. Can either be a single value or an array with two columns for parameter scheduling. See 'details'.
tolerance: The tolerance of imprecision when comparing the pheromones to the upper and lower limits. Can either be a single value or an array with two columns for parameter scheduling. See 'details'.
schedule: The counter which the scheduling of parameters pertains to. Can be either 'run' (the default), for a continuous schedule, 'colony', for a schedule that is restarted every time a new global best is found, or 'mixed' for a schedule that restarts its current phase every time a new global best is found. See 'details'.
analysis.options: A list of additional arguments to be passed to the estimation software. The names of list elements must correspond to the arguments changed in the respective estimation software. E.g. analysis.options$model can contain additional modeling commands, such as regressions on auxiliary variables.
suppress.model: A logical indicating whether to suppress the default model generation. If TRUE a model must be provided in analysis.options$model.
seed: A random seed for the generation of random samples. See Random for more details.
filename: The stem of the filenames used to save inputs, outputs, and data files when software='Mplus'. This may include the file path. When NULL (the default) files will be saved to the temporary directory, which is deleted when the R session is ended.
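
As a rough sketch of how the list-valued arguments fit together, the following builds a two-factor factor.structure with a matching capacity list and a per-factor item.invariance vector. The factor and item names (fa, fb, a1, ...) are hypothetical and do not refer to any shipped dataset, and the final check is only a plausibility test written for this illustration, not something mmas is known to perform in this form.

```r
# Hypothetical item pool: two factors with six candidate items each
factor_structure <- list(
  fa = paste0("a", 1:6),
  fb = paste0("b", 1:6)
)

# One capacity per subtest, in the same order as factor_structure
capacity <- list(3, 4)

# One invariance level per factor (length 1 is also allowed)
item_invariance <- c("congeneric", "ess.equivalent")

# Plausibility checks mirroring the constraints described above
stopifnot(
  length(capacity) == length(factor_structure),
  length(item_invariance) %in% c(1, length(factor_structure)),
  all(unlist(capacity) <= lengths(factor_structure))
)
```

These objects would then be passed as factor.structure, capacity, and item.invariance, together with a data.frame that actually contains the named items.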

Author

Martin Schultze

Details

The pheromone function provided via objective is used to assess the quality of the solutions. These functions can contain any combination of the fit indices provided by the estimation software. When using Mplus these fit indices are 'rmsea', 'srmr', 'cfi', 'tli', 'chisq' (with 'df' and 'pvalue'), 'aic', 'bic', and 'abic'. With lavaan any fit index provided by inspect can be used. Additionally 'crel' provides an aggregate of composite reliabilities, 'rel' provides a vector or a list of reliability coefficients for the latent variables, 'con' provides an aggregate consistency estimate for MTMM analyses, and 'lvcor' provides a list of the latent variable correlation matrices. For more detailed objective functions 'lambda', 'theta', 'psi', 'alpha', and 'nu' provide the model-implied matrices. By default a pheromone function using 'crel', 'rmsea', and 'srmr' is used. Please be aware that the objective must be a function with the required fit indices as (correctly named) arguments.
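
A minimal sketch of a custom objective, assuming lavaan as the estimation software (so that 'cfi' and 'rmsea' are available) and assuming that larger return values indicate better solutions. The name my_objective and the weighting are illustrative only; this is not the package default.

```r
# Hypothetical objective: the argument names must match the fit indices
# that should be handed to the function.
my_objective <- function(cfi, rmsea, crel) {
  # Reward model fit and composite reliability, penalize misfit
  (cfi + crel) / 2 - rmsea
}

# The argument names are what link the function to the fit indices
names(formals(my_objective))

# Evaluate by hand for one set of made-up fit values
my_objective(cfi = 0.96, rmsea = 0.04, crel = 0.80)

# Passing it to mmas (requires stuart and lavaan; not run here):
# sel <- mmas(fairplayer, fs, 4, objective = my_objective,
#   seed = 55635, cores = 1)
```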

Using model comparisons via the comparisons argument compares the target model to a model with one less degree of assumed invariance (e.g. if your target model contains strong invariance, the comparison model contains weak invariance). Adding comparisons will change the preset for the objective function to include model differences. With comparisons, a custom objective function (the recommended approach) can also include any model fit index prefixed with 'delta.' to indicate the difference in this index between the two models. If more than one type of comparison is used, the argument name in the objective function should end in the type of comparison requested (e.g. delta.cfi.group to use the difference in CFI from the model comparison of invariance across groups).
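
A hedged sketch of an objective that uses a model difference, assuming a single comparison type (so the plain delta.cfi argument is sufficient). Because the sign convention of the difference is not stated here, the sketch works with its absolute value; the 0.01 cutoff, the penalty weight, and the name delta_objective are arbitrary choices for illustration.

```r
# Hypothetical objective combining fit, reliability, and a CFI difference
delta_objective <- function(crel, rmsea, delta.cfi) {
  base <- crel - rmsea
  # Penalize only CFI differences larger than 0.01 in absolute value
  penalty <- max(0, abs(delta.cfi) - 0.01)
  base - 10 * penalty
}

# Small difference: no penalty is applied
delta_objective(crel = 0.80, rmsea = 0.05, delta.cfi = 0.005)

# Larger difference: the value drops
delta_objective(crel = 0.80, rmsea = 0.05, delta.cfi = -0.03)

# Passing it to mmas (requires stuart and lavaan; not run here):
# sel <- mmas(fairplayer, fs, 4, repeated.measures = repe,
#   long.invariance = 'strong', comparisons = 'long',
#   objective = delta_objective, seed = 55635, cores = 1)
```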

Parameter scheduling is possible for the arguments ants, colonies, evaporation, pbest, alpha, beta, tolerance, and deposit. For all of these, scheduling is done when an array with two columns is provided. The first column of the array contains the timer, i.e. when to switch between parameter settings; the second column contains the values. The argument schedule can be used to select an absolute schedule (schedule='run'), a relative schedule which resets completely after a new global best is found (schedule='colony'), or a mixed version which resets the current phase of the schedule after a new global best is found (schedule='mixed').

For example, consider a parameter schedule for iterations 0, 3, and 10. Using 'run' will result in a change after the third and the tenth iteration, irrespective of whether global-best solutions were found. In contrast, using 'colony' will result in the first setting being used again once a new global best is found. This setting will then be used until iteration 3 (if no new best solution is found) before a switch occurs; whenever a new global best is found, the schedule starts over from the beginning. Using 'mixed' will result in the first setting being used until three consecutive iterations fail to produce a new global best. After this the second setting is used. If a new global best is found, the second setting is kept, but for the purpose of the schedule it is now iteration 3 again, meaning that the third setting will be used later than in a 'run' schedule.
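
To make the 'run' schedule concrete, here is a small sketch that looks up which value is in force at a given iteration. The helper scheduled_value is hypothetical and not part of the package; it assumes that a switch takes effect in the iteration after the timer value, which is one reading of "a change after the third and the tenth iteration".

```r
# Schedule array: timer in column 1, parameter value in column 2
evaporation_schedule <- cbind(c(0, 3, 10), c(0.95, 0.8, 0.5))

# Hypothetical helper (not from stuart): value in force under schedule = 'run'.
# Assumed boundaries: iterations 1-3 use the first setting, 4-10 the second,
# and 11 onward the third.
scheduled_value <- function(schedule, iteration) {
  phase <- findInterval(iteration, schedule[, 1], left.open = TRUE)
  # Iteration 0 falls before the first interval; map it to the first setting
  schedule[pmax(phase, 1), 2]
}

scheduled_value(evaporation_schedule, c(1, 3, 4, 10, 11))
```

The same two-column layout would be used for the other schedulable arguments, for instance ants = cbind(c(0, 10), c(16, 32)).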

References

Stützle, T. (1998). Local search algorithms for combinatorial problems: Analysis, improvements, and new applications. Unpublished doctoral dissertation. Darmstadt: Fachbereich Informatik, Universität Darmstadt.

Examples

# MMAS in a simple situation
# requires lavaan
# number of cores set to 1 in all examples
data(fairplayer)
fs <- list(si = names(fairplayer)[83:92])

# minimal example
sel <- mmas(fairplayer, fs, 4, 
  colonies = 0, ants = 10,  # minimal runtime, remove for application
  seed = 55635, cores = 1)
summary(sel)

# \donttest{
# longitudinal example
data(fairplayer)
fs <- list(si1 = names(fairplayer)[83:92],
  si2 = names(fairplayer)[93:102],
  si3 = names(fairplayer)[103:112])

repe <- list(si = c('si1', 'si2', 'si3'))

# change evaporation rate after 10 and 20 colonies
sel <- mmas(fairplayer, fs, 4, 
  repeated.measures = repe, long.invariance = 'strong',
  evaporation = cbind(c(0, 10, 20), c(.95, .8, .5)),
  seed = 55635, cores = 1)
# }