MixAll (version 1.5.10)

learnMixedData: Learn the optimal mixture model when the class labels are known

Description

This function learns the optimal mixture model when the class labels are known. The best model is selected, according to criterion, from the list of models given in models.

Usage

learnMixedData(
  data,
  models,
  labels,
  prop = NULL,
  algo = "impute",
  nbIter = 100,
  epsilon = 1e-08,
  criterion = "ICL",
  nbCore = 1
)

Value

An instance of the [ClusterMixedDataModel] class.

Arguments

data

[list] containing the data sets (matrices and/or data.frames). If data sets contain NA values, these missing values will be estimated during the estimation process.

models

either a [vector] of character or a [list] of the same length as data. If models is a vector, it contains the model names to use to fit each data set. If models is a list, it must be of the form models = list(modelName, dim, kernelName, modelParameters). Only modelName is required.

labels

vector or factor giving the class label of each observation.

prop

[vector] with the proportions of each class. If NULL the proportions will be estimated using the labels.

algo

character defining the algorithm used to learn the model. Possible values: "simul", "impute" (faster but can produce biased results). The Usage signature sets the default to "impute".

nbIter

integer giving the number of iterations to perform. If algo is "impute", this is the maximum number of iterations allowed. Default is 100.

epsilon

real giving the variation of the log-likelihood for stopping the iterations. Not used if algo is "simul". Default value is 1e-08.

criterion

character defining the criterion to select the best model. The best model is the one with the lowest criterion value. Possible values: "BIC", "AIC", "ICL", "ML". Default is "ICL".

nbCore

integer defining the number of processors to use (default is 1, 0 for all).
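
Two of the argument descriptions above can be illustrated with base R alone. The sketch below is not MixAll's internal code. It assumes that "estimated using the labels" means the empirical class frequencies, and it uses made-up label values, model names and criterion values purely for illustration.

```r
# Hypothetical class labels (not taken from the package data sets).
labels <- factor(c("a", "b", "a", "c", "a", "b"))

# Empirical class proportions: the natural estimate one would expect
# when prop = NULL (an assumption, not confirmed from the package source).
prop <- prop.table(table(labels))
print(prop)

# Made-up criterion values for three hypothetical candidate models.
crit <- c(modelA = 1520.3, modelB = 1498.7, modelC = 1510.1)

# The best model is the one with the lowest criterion value.
best <- names(which.min(crit))
print(best)
```

Passing an explicit prop vector would bypass the first step. The second step mirrors the selection rule stated for criterion, whichever of "BIC", "AIC", "ICL" or "ML" is chosen.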

Author

Serge Iovleff

Examples

## A quantitative example with the heart disease data set
data(HeartDisease.cat)
data(HeartDisease.cont)
## with default values
## note: this example calls clusterMixedData(), not learnMixedData()
ldata <- list(HeartDisease.cat, HeartDisease.cont)
models <- c("categorical_pk_pjk", "gaussian_pk_sjk")
model <- clusterMixedData(ldata, models, nbCluster = 2:5, strategy = clusterFastStrategy())

## get summary
summary(model)

## get estimated missing values
missingValues(model)

if (FALSE) {
## print model
print(model)
## use graphics functions
plot(model)
}
