- model
see Methods section below.
- Dataset
a list of length \(n_{\mathrm{D}}\) of data frames or objects of class Histogram.
Each data frame should have size \(n \times d\) and contain a \(d\)-dimensional dataset. Each of the \(d\)
columns represents one random variable. The number of observations \(n\) equals the number of rows in the dataset.
- Preprocessing
a character giving the preprocessing type. One of "histogram", "kernel density estimation" or "k-nearest neighbour".
- cmax
maximum number of components \(c_{\mathrm{max}} > 0\). The default value is 15.
- cmin
minimum number of components \(c_{\mathrm{min}} > 0\). The default value is 1. If \(c_{\mathrm{min}} > 1\), it may happen that no solution is found,
and an error is returned by the method.
- Criterion
a character giving the information criterion type. One of default Akaike "AIC", "AIC3", "AIC4" or "AICc",
Bayesian "BIC", consistent Akaike "CAIC", Hannan-Quinn "HQC", minimum description length "MDL2" or "MDL5",
approximate weight of evidence "AWE", classification likelihood "CLC",
integrated classification likelihood "ICL" or "ICL-BIC", partition coefficient "PC",
total of positive relative deviations "D" or sum of squares error "SSE".
- pdf
a character vector of length \(d\) containing continuous or discrete parametric family types. One of "normal", "lognormal", "Weibull", "gamma", "Gumbel", "binomial", "Poisson", "Dirac", "uniform" or "vonMises".
- theta1
a vector of length \(d\) containing initial component parameters: \(n_{il} = \textrm{number of categories} - 1\) for the "binomial" distribution.
- theta2
a vector of length \(d\) containing initial component parameters. Currently not used.
- theta3
a vector of length \(d\) containing initial component parameters: \(\xi_{il} \in \{-1, \textrm{NA}, 1\}\) for the "Gumbel" distribution.
- K
a character or a vector or a matrix of size \(n_{\mathrm{D}} \times d\) containing numbers of bins \(v\) or \(v_{1}, \ldots, v_{d}\) for the histogram and the kernel density estimation, or numbers of nearest
neighbours \(k\) for the k-nearest neighbour. There is no genuine rule to identify \(v\) or \(k\). Consequently,
the REBMIX algorithm identifies them from the set K of input values by
minimizing the information criterion. The Sturges rule \(v = 1 + \log_{2}(n)\), the \(\mathrm{Log}_{10}\) rule \(v = 10 \log_{10}(n)\) or the RootN
rule \(v = 2 \sqrt{n}\) can be applied to estimate the limiting numbers of bins,
or the rule of thumb \(k = \sqrt{n}\) to guess the intermediate number of nearest neighbours. If, e.g., K = c(10, 20, 40, 60)
and the minimum IC is attained at, e.g., 40, the brackets are set to 20 and 60, and the golden section is applied to refine the minimum search.
If, e.g., K = matrix(c(10, 15, 18, 5, 7, 9), byrow = TRUE, ncol = 3),
then \(d = 3\) and the list Dataset contains \(n_{\mathrm{D}} = 2\) data frames. Hence, different numbers of bins can be assigned to \(y_{1}, \ldots, y_{d}\).
See also kseq for generating a sequence of bins or nearest neighbours. The default value is "auto".
- ymin
a vector of length \(d\) containing minimum observations. The default value is numeric().
- ymax
a vector of length \(d\) containing maximum observations. The default value is numeric().
- ar
acceleration rate \(0 < a_{\mathrm{r}} \leq 1\). The default value is 0.1 and in most cases does not have to be altered.
- Restraints
a character giving the restraints type. One of "rigid" or default "loose".
The rigid restraints are obsolete and applicable for well separated components only.
- Mode
a character giving the mode type. One of "all", "outliers" or default "outliersplus". If Mode = "all", the modes are determined in decreasing order of magnitude from all observations.
If Mode = "outliers", the modes are determined in decreasing order of magnitude from the outliers only. Meanwhile, some outliers are reclassified as inliers. Finally, when all observations are inliers, the procedure is completed.
If Mode = "outliersplus", the modes are determined in decreasing order of magnitude from the outliers only. Meanwhile, some outliers are reclassified as inliers. Finally, if all observations are inliers, they are converted to outliers and the mode determination procedure is continued.
- EMcontrol
an object of class EM.Control.
- object
see Methods section below.
- ...
currently not used.
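
The rules quoted under K can be evaluated directly when choosing candidate values by hand. The following is a minimal base R sketch, not part of the package: the helper names bin_rules, knn_rule and bracket_minimum are hypothetical, and the information criterion values passed in the last call are made up purely to illustrate how the brackets around the minimum are picked before the golden section refinement.

```r
# Candidate numbers of bins from the three rules quoted under K.
# n is the number of observations (rows) in one dataset.
bin_rules <- function(n) {
  c(Sturges = 1 + log2(n),     # v = 1 + log2(n)
    Log10   = 10 * log10(n),   # v = 10 log10(n)
    RootN   = 2 * sqrt(n))     # v = 2 sqrt(n)
}

# Rule of thumb for the intermediate number of nearest neighbours.
knn_rule <- function(n) sqrt(n)

# Hypothetical helper: given candidate values K and one information
# criterion value per candidate, return the candidate with the minimum
# IC together with its two neighbours, which serve as brackets.
bracket_minimum <- function(K, ic) {
  i <- which.min(ic)
  c(lower = K[max(i - 1, 1)],
    best  = K[i],
    upper = K[min(i + 1, length(K))])
}

n <- 1000
round(bin_rules(n))   # rounded candidate numbers of bins for n = 1000
round(knn_rule(n))    # rounded candidate number of nearest neighbours

# Illustrative IC values only (not computed by REBMIX).
bracket_minimum(c(10, 20, 40, 60), ic = c(5.2, 4.1, 3.7, 4.4))
# lower 20, best 40, upper 60
```

The bracket values match the K = c(10, 20, 40, 60) example in the description of K; how the package refines the search inside the brackets is not reproduced here.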