Usage
kmlCov(formula, data, ident, timeVar, nClust = 2:6, nRedraw = 20,
       family = 'gaussian', effectVar = '', weights = rep(1, nrow(data)),
       timeParametric = TRUE, separateSampling = TRUE, max_itr = 100,
       verbose = TRUE)
Arguments
formula
A symbolic description of the model. In the parametric
case we write, for example,
'y ~ clust(time+time2) + pop(sex)'. Here 'time' and
'time2' have a different effect in each cluster, while
the 'sex' effect is the same for all clusters. In the
non-parametric case only one covariate is allowed.
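As a minimal sketch of the formula syntax described above, the snippet below builds the example formula as a plain R object and inspects it. It uses base R only; the variable names 'y', 'time', 'time2' and 'sex' are the illustrative ones from the text, and nothing here calls kmlCov itself.

```r
# Build the example formula. R does not evaluate clust() or pop() when a
# formula is created, so this runs without the package being loaded.
f <- y ~ clust(time + time2) + pop(sex)

# Variables referenced anywhere in the formula
vars <- all.vars(f)
print(vars)

# Terms on the right-hand side: one cluster-specific block, one common block
rhs_terms <- attr(terms(f), "term.labels")
print(rhs_terms)
```

The two right-hand-side terms correspond to the cluster-specific part and the part shared by all clusters.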
data
A [data.frame] in long format with no missing values.
Each line corresponds to one measure of the observed
phenomenon, and one individual may have multiple
measures (lines), identified by an identity column. In
the non-parametric case every individual must have a
measurement at every fixed time.
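The layout described above might look like the following sketch, which builds a small long-format [data.frame] and checks the two stated requirements (no missing values, and every individual measured at every time). All column names and values are hypothetical and chosen only for illustration.

```r
# Hypothetical long-format data: 3 individuals, 4 measurement times each.
set.seed(1)
dat <- data.frame(
  id   = rep(1:3, each = 4),               # identity column (hypothetical name)
  time = rep(0:3, times = 3),              # time variable (hypothetical name)
  sex  = rep(c("F", "M", "F"), each = 4)   # covariate constant within individual
)
dat$time2 <- dat$time^2
dat$y <- 1 + 0.5 * dat$time + rnorm(nrow(dat))  # simulated response

# Requirement 1: no missing values anywhere in the data
no_missing <- !anyNA(dat)

# Requirement 2 (non-parametric case): each individual is observed
# exactly once at each time point.
counts <- table(dat$id, dat$time)
balanced <- all(counts == 1)
```

Each individual contributes one line per measurement, so the data frame has 3 x 4 = 12 lines.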
nClust
The number of clusters, at least 2 and at
most 26.
nRedraw
The number of times the algorithm is re-run
with different starting conditions.
ident
The name of the identity column, which identifies the
individuals.
timeVar
The name of the column containing the time
variable.
family
A description of the error distribution and
link function to be used in the model, by default
'gaussian'. This can be a character string naming a
family function, a family function or the result of a
call to a family function. (See 'family' for details of
family functions).
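The three accepted forms of 'family' can be illustrated with base R alone. The helper resolve_family() below is hypothetical; it mirrors the way glm() normalises its 'family' argument, and it is an assumption that kmlCov treats the argument the same way.

```r
fam_string   <- "gaussian"   # a character string naming a family function
fam_function <- gaussian     # the family function itself
fam_object   <- gaussian()   # the result of a call to a family function

# Hypothetical helper: turn any of the three forms into a family object.
resolve_family <- function(family) {
  if (is.character(family)) family <- get(family, mode = "function")
  if (is.function(family)) family <- family()
  family
}

resolved <- lapply(list(fam_string, fam_function, fam_object), resolve_family)

# All three forms describe the same error distribution and link function
fam_names <- vapply(resolved, function(f) f$family, character(1))
fam_links <- vapply(resolved, function(f) f$link, character(1))
```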
effectVar
An effect, which may or may not be a cluster-level
effect. By default the empty string ''.
weights
Vector of 'prior weights' to be used in the fitting
process. By default all the weights are equal to one.
timeParametric
By default [TRUE], in which case the model is
parametric in time. If [FALSE], only one covariate is
allowed in the formula and the algorithm used is
k-means.
separateSampling
By default [TRUE], which means that the proportions of
the clusters are assumed equal in the classification
step; the log-likelihood maximised at each step of the
algorithm is
$\sum_{k=1}^{K}\sum_{y_i \in P_k} \log(f(y_i, \theta_k))$.
Otherwise the proportions of the clusters are taken into
account and the log-likelihood is
$\sum_{k=1}^{K}\sum_{y_i \in P_k} \log(\lambda_{k}f(y_i, \theta_k))$,
where $\lambda_k$ is the proportion of cluster $k$.
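The difference between the two criteria can be seen on a toy example. The sketch below is not the package's implementation: it assumes Gaussian densities with known cluster means, a fixed partition, and proportions estimated as cluster frequencies, all of which are illustrative choices.

```r
# Toy data (hypothetical values): 5 observations in K = 2 clusters
y     <- c(-0.2, 0.1, 0.3, 2.8, 3.1)
part  <- c(1, 1, 1, 2, 2)   # cluster membership, i.e. the partition P_k
mu    <- c(0, 3)            # theta_k: assumed cluster means
sigma <- 1                  # assumed common standard deviation

# log f(y_i, theta_k) for each observation under its own cluster
log_f <- dnorm(y, mean = mu[part], sd = sigma, log = TRUE)

# Equal proportions: sum_k sum_{y_i in P_k} log f(y_i, theta_k)
ll_equal <- sum(log_f)

# Proportions taken into account:
# sum_k sum_{y_i in P_k} log(lambda_k * f(y_i, theta_k))
lambda  <- as.numeric(table(part)) / length(part)   # here 0.6 and 0.4
ll_prop <- sum(log(lambda[part]) + log_f)

# The two criteria differ by sum_k n_k * log(lambda_k)
ll_diff <- ll_prop - ll_equal
```

Because every $\lambda_k$ is below one, the second criterion is always smaller for a given partition; what changes in practice is how observations are assigned, since larger clusters are favoured.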
max_itr
The maximum number of iterations, by default
100.
verbose
If [TRUE] (the default), print the output to the console.