Learn R Programming

bclust (version 1.5)

loglikelihood: computes the model log likelihood useful for estimation of the transformed.par

Description

The function is useful for deriving the maximum likelihood estimates of the model parameters.

Usage

loglikelihood(x.mean,x.css,repno,transformed.par, effect.family="gaussian",var.select=TRUE)

Arguments

x.mean
The mean matrix of the clustering types from the meancss function.
x.css
The corrected sum of squares matrix of the clustering types from the meancss function.
repno
The vector containing the number of replications of each clustering type corresponding to the each row of x.mean and x.css, from the meancss function.
transformed.par
The vector of transformed model parameters that the data likelihood will be evaluated at. The transformation is the log for the variance parameters, the identity for the mean, and the logit for the proportions. The length of the vector depends on the chosen effect.family and var.select.
effect.family
Distribution family of the disappearing random components. Choices are "gaussian" or "alaplace" allowing Gaussian or asymmetric Laplace family, respectively.
var.select
A logical value, TRUE for fitting models that define spike-and-slab in variable level, thus allowing Bayesian variable selection.

Details

Sometimes estimation of the model parameters is difficult, always check the convergence of the optimisation algorithm. The asymmetric Laplace model, effect.family="alaplace", is often more difficult to optimise than effect.family="gaussian".

If data are standardised (having general mean zero and general variance one) the log likelihood function is usually maximised over values between -5 and 5.

The transformed.par is a vector of transformed model parameters having length 5 up to 7 depending on the chosen model.

The transformed.par is $log s2, log s2_h, log s2_t, m, logit p, logit q$ a vector of length 6 when using effect.family = "gaussian" and var.select=TRUE,

and is $log s2, log s2_h, log s2_tL, log s2_tR, m, logit p, logit q$ a vector of length 7

for effect.family="alaplace" and var.select=TRUE.

When var.select=FALSE the $q$ parameter is dropped, yielding a vector of length 5 for

effect.family="gaussian" and a vector of length 6

for effect.family="alaplace".

We assumed a Bayesian linear model being $$y_{vctr}=\mu+\eta_{vct}+\delta_v \gamma_{vc}\theta_{vc}+\varepsilon_{vctr}$$ where $y_{vctr}$ is the available data on variable $v$, cluster(or class) $c$, type $t$, and replicate $r$; $h_{vct}$ is the between-type error, $t_{vc}$ is the disappearing random component controlled by the Bernoulli variables $d_{v}$ with success probability $q$ and $g_{vc}$ with success probability $p$; and $e_{vctr}$ is the between-replicate error. The types inside a cluster (or class) share the same $t_{vc}$, but may arise with a different $h_{vct}$.

The model parameters has natural interpretations, $s2$ is the between replicate error variance; $s2_h$ is the variance of between-type error; $s2_t$ is the variance of the disappearing random component which is decomposed to $s2_tL$, $s2_tR$ the left and the right tail variances if the model is asymmetric Laplace; $m$ is the general level; $p$ is the proportion of active variable-cluster (or variable-class) combinations, and $q$ is the proportion of the active variables. For more details see Vahid Partovi Nia and Anthony C. Davison (2012)

References

Vahid Partovi Nia and Anthony C. Davison (2012). High-Dimensional Bayesian Clustering with Variable Selection: The R Package bclust. Journal of Statistical Software, 47(5), 1-22. URL http://www.jstatsoft.org/v47/i05/

See Also

bclust,bdiscrim, meancss.

Examples

Run this code
data(gaelle)
gaelle.id<-rep(1:14,c(3,rep(4,13))) 
# first 3 rows replication of ColWT, 4 for the rest

mc.gaelle<-meancss(gaelle,gaelle.id)

optimfunc<-function(theta)
{
-loglikelihood(x.mean=mc.gaelle$mean,x.css=mc.gaelle$css,
repno=mc.gaelle$repno,transformed.par=theta)#compute - log likelihood
}

transpar<-optim(rep(0,6),optimfunc,method="BFGS")$par 
#gives argmin(-loglikelihood)
#put a vector of correct length for the evaluation of the likelihood

plot(bclust(gaelle,transformed.par=transpar))

        

Run the code above in your browser using DataLab