
mdir (version 0.9.0)

callMDI: Call Multiple Dataset Integration

Description

Runs an MCMC chain of the integrative clustering method, Multiple Dataset Integration (MDI), on V datasets.

Usage

callMDI(
  X,
  R,
  thin,
  types,
  K = NULL,
  initial_labels = NULL,
  fixed = NULL,
  alpha = NULL,
  initial_labels_as_intended = FALSE,
  proposal_windows = NULL
)

Value

A named list containing the sampled partitions, component weights, phi and mass parameters, model fit measures and some details on the model call.

Arguments

X

Data to cluster. List of matrices with the N items to cluster held in rows.
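
As a minimal sketch of the expected input shape, each dataset is a matrix with one row per item, and every dataset describes the same N items. The names `view_1`, `view_2` and `n_items` are illustrative and not part of the package; the checks are a suggestion, not something the documentation says `callMDI` requires you to run.

```r
set.seed(1)
n_items <- 20  # N: the number of items shared by every dataset

# Two hypothetical datasets on the same items, with different numbers of columns
view_1 <- matrix(rnorm(n_items * 3), nrow = n_items, ncol = 3)
view_2 <- matrix(rnorm(n_items * 5), nrow = n_items, ncol = 5)

X <- list(view_1, view_2)

# Sanity checks before passing X to the sampler
all_matrices <- all(vapply(X, is.matrix, logical(1)))
n_rows <- vapply(X, nrow, integer(1))
same_n <- length(unique(n_rows)) == 1
stopifnot(all_matrices, same_n)
```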

R

The number of iterations in the sampler.

thin

The factor by which the samples generated are thinned, e.g. if thin = 50 only every 50th sample is kept.
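
To illustrate how R and thin interact, the sketch below lists which iterations would be retained. It assumes that saving starts at iteration thin and proceeds in steps of thin; the package may additionally store the initial state, so treat the count as approximate.

```r
R <- 100     # total number of iterations
thin <- 5    # thinning factor

# Assumption: iterations thin, 2 * thin, ..., up to R are the ones saved
kept_iterations <- seq(from = thin, to = R, by = thin)
n_kept <- length(kept_iterations)
```

With these values roughly R / thin samples remain, which is the number available for inference before any burn-in is discarded.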

types

Character vector indicating the density type to use for each dataset: 'G' (Gaussian with diagonal covariance matrix), 'MVN' (multivariate normal), 'TAGM' (t-augmented Gaussian mixture), 'GP' (MVN with Gaussian process prior on the mean), 'TAGPM' (TAGM with GP prior on the mean), 'C' (categorical).
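
A small sketch of building a types vector and checking it against the codes listed above. The three-dataset layout and the names `valid_types` and `unknown` are hypothetical; this check is a convenience and not a step the documentation describes.

```r
# Density codes listed in the argument description
valid_types <- c("G", "MVN", "TAGM", "GP", "TAGPM", "C")

# Hypothetical three-dataset analysis: continuous, categorical, continuous
types <- c("MVN", "C", "G")

# Flag any entry that is not one of the documented codes
unknown <- setdiff(types, valid_types)
if (length(unknown) > 0) {
  stop("Unrecognised density type(s): ", paste(unknown, collapse = ", "))
}
```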

K

Vector indicating the number of components to include (the upper bound on the number of clusters in each dataset).

initial_labels

Initial clustering. $N \times V$ matrix.

fixed

Which items are fixed in their initial label. $N \times V$ matrix.
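
The following sketch shows one way to build the two $N \times V$ matrices when some labels are known in advance. It assumes that labels are positive integers and that fixed uses 1 for a held label and 0 for a sampled one; neither coding is stated on this page, so check the package source before relying on it. `known_items` is an illustrative name.

```r
set.seed(1)
N <- 10      # items
V <- 2       # datasets
K_max <- 4   # upper bound on components, used here only to draw starting labels

# Start every item in a randomly chosen component
initial_labels <- matrix(
  sample(seq_len(K_max), N * V, replace = TRUE),
  nrow = N, ncol = V
)

# Assumption: 1 marks an item whose label is held fixed, 0 one that is sampled
fixed <- matrix(0, nrow = N, ncol = V)

# Suppose the first three items have known labels in the first dataset
known_items <- 1:3
initial_labels[known_items, 1] <- c(1, 1, 2)
fixed[known_items, 1] <- 1
```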

alpha

The concentration parameter for the stick-breaking prior and the weights in the model.

initial_labels_as_intended

Logical indicating whether the passed initial labels should be used as given (TRUE) or whether ``generateInitialLabels`` should be called instead (FALSE).

proposal_windows

List of the proposal windows for the Metropolis-Hastings sampling of Gaussian process hyperparameters. Each entry corresponds to a view. For views modelled using a Gaussian process, the first entry is the proposal window for the amplitude, the second is for the length-scale and the third is for the noise. These are not used in other mixture types.
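
A hedged sketch of how such a list might be assembled for a model that mixes GP and non-GP views. The window values are arbitrary placeholders, and both the use of a length-three numeric vector per GP view and a NULL entry for other views are assumptions; the page only specifies the order amplitude, length-scale, noise.

```r
types <- c("GP", "G")   # hypothetical two-view model: one GP view, one Gaussian view

# Illustrative values only; sensible windows depend on the data
gp_windows <- c(0.15, 0.20, 0.10)  # amplitude, length-scale, noise

# Assumption: views that do not use a GP keep a NULL placeholder entry
proposal_windows <- vector("list", length(types))
for (v in seq_along(types)) {
  if (types[v] %in% c("GP", "TAGPM")) {
    proposal_windows[[v]] <- gp_windows
  }
}
```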

Examples

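# Simulate two datasets on the same N items; with byrow = TRUE the first
# N / 2 rows are drawn from N(0, 1) and the last N / 2 rows from N(3, 1)
N <- 100
X <- matrix(c(rnorm(N, 0, 1), rnorm(N, 3, 1)), ncol = 2, byrow = TRUE)
Y <- matrix(c(rnorm(N, 0, 1), rnorm(N, 3, 1)), ncol = 2, byrow = TRUE)

# True cluster membership of each row (not passed to callMDI)
truth <- c(rep(1, N / 2), rep(2, N / 2))
data_modelled <- list(X, Y)

# Number of datasets
V <- length(data_modelled)

# This R is much too low for real applications
R <- 100
thin <- 5
# Burn-in length; not an argument of callMDI, for use when post-processing
burn <- 10

# Upper bound on the number of clusters, the same in every dataset
K_max <- 10
K <- rep(K_max, V)
# Gaussian density with diagonal covariance in every dataset
types <- rep("G", V)

mcmc_out <- callMDI(data_modelled, R, thin, types, K = K)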
