
clusternomics (version 0.1.0)

empiricalBayesPrior: Fit an empirical Bayes prior to the data

Description

Fit an empirical Bayes prior to the data

Usage

empiricalBayesPrior(datasets, distributions = "diagNormal", globalConcentration = 0.1, localConcentration = 0.1, type = "fitRate")

Arguments

datasets
List of data matrices where each matrix represents a context-specific dataset. Each data matrix has the size N times M, where N is the number of data points and M is the dimensionality of the data. The full list of matrices has length C. The number of data points N must be the same for all data matrices.
distributions
Distribution of the data in each dataset. Can be either a list of length C, where distributions[c] is the distribution of dataset c, or a single string when all datasets share the same distribution. The only distribution currently implemented is 'diagNormal', a multivariate Normal distribution with a diagonal covariance matrix.
globalConcentration
Prior concentration parameter for the global clusters. Smaller values of this parameter place more prior probability on a smaller number of clusters.
localConcentration
Prior concentration parameter for the local context-specific clusters. Smaller values of this parameter place more prior probability on a smaller number of clusters.
type
Type of prior that is fitted to the data. The algorithm can fit either the rate of the prior covariance matrix (the default, 'fitRate') or the full covariance matrix to the data.
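
The shape constraints on datasets and distributions described above can be checked before fitting. The following is a minimal sketch in base R; checkInputs is a hypothetical helper written for illustration and is not part of clusternomics.

```r
# Hypothetical helper (not part of clusternomics): checks the documented
# shape constraints on `datasets` and `distributions`.
checkInputs <- function(datasets, distributions = "diagNormal") {
  stopifnot(is.list(datasets), length(datasets) >= 1)
  stopifnot(all(vapply(datasets, is.matrix, logical(1))))

  # The number of data points N must be the same in every context
  nPoints <- vapply(datasets, nrow, integer(1))
  if (length(unique(nPoints)) != 1) {
    stop("All data matrices must have the same number of rows (N).")
  }

  nContexts <- length(datasets)
  # A single string is expanded to one entry per context
  if (length(distributions) == 1) {
    distributions <- rep(distributions, nContexts)
  }
  if (length(distributions) != nContexts) {
    stop("`distributions` must have length 1 or length C.")
  }

  list(N = nPoints[[1]], C = nContexts,
       distributions = unlist(distributions))
}

set.seed(1)
# Two contexts (C = 2), each an N x M matrix with N = 100 and M = 2
datasets <- list(matrix(rnorm(100 * 2), nrow = 100, ncol = 2),
                 matrix(rnorm(100 * 2), nrow = 100, ncol = 2))
info <- checkInputs(datasets, "diagNormal")
str(info)
```

Running a check like this first turns a mismatch in N across contexts into a readable error message before the data reaches empiricalBayesPrior.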

Value

Returns a prior object that can be passed to the contextCluster function as its prior argument.
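
The effect of the concentration parameters described under Arguments can be illustrated with a generic simulation. The sketch below assumes a symmetric Dirichlet-multinomial prior over K clusters, with each cluster receiving pseudo-count alpha / K. This is an assumption made for illustration and may not match the package's exact parametrization. occupiedClusters is a hypothetical helper, not a clusternomics function.

```r
# Illustrative only: a symmetric Dirichlet-multinomial prior is an assumption
# of this sketch, not the confirmed clusternomics model.
# Points are assigned one at a time; cluster k is chosen with probability
# proportional to (current count of k) + alpha / K.
occupiedClusters <- function(alpha, K = 10, nPoints = 100) {
  counts <- numeric(K)
  for (i in seq_len(nPoints)) {
    k <- sample.int(K, 1, prob = counts + alpha / K)
    counts[k] <- counts[k] + 1
  }
  sum(counts > 0)  # number of clusters that received at least one point
}

set.seed(42)
smallAlpha <- mean(replicate(200, occupiedClusters(alpha = 0.1)))
largeAlpha <- mean(replicate(200, occupiedClusters(alpha = 10)))
cat("mean occupied clusters, alpha = 0.1:", smallAlpha, "\n")
cat("mean occupied clusters, alpha = 10: ", largeAlpha, "\n")
```

Under this assumed prior, the smaller concentration value leaves most of the K available clusters empty, which is the behaviour the globalConcentration and localConcentration descriptions refer to.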

Examples

library(clusternomics)

# Example with simulated data (see vignette for details)
nContexts <- 2
# Number of elements in each cluster
groupCounts <- c(50, 10, 40, 60)
# Centers of clusters
means <- c(-1.5,1.5)
testData <- generateTestData_2D(groupCounts, means)
datasets <- testData$data

# Generate the prior
fullDataDistributions <- rep('diagNormal', nContexts)
prior <- empiricalBayesPrior(datasets, fullDataDistributions,
     globalConcentration = 0.01, localConcentration = 0.1,
     type = 'fitRate')

# Fit the model
# 1. specify number of clusters
clusterCounts <- list(global=10, context=c(3,3))
# 2. Run inference
# Number of iterations is just for demonstration purposes, use
# a larger number of iterations in practice!
results <- contextCluster(datasets, clusterCounts,
     maxIter = 10, burnin = 5, lag = 1,
     dataDistributions = 'diagNormal', prior = prior,
     verbose = TRUE)

