probaPost: Calculate the conditional probability of belonging to each cluster in a Poisson mixture model

Description

This function computes the conditional probabilities \(t_{ik}\) that an observation i arises from the \(k^{\mathrm{th}}\) component for the current value of the mixture parameters.

Usage

probaPost(y, g, conds, pi, s, lambda)

Value

t: (n x g) matrix made up of the conditional probability of each observation belonging to each of the g clusters

Arguments

y: (n x q) matrix of observed counts for n observations and q variables
g: Number of clusters
conds: Vector of length q defining the condition (treatment group) for each variable (column) in y
pi: Vector of length g containing the current estimate of \(\hat{\boldsymbol{\pi}}\)
s: Vector of length q containing the estimates for the normalized library size parameters for each of the q variables in y
lambda: (d x g) matrix containing the current estimate \(\boldsymbol{\lambda}\), where d is the number of conditions (treatment groups)

Author

Andrea Rau

References

Rau, A., Maugis-Rabusseau, C., Martin-Magniette, M.-L., Celeux G. (2015). Co-expression analysis of high-throughput transcriptome sequencing data with Poisson mixture models. Bioinformatics, 31(9):1420-1427.

Rau, A., Celeux, G., Martin-Magniette, M.-L., Maugis-Rabusseau, C. (2011). Clustering high-throughput sequencing data with Poisson mixture models. Inria Research Report 7786. Available at https://inria.hal.science/inria-00638082.

Examples

Run this code


set.seed(12345)

## Simulate data as shown in Rau et al. (2011)
## Library size setting "A", high cluster separation
## n = 200 observations

simulate <- PoisMixSim(n = 200, libsize = "A", separation = "high")
y <- simulate$y
conds <- simulate$conditions
s <- colSums(y) / sum(y)     ## TC estimate of lib size

## Run the PMM-II model for g = 3
## "TC" library size estimate, EM algorithm

run <- PoisMixClus(y, g = 3, norm = "TC",
 	conds = conds) 
pi.est <- run$pi
lambda.est <- run$lambda

## Calculate the conditional probability of belonging to each cluster
proba <- probaPost(y, g = 3, conds = conds, pi = pi.est, s = s, 
	lambda = lambda.est)

## head(round(proba,2))

Run the code above in your browser using DataLab