MGMM (version 0.3.1)

rGMM: Data Generation from Multivariate Normal Mixture Models

Description

Generates an \(n\times d\) matrix of multivariate normal random vectors with observations as rows. If \(k=1\), all observations belong to the same cluster. If \(k>1\), the observations are generated via a two-step procedure. First, the cluster membership is drawn from a multinomial distribution, with mixture proportions specified by pi. Then, conditional on cluster membership, the observation is drawn from a multivariate normal distribution with cluster-specific mean and covariance. The cluster means are provided using means, and the cluster covariance matrices are provided using covs. If miss \(>0\), missingness is introduced, completely at random, by setting that proportion of elements in the data matrix to NA.
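The two-step procedure and the missingness step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: sketch_rgmm is a hypothetical helper, it expects means and covs as lists with one entry per component, and it rounds the number of masked elements to the nearest integer.

```r
# Hypothetical helper, not part of MGMM: base-R sketch of the two-step draw.
sketch_rgmm <- function(n, pi, means, covs, miss = 0) {
  k <- length(pi)
  d <- length(means[[1]])
  # Step 1: draw cluster memberships with probabilities pi.
  z <- sample(seq_len(k), size = n, replace = TRUE, prob = pi)
  # Step 2: draw each observation from its cluster's multivariate normal.
  out <- matrix(NA_real_, nrow = n, ncol = d)
  for (j in seq_len(k)) {
    idx <- which(z == j)
    if (length(idx) == 0) next
    # chol() returns an upper-triangular R with t(R) %*% R equal to covs[[j]],
    # so rows of (standard normals %*% R) have covariance covs[[j]].
    noise <- matrix(rnorm(length(idx) * d), ncol = d) %*% chol(covs[[j]])
    out[idx, ] <- sweep(noise, 2, means[[j]], "+")
  }
  # Missingness completely at random: set a share of all elements to NA.
  n_miss <- round(miss * n * d)
  if (n_miss > 0) out[sample(n * d, n_miss)] <- NA
  list(data = out, cluster = z)
}

set.seed(100)
sim <- sketch_rgmm(
  n = 500,
  pi = c(0.3, 0.7),
  means = list(c(-2, -2), c(2, 2)),
  covs = list(diag(2), 0.5 * diag(2)),
  miss = 0.1
)
```

Calling table(sim$cluster) shows how many observations fell into each component, and mean(is.na(sim$data)) reports the realized proportion of missing elements.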

Usage

rGMM(n, d = 2, k = 1, pi = NULL, miss = 0, means = NULL, covs = NULL)

Arguments

n

Number of observations (rows).

d

Observation dimension (columns).

k

Number of mixture components. Defaults to 1.

pi

Mixture proportions. If omitted, components are assumed equi-probable.

miss

Proportion of elements missing, \(miss\in[0,1)\).

means

Either a prototype mean vector, or a list of mean vectors. Defaults to the zero vector.

covs

Either a prototype covariance matrix, or a list of covariance matrices. Defaults to the identity matrix.
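The defaults listed above can be made concrete with a short sketch. The helper resolve_gmm_args is hypothetical and not part of MGMM. It assumes that a prototype mean vector or covariance matrix, when given instead of a list, is shared by all k components; the package may handle this differently internally.

```r
# Hypothetical helper, not part of MGMM: fill in defaults as documented above.
resolve_gmm_args <- function(d = 2, k = 1, pi = NULL, means = NULL, covs = NULL) {
  # Equi-probable components when pi is omitted.
  if (is.null(pi)) pi <- rep(1 / k, k)
  # Zero mean vector and identity covariance as the default prototypes.
  if (is.null(means)) means <- rep(0, d)
  if (is.null(covs)) covs <- diag(d)
  # Assumption: a prototype (non-list) value is shared by all k components.
  if (!is.list(means)) means <- rep(list(means), k)
  if (!is.list(covs)) covs <- rep(list(covs), k)
  # Basic consistency checks on the resolved arguments.
  stopifnot(
    length(pi) == k,
    length(means) == k,
    length(covs) == k,
    abs(sum(pi) - 1) < 1e-8
  )
  list(pi = pi, means = means, covs = covs)
}

resolved <- resolve_gmm_args(
  d = 3,
  k = 2,
  means = list(c(-2, -2, -2), c(2, 2, 2))
)
```

Here pi and covs are omitted, so resolved$pi holds two equal proportions and resolved$covs holds two copies of the 3 x 3 identity matrix.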

Value

Numeric matrix with observations as rows. Row names specify the true cluster assignments.

See Also

For estimation, see fit.GMM.

Examples

# NOT RUN {
set.seed(100)
# Single component without missingness
# Bivariate normal observations
cov <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
data <- rGMM(n = 1e3, d = 2, k = 1, means = c(2, 2), covs = cov)

# Single component with missingness
# Trivariate normal observations
cov <- matrix(c(1, 0.5, 0.5, 0.5, 1, 0.5, 0.5, 0.5, 1), nrow = 3)
data <- rGMM(n = 1e3, d = 3, k = 1, miss = 0.1, means = c(2, 2, 2), covs = cov)

# Two components without missingness
# Trivariate normal observations
mean_list <- list(c(-2, -2, -2), c(2, 2, 2))
cov <- matrix(c(1, 0.5, 0.5, 0.5, 1, 0.5, 0.5, 0.5, 1), nrow = 3)
data <- rGMM(n = 1e3, d = 3, k = 2, means = mean_list, covs = cov)

# Four components with missingness
# Bivariate normal observations
mean_list <- list(c(2, 2), c(2, -2), c(-2, 2), c(-2, -2))
cov <- 0.5 * diag(2)
data <- rGMM(
  n = 1000,
  d = 2,
  k = 4,
  pi = c(0.35, 0.15, 0.15, 0.35),
  miss = 0.1,
  means = mean_list,
  covs = cov
)
# }