MGMM (version 0.3.1)

rGMM: Data Generation from Multivariate Normal Mixture Models


Generates an \(n\times d\) matrix of multivariate normal random vectors with observations as rows. If \(k=1\), all observations belong to the same cluster. If \(k>1\), the observations are generated via a two-step procedure. First, the cluster membership is drawn from a multinomial distribution, with mixture proportions specified by pi. Then, conditional on cluster membership, the observation is drawn from a multivariate normal distribution with cluster-specific mean and covariance. The cluster means are provided using means, and the cluster covariance matrices are provided using covs. If \(miss>0\), missingness is introduced, completely at random, by setting that proportion of elements in the data matrix to NA.
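The two-step procedure described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name sketch_rgmm is hypothetical, it expects pi, means, and covs to be fully specified (no defaults or prototypes), and it draws the multivariate normal noise through a Cholesky factor, which is one of several possible choices.

```r
# Hypothetical helper, not part of MGMM: a base-R sketch of the two-step draw.
sketch_rgmm <- function(n, pi, means, covs, miss = 0) {
  k <- length(pi)
  d <- length(means[[1]])

  # Step 1: draw cluster memberships with probabilities pi.
  z <- sample.int(k, size = n, replace = TRUE, prob = pi)

  # Step 2: draw each observation from its cluster's multivariate normal.
  x <- matrix(NA_real_, nrow = n, ncol = d)
  for (j in seq_len(k)) {
    idx <- which(z == j)
    if (length(idx) == 0) next
    noise <- matrix(rnorm(length(idx) * d), ncol = d)
    # chol() returns upper-triangular R with t(R) %*% R equal to covs[[j]],
    # so the rows of noise %*% R have covariance covs[[j]].
    x[idx, ] <- sweep(noise %*% chol(covs[[j]]), 2, means[[j]], "+")
  }

  # Missingness completely at random: set a fixed proportion of elements to NA.
  n_miss <- round(miss * n * d)
  if (n_miss > 0) x[sample.int(n * d, n_miss)] <- NA

  list(data = x, cluster = z)
}

set.seed(1)
sim <- sketch_rgmm(
  n = 500,
  pi = c(0.5, 0.5),
  means = list(c(-2, -2), c(2, 2)),
  covs = list(diag(2), diag(2)),
  miss = 0.1
)
dim(sim$data)
table(sim$cluster)
```

The sketch returns the cluster labels as a separate vector for clarity, whereas rGMM records the true assignments on the rows of the returned matrix.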


rGMM(n, d = 2, k = 1, pi = NULL, miss = 0, means = NULL, covs = NULL)



n
Number of observations (rows).


d
Observation dimension (columns).


k
Number of mixture components. Defaults to 1.


pi
Mixture proportions. If omitted, components are assumed equi-probable.


miss
Proportion of elements missing, \(miss\in[0,1)\).


means
Either a prototype mean vector, or a list of mean vectors. Defaults to the zero vector.


covs
Either a prototype covariance matrix, or a list of covariance matrices. Defaults to the identity matrix.
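One way to read the defaults and the "prototype" wording above is that a single vector or matrix is reused for every component. The sketch below illustrates that reading; expand_args is a hypothetical helper written for this page, and the package may handle its arguments differently.

```r
# Hypothetical helper, not part of MGMM: expand defaults and prototypes
# into one entry per mixture component.
expand_args <- function(d, k, pi = NULL, means = NULL, covs = NULL) {
  if (is.null(pi)) pi <- rep(1 / k, k)        # equi-probable components
  if (is.null(means)) means <- rep(0, d)      # zero vector
  if (is.null(covs)) covs <- diag(d)          # identity matrix
  # A prototype (non-list) value is repeated for each of the k components.
  if (!is.list(means)) means <- rep(list(means), k)
  if (!is.list(covs)) covs <- rep(list(covs), k)
  list(pi = pi, means = means, covs = covs)
}

expanded <- expand_args(d = 2, k = 3, covs = 0.5 * diag(2))
expanded$pi
length(expanded$means)
expanded$covs[[2]]
```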


Numeric matrix with observations as rows. Row numbers specify the true cluster assignments.

See Also

For estimation, see fit.GMM.


library(MGMM)

# Single component without missingness
# Bivariate normal observations
cov <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
data <- rGMM(n = 1e3, d = 2, k = 1, means = c(2, 2), covs = cov)

# Single component with missingness
# Trivariate normal observations (means default to the zero vector)
cov <- matrix(c(1, 0.5, 0.5, 0.5, 1, 0.5, 0.5, 0.5, 1), nrow = 3)
data <- rGMM(n = 1e3, d = 3, k = 1, miss = 0.1, covs = cov)

# Two components without missingness
# Trivariate normal observations
mean_list <- list(c(-2, -2, -2), c(2, 2, 2))
cov <- matrix(c(1, 0.5, 0.5, 0.5, 1, 0.5, 0.5, 0.5, 1), nrow = 3)
data <- rGMM(n = 1e3, d = 3, k = 2, means = mean_list, covs = cov)

# Four components with missingness
# Bivariate normal observations
mean_list <- list(c(2, 2), c(2, -2), c(-2, 2), c(-2, -2))
cov <- 0.5 * diag(2)
data <- rGMM(
  n = 1000,
  d = 2,
  k = 4,
  pi = c(0.35, 0.15, 0.15, 0.35),
  miss = 0.1,
  means = mean_list,
  covs = cov
)
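After simulating data with missingness, it can be useful to confirm that the realized proportion of NA values is close to miss. The sketch below uses a hypothetical helper, na_summary, built only from base R; the call to rGMM is guarded so that the block is skipped when MGMM is not installed.

```r
# Hypothetical helper, not part of MGMM: summarize missingness in a matrix.
na_summary <- function(x) {
  list(
    overall = mean(is.na(x)),       # proportion of missing elements
    by_column = colMeans(is.na(x))  # proportion missing in each column
  )
}

# Toy matrix with one missing element out of four.
toy <- matrix(c(1, NA, 3, 4), nrow = 2)
toy_summary <- na_summary(toy)

# Assumes the MGMM package is installed; skipped otherwise.
if (requireNamespace("MGMM", quietly = TRUE)) {
  set.seed(101)
  check <- MGMM::rGMM(
    n = 1000,
    d = 2,
    k = 4,
    pi = c(0.35, 0.15, 0.15, 0.35),
    miss = 0.1,
    means = list(c(2, 2), c(2, -2), c(-2, 2), c(-2, -2)),
    covs = 0.5 * diag(2)
  )
  print(na_summary(check))
}
```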

