Generate a dataset based upon a mixture of Gaussian distributions (with independent features).
generateGaussianDataset(
  cluster_means,
  std_dev,
  N,
  P,
  pi,
  row_names = paste0("Person_", seq(1, N)),
  col_names = paste0("Gene_", seq(1, P))
)
A named list with two elements: ``data``, the generated matrix, and ``cluster_IDs``, the generating structure.
``cluster_means``: A k-vector of cluster means defining the k clusters.
``std_dev``: A k-vector of cluster standard deviations defining the k clusters.
``N``: The number of samples to generate in the entire dataset.
``P``: The number of columns to generate in the dataset.
``pi``: A k-vector of the expected proportion of points to be drawn from each distribution.
``row_names``: The row names of the generated dataset.
``col_names``: The column names of the generated dataset.
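# Three clusters with their means and standard deviations
cluster_means <- c(-2, 0, 2)
std_dev <- c(1, 1, 1.25)

# 100 samples, 5 columns
N <- 100
P <- 5

# Expected proportion of samples drawn from each cluster
pi <- c(0.3, 0.3, 0.4)

generateGaussianDataset(cluster_means, std_dev, N, P, pi)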
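The arguments above describe a simple generative model. The following base R sketch shows one way such a dataset could be drawn; it is not the package's implementation. The function name `sketch_gaussian_mixture` and the argument name `mix_props` (used in place of `pi`) are hypothetical. It also assumes that every column of a sample shares that sample's cluster mean and standard deviation.

```r
# Hypothetical sketch only: NOT the implementation of generateGaussianDataset().
sketch_gaussian_mixture <- function(cluster_means, std_dev, N, P, mix_props) {
  K <- length(cluster_means)

  # Draw a cluster label for each of the N samples; mix_props gives the
  # expected proportion of samples in each cluster.
  cluster_IDs <- sample(seq_len(K), size = N, replace = TRUE, prob = mix_props)

  # Independent features: each entry of row i is an independent draw from a
  # univariate Gaussian with the mean and sd of row i's cluster. The length-N
  # mean and sd vectors are recycled across the P columns because matrix()
  # fills column by column.
  data <- matrix(
    rnorm(N * P, mean = cluster_means[cluster_IDs], sd = std_dev[cluster_IDs]),
    nrow = N,
    ncol = P,
    dimnames = list(paste0("Person_", seq_len(N)), paste0("Gene_", seq_len(P)))
  )

  list(data = data, cluster_IDs = cluster_IDs)
}

set.seed(1)
sim <- sketch_gaussian_mixture(
  cluster_means = c(-2, 0, 2),
  std_dev = c(1, 1, 1.25),
  N = 100,
  P = 5,
  mix_props = c(0.3, 0.3, 0.4)
)
dim(sim$data)           # 100 rows, 5 columns
table(sim$cluster_IDs)  # observed cluster sizes vary around N * mix_props
```

Because the labels are sampled, the observed cluster sizes only match the requested proportions in expectation.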