mdir (version 0.9.0)

generateGaussianDataset: Generate Gaussian dataset

Description

Generate a dataset based upon a mixture of Gaussian distributions (with independent features).
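To make "mixture of Gaussians with independent features" concrete, here is a minimal base R sketch of one such generative process. It is not the mdir source: the function name `sketch_gaussian_mixture` is hypothetical, and it assumes each row is assigned a single cluster label and every feature of that row is drawn independently from that cluster's normal distribution.

```r
# Illustrative sketch only; NOT the mdir implementation.
# `sketch_gaussian_mixture` is a hypothetical name used for this example.
sketch_gaussian_mixture <- function(cluster_means, std_dev, N, P, pi) {
  K <- length(cluster_means)

  # Draw one cluster label per row, with probabilities given by pi
  cluster_IDs <- sample(K, N, replace = TRUE, prob = pi)

  # Draw N * P independent normal values. The length-N mean and sd vectors
  # are recycled down each column, so row i always uses the parameters of
  # cluster_IDs[i] (matrix() fills column by column).
  data <- matrix(
    rnorm(N * P, mean = cluster_means[cluster_IDs], sd = std_dev[cluster_IDs]),
    nrow = N,
    ncol = P
  )

  list(data = data, cluster_IDs = cluster_IDs)
}

set.seed(1)
out <- sketch_gaussian_mixture(c(-2, 0, 2), c(1, 1, 1.25), N = 100, P = 5,
                               pi = c(0.3, 0.3, 0.4))
dim(out$data)           # rows by columns
table(out$cluster_IDs)  # realised cluster sizes; these vary around N * pi
```

The realised cluster sizes are random, so they match `pi` only in expectation.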

Usage

generateGaussianDataset(
  cluster_means,
  std_dev,
  N,
  P,
  pi,
  row_names = paste0("Person_", seq(1, N)),
  col_names = paste0("Gene_", seq(1, P))
)

Value

A named list with two elements: ``data``, the generated matrix, and ``cluster_IDs``, the generating structure.

Arguments

cluster_means

A k-vector of cluster means defining the k clusters.

std_dev

A k-vector of cluster standard deviations defining the k clusters.

N

The number of samples (rows) to generate in the dataset.

P

The number of columns (features) to generate in the dataset.

pi

A k-vector of the expected proportion of points to be drawn from each distribution.

row_names

The row names of the generated dataset.

col_names

The column names of the generated dataset.
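Because `cluster_means`, `std_dev` and `pi` are all described as k-vectors, it can help to check that they agree before generating data. The helper below is a hypothetical sketch (`check_mixture_args` is not part of mdir), and whether the package performs similar checks is not stated here. The requirement that `pi` sums to one is an assumption drawn from the phrase "expected proportion".

```r
# Hypothetical helper, not part of mdir: checks that the k-vectors agree.
check_mixture_args <- function(cluster_means, std_dev, N, P, pi) {
  k <- length(cluster_means)
  stopifnot(
    length(std_dev) == k,           # one standard deviation per cluster
    length(pi) == k,                # one proportion per cluster
    all(std_dev > 0),               # standard deviations must be positive
    all(pi >= 0),                   # proportions cannot be negative
    isTRUE(all.equal(sum(pi), 1)),  # assumption: proportions sum to one
    N >= 1,
    P >= 1
  )
  invisible(k)  # return the number of clusters invisibly
}

# Passes silently for the values used in the Examples section
check_mixture_args(c(-2, 0, 2), c(1, 1, 1.25), N = 100, P = 5,
                   pi = c(0.3, 0.3, 0.4))
```

A mismatch, such as two means with three standard deviations, stops with an error naming the failed condition.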

Examples

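library(mdir)

# Three clusters with means -2, 0 and 2 and standard deviations 1, 1 and 1.25
cluster_means <- c(-2, 0, 2)
std_dev <- c(1, 1, 1.25)
# 100 samples (rows) and 5 columns
N <- 100
P <- 5
# Expected cluster proportions; note this masks R's built-in constant `pi`
pi <- c(0.3, 0.3, 0.4)
generateGaussianDataset(cluster_means, std_dev, N, P, pi)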
