randProj: Random projections for data in more than two dimensions modelled by an MVN mixture.

Description

Plots random projections given data in more than two dimensions and parameters of an MVN mixture model for the data.

Usage

randProj(data, seeds = 0, ..., 
         type = c("classification", "uncertainty", "errors"), ask = TRUE,
         quantiles = c(0.75,0.95), symbols, scale = FALSE, identify = FALSE, 
         CEX = 1, PCH = ".", xlim, ylim)

Arguments

data

A numeric matrix or data frame of observations. Categorical variables are not allowed. Rows correspond to observations and columns correspond to variables.

seeds

A vector of integers between 0 and 1000, specifying seeds for the random projections. The default value is the single seed 0.

...

Any number of additional named arguments describing the mixture model or the classification, such as truth, mu, sigma, and z (all used in the Examples).

type

Any subset of c("classification","uncertainty","errors"). The function will produce the corresponding plot if it has been supplied sufficient information to do so. If more than one plot is possible then users will be asked to choo

ask

A logical variable indicating whether or not a menu should be produced when more than one plot is possible. The default is ask=TRUE.

quantiles

A vector of length 2 giving quantiles used in plotting uncertainty. The smallest symbols correspond to the smallest quantile (lowest uncertainty), medium-sized (open) symbols to points falling between the given quantiles, and large (filled) sy

symbols

Either an integer or character vector assigning a plotting symbol to each unique class classification. Elements in symbols correspond to classes in classification in order of appearance in classific

scale

A logical variable indicating whether or not the two chosen dimensions should be plotted on the same scale, and thus preserve the shape of the distribution. The default is scale=FALSE.

identify

A logical variable indicating whether or not to add a title to the plot identifying the dimensions used.

CEX

An argument specifying the size of the plotting symbols. The default value is 1.

PCH

An argument specifying the symbol to be used when a classification has not been specified for the data. The default value is a small dot ".".

xlim, ylim

Arguments specifying bounds for the abscissa (xlim) and ordinate (ylim) of the plot. This may be useful when comparing plots.
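The grouping described for the quantiles argument can be illustrated with a small base R sketch. It is only an illustration of splitting uncertainty values at two quantiles, not code taken from randProj; the uncertainty values are simulated and the group labels are hypothetical.

```r
# Sketch only: split per-observation uncertainties into three groups
# at the 0.75 and 0.95 quantiles. The values below are simulated for
# illustration; they do not come from a fitted mixture model.
set.seed(1)
uncertainty <- runif(100)          # made-up uncertainties in [0, 1]
quantiles <- c(0.75, 0.95)         # same values as the documented default

# Cut points at the requested quantiles of the uncertainty values
breaks <- quantile(uncertainty, probs = quantiles)

# Three groups: below the first quantile, between the two, above the second
group <- cut(uncertainty, breaks = c(-Inf, breaks, Inf),
             labels = c("low", "medium", "high"))
table(group)
```

Each group could then be drawn with its own symbol size, which is the idea behind the small, medium, and large symbols described above.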

Value

Random projections of the data, possibly showing location of the mixture components, classification, uncertainty, and classification errors.
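For readers unfamiliar with the idea, the following base R sketch shows one common way to obtain a seeded random two-dimensional projection: draw a random Gaussian matrix and orthonormalise its columns. This is an assumption made for illustration and is not necessarily how randProj constructs its projections; random_projection_2d is a hypothetical helper name.

```r
# Sketch only: a seeded random projection onto two orthonormal directions.
# random_projection_2d is a hypothetical helper, not part of mclust.
random_projection_2d <- function(data, seed = 0) {
  data <- as.matrix(data)
  d <- ncol(data)
  stopifnot(d > 2)                     # projections are for data in > 2 dimensions
  set.seed(seed)                       # the same seed gives the same projection
  G <- matrix(rnorm(d * 2), nrow = d)  # random d x 2 Gaussian matrix
  Q <- qr.Q(qr(G))                     # orthonormalise its two columns
  data %*% Q                           # n x 2 matrix of projected coordinates
}

irisMatrix <- as.matrix(iris[, 1:4])
proj <- random_projection_2d(irisMatrix, seed = 0)
dim(proj)
```

The resulting two-column matrix can be passed to plot() to view a single projection; changing the seed gives a different view of the same data.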

References

C. Fraley and A. E. Raftery (2002a). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97:611-631. See http://www.stat.washington.edu/mclust.

C. Fraley and A. E. Raftery (2002b). MCLUST: Software for model-based clustering, density estimation and discriminant analysis. Technical Report, Department of Statistics, University of Washington. See http://www.stat.washington.edu/mclust.

Examples

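library(mclust)  # provides randProj, mstepVVV and unmap

data(iris)
irisMatrix <- as.matrix(iris[,1:4])
irisClass <- iris[,5]

# M-step for the VVV model, with the known classes as an indicator matrix
msEst <- mstepVVV(irisMatrix, unmap(irisClass))

# Square plotting regions in a 2 x 3 grid, one panel per seed
par(pty = "s", mfrow = c(2,3))
randProj(irisMatrix, seeds = 0:5, truth = irisClass, 
         mu = msEst$mu, sigma = msEst$sigma, z = msEst$z)

# Alternative call that passes every component of msEst as an argument
do.call("randProj", c(list(data = irisMatrix, seeds = 0:5, truth = irisClass),
                      msEst))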