pcaPP (version 2.0-5)

data.Zou: Test Data Generation for Sparse PCA examples

Description

Draws a sample data set, as introduced by Zou et al. (2006).

Usage

data.Zou (n = 250, p = c(4, 4, 2), ...)

Value

A matrix of dimension n x sum (p) containing the generated sample data set.

Arguments

n

The required number of observations.

p

A vector of length 3, specifying how many variables shall be constructed using the three factors V1, V2 and V3.

...

Further arguments passed to or from other functions.

Author

Heinrich Fritz, Peter Filzmoser <P.Filzmoser@tuwien.ac.at>

Details

This data set was introduced by Zou et al. (2006) and has since been used in several other papers, e.g. by Farcomeni (2009), Guo et al. (2010) and Croux et al. (2011).

The data set contains two latent factors V1 ~ N(0, 290) and V2 ~ N(0, 300) and a third mixed component V3 = -0.3 V1 + 0.925 V2 + e; e ~ N(0, 1).
The ten variables Xi of the original data set are constructed as follows:
Xi = V1 + ei; i = 1, 2, 3, 4
Xi = V2 + ei; i = 5, 6, 7, 8
Xi = V3 + ei; i = 9, 10
where the ei ~ N(0, 1) are independent for i = 1, ..., 10.
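
The construction above can be sketched in a few lines of base R. This is only an illustration of the model as described, not the package's own implementation: the helper name sim_zou is hypothetical, and the sketch assumes that the second parameter of N(., .) is a variance (as in Zou et al., 2006) and that V1, V2, e and the ei are mutually independent.

```r
# Hypothetical helper (not part of pcaPP): simulates the model described above.
sim_zou <- function(n = 250, p = c(4, 4, 2)) {
  stopifnot(length(p) == 3)
  # Latent factors; N(0, 290) is read as variance 290, hence sd = sqrt(290)
  v1 <- rnorm(n, mean = 0, sd = sqrt(290))
  v2 <- rnorm(n, mean = 0, sd = sqrt(300))
  v3 <- -0.3 * v1 + 0.925 * v2 + rnorm(n)
  # Repeat factor j in p[j] columns, then add independent N(0, 1) noise
  factors <- cbind(v1, v2, v3)[, rep(1:3, times = p), drop = FALSE]
  x <- factors + matrix(rnorm(n * sum(p)), nrow = n)
  colnames(x) <- paste0("X", seq_len(sum(p)))
  x
}

set.seed(1)
x <- sim_zou()
dim(x)  # 250 rows, sum(p) = 10 columns
```

Under the same independence assumption, Var(V3) = 0.09 * 290 + 0.855625 * 300 + 1 = 283.7875, so all three groups of variables have variances of similar magnitude (291, 301 and 284.7875 once the noise ei is added).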

References

C. Croux, P. Filzmoser, H. Fritz (2011). Robust Sparse Principal Component Analysis Based on Projection-Pursuit. To appear.

A. Farcomeni (2009). An exact approach to sparse principal component analysis, Computational Statistics, Vol. 24(4), pp. 583-604.

J. Guo, G. James, E. Levina, G. Michailidis, and J. Zhu (2010). Principal component analysis with sparse fused loadings, Journal of Computational and Graphical Statistics. To appear.

H. Zou, T. Hastie, R. Tibshirani (2006). Sparse principal component analysis, Journal of Computational and Graphical Statistics, Vol. 15(2), pp. 265-286.

See Also

sPCAgrid, princomp

Examples

  library (pcaPP)

  ## data generation
  set.seed (0)
  x <- data.Zou ()

  ## applying PCA
  pc <- princomp (x)
  ## the corresponding non-sparse loadings
  unclass (pc$loadings[, 1:3])
  pc$sdev[1:3]

  ## lambda as calculated in the opt.TPO example
  lambda <- c (0.23, 0.34, 0.005)
  ## applying sparse PCA
  spc <- sPCAgrid (x, k = 3, lambda = lambda, method = "sd")
  unclass (spc$loadings)
  spc$sdev[1:3]

  ## comparing the non-sparse and sparse biplots
  par (mfrow = 1:2)
  biplot (pc, main = "non-sparse PCs")
  biplot (spc, main = "sparse PCs")