parameters (version 0.4.0)

principal_components: Principal Component Analysis (PCA)

Description

This function performs a principal component analysis (PCA) and returns the loadings as a dataframe.

Usage

principal_components(
  x,
  n = "auto",
  rotation = "none",
  sort = FALSE,
  threshold = NULL,
  standardize = TRUE,
  ...
)

Arguments

x

A dataframe or a statistical model.

n

Number of components to extract. If n="all", then n is set as the number of variables minus 1 (ncol(x)-1). If n="auto" (default) or n=NULL, the number of components is selected through n_factors. In reduce_parameters, can also be "max", in which case it will select all the components that are maximally pseudo-loaded (i.e., correlated) by at least one variable.

rotation

If not "none", the PCA will be computed using the psych package. Possible options include "varimax", "quartimax", "promax", "oblimin", "simplimax", and "cluster". See fa for details.

sort

Sort the loadings.

threshold

A value between 0 and 1 removes all loadings whose absolute value falls below it. An integer higher than 1 indicates the n strongest loadings to retain. Can also be "max", in which case only the maximum loading per variable is displayed (the simplest structure).
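As a rough illustration of the numeric (between 0 and 1) and "max" modes described above, the following base R sketch applies them to a plain loadings matrix. The helper apply_threshold is hypothetical and is not part of the parameters package; it also assumes that filtered loadings are replaced with NA.

```r
# Hypothetical helper (not part of 'parameters'): mimic two threshold modes
# on a numeric loadings matrix with variables in rows, components in columns.
apply_threshold <- function(loadings, threshold) {
  out <- loadings
  if (identical(threshold, "max")) {
    # Keep only the largest absolute loading of each variable (row)
    for (i in seq_len(nrow(out))) {
      keep <- which.max(abs(out[i, ]))
      out[i, -keep] <- NA
    }
  } else if (is.numeric(threshold) && threshold > 0 && threshold < 1) {
    # Blank out every loading whose absolute value is below the cut-off
    out[abs(out) < threshold] <- NA
  } else {
    stop("This sketch only covers a value between 0 and 1, or \"max\".")
  }
  out
}

# Toy loadings, invented for illustration only
m <- matrix(c(0.9, 0.1, -0.3, 0.8), nrow = 2, byrow = TRUE,
            dimnames = list(c("v1", "v2"), c("PC1", "PC2")))

apply_threshold(m, 0.2)    # drops the 0.1 loading of v1 on PC2
apply_threshold(m, "max")  # keeps one loading per variable
```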

standardize

A logical value indicating whether the variables should be standardized (centred and scaled) to have unit variance before the analysis takes place (in general, such scaling is advisable).
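To see why scaling is usually advisable, the sketch below compares a PCA on raw and on standardized variables. It uses prcomp from base R as a stand-in, which is an assumption made for illustration; it is not meant to reproduce the internals of principal_components.

```r
x <- mtcars[, 1:7]

# PCA on centred but unscaled data: variables measured on large scales
# (e.g. disp, hp) contribute most of the total variance
pca_raw <- prcomp(x, center = TRUE, scale. = FALSE)

# PCA on standardized data: every variable contributes unit variance
pca_std <- prcomp(x, center = TRUE, scale. = TRUE)

# Share of total variance carried by each component
prop_raw <- pca_raw$sdev^2 / sum(pca_raw$sdev^2)
prop_std <- pca_std$sdev^2 / sum(pca_std$sdev^2)

round(prop_raw, 3)
round(prop_std, 3)

# Loadings of the first component under both settings
round(cbind(raw = pca_raw$rotation[, 1], std = pca_std$rotation[, 1]), 3)
```

With standardized input the eigenvalues sum to the number of variables, so each component's variance can be read relative to "one variable's worth" of variance.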

...

Arguments passed to or from other methods.

Value

A data.frame of loadings.

Details

Complexity

Complexity represents the number of latent components needed to account for the observed variables. Whereas a perfect simple structure solution has a complexity of 1 in that each item would only load on one factor, a solution with evenly distributed items has a complexity greater than 1 (Hofmann, 1978; Pettersson and Turkheimer, 2010).
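A minimal sketch of one common way to compute such an index, assuming the complexity reported here follows Hofmann's (1978) formula, i.e. the squared sum of an item's squared loadings divided by the sum of its fourth-power loadings. The function name hofmann_complexity and the toy matrices are hypothetical.

```r
# Hypothetical helper: Hofmann's complexity index for each item (row)
# of a loadings matrix: (sum of a^2)^2 / (sum of a^4)
hofmann_complexity <- function(loadings) {
  rowSums(loadings^2)^2 / rowSums(loadings^4)
}

# Perfect simple structure: each item loads on exactly one component
simple <- matrix(c(0.8, 0.0,
                   0.0, 0.7), nrow = 2, byrow = TRUE)

# One item loading equally on two components
even <- matrix(c(0.5, 0.5), nrow = 1)

hofmann_complexity(simple)  # 1 for both items
hofmann_complexity(even)    # 2
```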

Uniqueness

Uniqueness represents the variance that is 'unique' to the variable and not shared with other variables. It is equal to 1 - communality (the variance that is shared with other variables). A uniqueness of 0.20 suggests that 20% of that variable's variance is not shared with other variables in the overall factor model. The greater the uniqueness, the lower the relevance of the variable in the factor model.
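The relationship between communality and uniqueness can be sketched with base R. This example assumes standardized variables and unrotated components, and uses prcomp rather than principal_components; the object names are illustrative only.

```r
x <- mtcars[, 1:5]
pca <- prcomp(x, center = TRUE, scale. = TRUE)

k <- 2  # number of retained components (chosen arbitrarily here)

# Loadings = eigenvectors scaled by the component standard deviations
loadings <- pca$rotation[, 1:k, drop = FALSE] %*% diag(pca$sdev[1:k], nrow = k)

# Communality: variance of each variable reproduced by the k components
communality <- rowSums(loadings^2)

# Uniqueness: the remainder
uniqueness <- 1 - communality

round(cbind(communality, uniqueness), 3)
```

If all components are retained (k equal to the number of variables), every communality equals 1 and every uniqueness equals 0.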

MSA

MSA represents the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (Kaiser and Rice, 1974) for each item. It indicates whether there is enough data for each factor to give reliable results for the PCA. The value should be > 0.6, and desirable values are > 0.8 (Tabachnick and Fidell, 2013).
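For intuition, the per-item measure can be sketched in base R using the usual Kaiser-Meyer-Olkin definition: the ratio of squared correlations to the sum of squared correlations and squared partial correlations. The function kmo_msa is a hypothetical helper written for this illustration; it is an assumption that the MSA column reported by the package matches this computation.

```r
# Hypothetical helper: per-item KMO measure of sampling adequacy
kmo_msa <- function(x) {
  r <- cor(x)
  r_inv <- solve(r)  # requires a non-singular correlation matrix

  # Partial correlations from the scaled inverse correlation matrix
  # (the sign is irrelevant here because the values are squared)
  d <- diag(1 / sqrt(diag(r_inv)))
  q <- d %*% r_inv %*% d

  # Only off-diagonal elements enter the ratio
  diag(r) <- 0
  diag(q) <- 0

  msa <- colSums(r^2) / (colSums(r^2) + colSums(q^2))
  names(msa) <- colnames(x)
  msa
}

msa <- kmo_msa(mtcars[, 1:7])
round(msa, 2)

# Items below the 0.6 guideline mentioned above, if any
names(msa)[msa < 0.6]
```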

PCA or FA?

There is a simplified rule of thumb that may help to decide whether to run a factor analysis or a principal component analysis:

  • Run factor analysis if you assume or wish to test a theoretical model of latent factors causing observed variables.

  • Run principal component analysis if you simply want to reduce your correlated observed variables to a smaller set of important independent composite variables.

(Source: CrossValidated)

References

  • Kaiser, H. F., and Rice, J. (1974). Little Jiffy, Mark IV. Educational and Psychological Measurement, 34(1), 111–117

  • Hofmann, R. (1978). Complexity and simplicity as objective indices descriptive of factor solutions. Multivariate Behavioral Research, 13:2, 247-250, 10.1207/s15327906mbr1302_9

  • Pettersson, E., & Turkheimer, E. (2010). Item selection, evaluation, and simple structure in personality data. Journal of research in personality, 44(4), 407-420, 10.1016/j.jrp.2010.03.002

  • Tabachnick, B. G., and Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Boston: Pearson Education.

Examples

# NOT RUN {
library(parameters)

principal_components(mtcars[, 1:7], n = "all", threshold = 0.2)
principal_components(mtcars[, 1:7], n = 2, rotation = "oblimin", threshold = "max", sort = TRUE)
principal_components(mtcars[, 1:7], n = 2, threshold = 2, sort = TRUE)

pca <- principal_components(mtcars[, 1:5], n = 2)
summary(pca)
predict(pca)
# }
# NOT RUN {
# Automated number of components
principal_components(mtcars[, 1:4], n = "auto")
# }