prinsimp (version 0.8-8)

simpart: Simple Partition

Description

simpart partitions a $d$-dimensional sample space into two orthogonal subspaces: a simpledim-dimensional nearly null space and a $(d-simpledim)$-dimensional model space. It provides an orthonormal basis for each subspace. The nearly null space basis is defined in terms of a simplicity measure and is ordered from simplest to least simple. The model space basis is made up of the leading eigenvectors of the covariance matrix and is ordered by proportion of variance explained.

Returns the result as an object of class simpart.
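
The partition described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data are simulated, the names model_basis and null_basis are hypothetical, and the trailing eigenvectors stand in for the nearly null space (simpart re-expresses that subspace in a simplicity basis, which is not reproduced here).

```r
# Simulated data for illustration only (not from the package)
set.seed(1)
y <- matrix(rnorm(200), nrow = 50, ncol = 4)

# Covariance matrix with divisor n
S <- crossprod(scale(y, center = TRUE, scale = FALSE)) / nrow(y)

d <- ncol(S)
simpledim <- 1

# eigen() returns eigenvalues in decreasing order
eig <- eigen(S, symmetric = TRUE)

# Model space: leading (d - simpledim) eigenvectors
model_basis <- eig$vectors[, seq_len(d - simpledim), drop = FALSE]

# Nearly null space: spanned by the remaining eigenvectors
null_basis <- eig$vectors[, d - simpledim + seq_len(simpledim), drop = FALSE]

# The two subspaces are orthogonal: all cross products are numerically zero
max(abs(crossprod(model_basis, null_basis)))

# Percent of variance explained by each model basis vector
varperc_model <- 100 * eig$values[seq_len(d - simpledim)] / sum(eig$values)
```

The sketch only shows how the two subspaces relate to the eigendecomposition; ordering the nearly null space basis by simplicity requires the simplicity measure itself.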

Usage

simpart(y, simpledim, ...)

## S3 method for class 'formula'
simpart(formula, simpledim, data = NULL, ...)

## Default S3 method:
simpart(y, simpledim, measure = c('first', 'second', 'periodic'), x = seq(d), cov = FALSE, reverse = rep(FALSE, d), na.action, ...)

Arguments

formula
a formula with no response variable, referring only to numeric variables.
y
a matrix or data frame that specifies the data, or a covariance matrix. A data matrix has $d$ columns; a covariance matrix is $d \times d$.
simpledim
the dimension of the nearly null space of the covariance matrix. It is equal to $d$ minus the dimension of the model space.
measure
a function that calculates a simplicity measure of a vector, based on a non-negative definite symmetric matrix Lambda. There are three built-in simplicity measures, specified by 'first', 'second', or 'periodic', which correspond to first divided difference, second divided difference, and periodic simplicity respectively. The argument measure can also take a user-specified function.
data
an optional data frame (or similar: see model.frame) containing the variables in formula. By default the variables are taken from environment(formula).
x
a vector of independent variable values (for functional data), length equal to $d$, the number of columns of y. If not supplied, a sequence from 1 to $d$ is used.
cov
a logical value. If true, y is assumed to be a $d \times d$ covariance matrix. If false, y is assumed to be an $n \times d$ data matrix, which simpart uses to calculate a $d \times d$ covariance matrix.
reverse
a logical vector of length $d$. If the i-th element is true, the i-th basis vector is "reversed" by multiplication by -1. Basis vectors are arranged with the model basis first, then the simplicity basis. If reverse is shorter than $d$, the remaining entries are assumed to be false and the corresponding basis vectors remain unchanged.
na.action
specifies how missing data should be treated.
...
arguments passed to or from other methods. If y is a formula, cov or reverse can be specified here. If "periodic" is chosen as the measure, period is specified here as a numeric value. If measure is user-specified, its arguments are passed here.
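
The behaviour described for reverse (sign flips, with a short vector padded by FALSE) can be illustrated in base R. This is a sketch of the documented semantics only, not the package's code: the matrix S and the names reverse_full, signs, and flipped are made up for the example.

```r
# Toy orthonormal basis: eigenvectors of a small symmetric matrix (illustrative)
S <- matrix(c(2, 1, 0,
              1, 2, 1,
              0, 1, 2), nrow = 3, byrow = TRUE)
basis <- eigen(S, symmetric = TRUE)$vectors
d <- ncol(basis)

# A reverse vector shorter than d: missing entries are treated as FALSE
reverse <- c(TRUE, FALSE)
reverse_full <- c(reverse, rep(FALSE, d - length(reverse)))

# Multiply column i by -1 where reverse_full[i] is TRUE
signs <- ifelse(reverse_full, -1, 1)
flipped <- sweep(basis, 2, signs, `*`)
```

Flipping the sign of a basis vector leaves the basis orthonormal and the spanned subspace unchanged, so reverse only affects how the vectors are displayed and how the scores are signed.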

Value

simpart returns a list with class "simpart" containing the following components:
model
a $d \times (d-simpledim)$ matrix with columns containing the basis of the model space, that is, the first $(d-simpledim)$ eigenvectors of the covariance matrix. Basis vectors are arranged in descending order of eigenvalue, that is, in descending order of the proportion of variance explained.
simple
a $d \times simpledim$ matrix with columns containing the simplicity basis of the nearly null space. Basis vectors are arranged in descending order of simplicity.
variance
list of three components:
model
variances associated with the vectors in the model basis.
simple
variances associated with the vectors in the simplicity basis of the nearly null space.
full
variances associated with eigenvectors of the covariance matrix, that is, its eigenvalues.
simplicity
list of three components:
model
simplicity values of the vectors in the model basis.
simple
simplicity values of the vectors in the simplicity basis of the nearly null space.
full
simplicity values of the simplicity basis when simpledim=d.
call
the matched call
measure
the simplicity measure used: "first", "second", "periodic", or a user-specified measure function
varperc
the percent of variance explained by the corresponding basis vector, as a list of two components:
model
percent of variance explained by the vectors in the model basis.
simple
percent of variance explained by the vectors in the simplicity basis of the nearly null space.
scores
if y is the data matrix, the scores on the basis vector loadings.

Details

simpart is a generic function with "formula" and "default" methods.

simpart implements a method described in Gaydos et al. (2013).

When cov=FALSE, the covariance matrix is calculated using the data matrix y. The calculation uses divisor $n$, the number of rows of y.
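
A covariance matrix with divisor $n$ differs from the one returned by stats::cov, which uses divisor $n-1$. The base R sketch below shows the relationship on simulated data; it assumes column-wise mean centering and is not taken from the package source.

```r
# Simulated n x d data matrix (illustrative only)
set.seed(42)
y <- matrix(rnorm(60), nrow = 10, ncol = 6)
n <- nrow(y)

# Center each column, then divide the cross-product matrix by n
centered <- sweep(y, 2, colMeans(y))
S_n <- crossprod(centered) / n

# stats::cov divides by n - 1, so rescale it for comparison
S_unbiased <- cov(y)
all.equal(S_n, S_unbiased * (n - 1) / n)
```

If a covariance matrix computed elsewhere with divisor $n-1$ is passed with cov=TRUE, its eigenvalues differ from the divisor-$n$ version by the factor $n/(n-1)$, while the eigenvectors are the same.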

References

T.L. Gaydos, N.E. Heckman, M. Kirkpatrick, J.R. Stinchcombe, J. Schmitt, J. Kingsolver, J.S. Marron. (2013). Visualizing genetic constraints. Annals of Applied Statistics 7: 860-882.

See Also

summary.simpart, plot.simpart

Examples

library(prinsimp)
require(graphics)

## Caterpillar data: estimated covariance from Kingsolver et al (2004)
## Measurements are at temperatures 11, 17, 23, 29, 35, 40
data(caterpillar)

## Analyze 5-dimensional model space, 1-dimensional nearly null space
## First divided difference simplicity measure
## This call uses the default x = 1:6; x should be specified (see next call)
simpart(caterpillar, simpledim=1, cov=TRUE)

simpart(caterpillar, simpledim=1,
        x=c(11, 17, 23, 29, 35, 40), cov=TRUE)

## Second divided difference simplicity measure and 3-dimensional model space
simpart(caterpillar, simpledim=3, measure="second",
        x=c(11, 17, 23, 29, 35, 40), cov=TRUE)
