logisticPCA (version 0.2)

convexLogisticPCA: Convex Logistic Principal Component Analysis

Description

Dimensionality reduction for binary data by extending Pearson's PCA formulation to minimize Binomial deviance. The optimization is carried out over the Fantope, the convex relaxation of the set of projection matrices.
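
As an illustration of the relaxation, here is a minimal base R sketch of what membership in the rank-k Fantope means, assuming the standard definition (symmetric matrices with eigenvalues in [0, 1] and trace k). The helper is_in_fantope is hypothetical and is not part of logisticPCA.

```r
# Hypothetical helper (not part of logisticPCA): checks whether H lies in
# the rank-k Fantope, i.e. H is symmetric, has eigenvalues in [0, 1],
# and has trace k.
is_in_fantope <- function(H, k, tol = 1e-8) {
  if (!isTRUE(all.equal(H, t(H), tolerance = tol))) return(FALSE)
  ev <- eigen(H, symmetric = TRUE, only.values = TRUE)$values
  all(ev >= -tol) && all(ev <= 1 + tol) && abs(sum(diag(H)) - k) < tol
}

set.seed(1)
d <- 5
k <- 2

# Two rank-k projection matrices built from orthonormal bases
U1 <- qr.Q(qr(matrix(rnorm(d * k), d, k)))
U2 <- qr.Q(qr(matrix(rnorm(d * k), d, k)))
P1 <- U1 %*% t(U1)
P2 <- U2 %*% t(U2)

# Their average lies in the Fantope but is generally not a projection
H <- 0.5 * P1 + 0.5 * P2

in_P1 <- is_in_fantope(P1, k)
in_H <- is_in_fantope(H, k)
H_is_projection <- isTRUE(all.equal(H %*% H, H))
```

Every rank-k projection matrix is in the Fantope, and so is any convex combination of them, which is what makes the relaxed problem convex.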

Usage

convexLogisticPCA(x, k = 2, m = 4, quiet = TRUE, partial_decomp = FALSE, max_iters = 1000, conv_criteria = 1e-06, random_start = FALSE, start_H, mu, main_effects = TRUE, ss_factor = 4, weights, M)

Arguments

x
matrix with all binary entries
k
number of principal components to return
m
value to approximate the saturated model
quiet
logical; whether the calculation should give feedback
partial_decomp
logical; if TRUE, the function uses the rARPACK package to quickly initialize H when ncol(x) is large and k is small
max_iters
maximum number of iterations
conv_criteria
convergence criterion. The threshold on the difference in average deviance between successive iterations
random_start
logical; whether to randomly initialize the parameters. If FALSE, the function uses an eigen-decomposition as the starting value
start_H
starting value for the Fantope matrix
mu
main effects vector. Only used if main_effects = TRUE
main_effects
logical; whether to include main effects in the model
ss_factor
step size multiplier; the amount by which to multiply the step size. A quadratic convergence rate can be proven for ss_factor = 1, but I have found that higher values sometimes work better. The default is ss_factor = 4. If the algorithm is not converging, try ss_factor = 1.
weights
an optional matrix of the same size as x with non-negative weights
M
deprecated. Use m instead
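
To make the role of m more concrete, here is a small base R sketch. It assumes the construction described in the referenced paper, where the saturated model's natural parameters (infinite for binary data) are approximated by m * (2 * x - 1). The helper saturated_approx is hypothetical and only illustrates that assumption; it is not a function exported by the package.

```r
# Hypothetical helper: approximate saturated natural parameters,
# assuming the form m * (2 * x - 1), i.e. +m where x = 1 and -m where x = 0
saturated_approx <- function(x, m) m * (2 * x - 1)

x <- matrix(c(0, 1, 1, 0, 1, 1), nrow = 3)

# Largest gap between the implied probabilities and the data, for several m
m_values <- c(1, 4, 10)
gap <- sapply(m_values, function(m) {
  max(abs(plogis(saturated_approx(x, m)) - x))
})
```

Under this assumption, larger values of m bring the implied probabilities closer to the observed 0/1 entries, so gap shrinks as m grows.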

Value

An S3 object of class clpca, which is a list with the following components:
mu
the main effects
H
a rank k Fantope matrix
U
a ceiling(k)-dimensional orthonormal matrix with the loadings
PCs
the principal component scores
m
the value of m that was supplied
iters
number of iterations required for convergence
loss_trace
the trace of the average negative log likelihood using the Fantope matrix
proj_loss_trace
the trace of the average negative log likelihood using the projection matrix
prop_deviance_expl
the proportion of deviance explained by this model. If main_effects = TRUE, the null model is just the main effects, otherwise the null model estimates 0 for all natural parameters.
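
The following base R sketch shows how a proportion of deviance explained can be computed by hand. It assumes the usual definition, 1 - D(model) / D(null), and takes the main-effects null model to be one fitted probability per column; the package may estimate the main effects differently. The function binomial_deviance and the stand-in fitted probabilities are hypothetical and used only for illustration.

```r
# Hypothetical helper: Binomial deviance of binary data x
# under fitted probabilities p
binomial_deviance <- function(x, p) {
  -2 * sum(x * log(p) + (1 - x) * log(1 - p))
}

set.seed(1)
rows <- 100
cols <- 10

# Simulated binary data from a rank-one matrix on the logit scale
logits <- outer(rnorm(rows), rnorm(cols))
x <- (matrix(runif(rows * cols), rows, cols) <= plogis(logits)) * 1

# Assumed null model: one probability per column (main effects only)
p_null <- matrix(colMeans(x), rows, cols, byrow = TRUE)

# Stand-in for a fitted model: the probabilities that generated the data
p_model <- plogis(logits)

dev_null <- binomial_deviance(x, p_null)
prop_explained <- 1 - binomial_deviance(x, p_model) / dev_null
prop_null <- 1 - binomial_deviance(x, p_null) / dev_null
```

By construction the null model explains none of the deviance (prop_null is 0), and a model that fits better than the null gives a value between 0 and 1.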

References

Landgraf, A.J. & Lee, Y., 2015. Dimensionality reduction for binary data through the projection of natural parameters. arXiv preprint arXiv:1510.06112.

Examples

# load the package that provides inv.logit.mat and convexLogisticPCA
library(logisticPCA)

# construct a low-rank matrix on the logit scale
rows = 100
cols = 10
set.seed(1)
mat_logit = outer(rnorm(rows), rnorm(cols))

# generate a binary matrix
mat = (matrix(runif(rows * cols), rows, cols) <= inv.logit.mat(mat_logit)) * 1.0

# run convex logistic PCA on it
clpca = convexLogisticPCA(mat, k = 1, m = 4)