SCGLR (version 3.0)

scglr: Function that fits the scglr model

Description

Calculates the components to predict all the dependent variables.

Usage

scglr(formula, data, family, K = 1, size = NULL, weights = NULL,
  offset = NULL, subset = NULL, na.action = na.omit, crit = list(),
  method = methodSR())

Arguments

formula

an object of class MultivariateFormula (or one that can be coerced to that class): a symbolic description of the model to be fitted.

data

a data frame to be modeled.

family

a character vector with one entry per dependent variable. Allowed values are "bernoulli", "binomial", "poisson" and "gaussian".

K

number of components, default is one.

size

describes the number of trials for the binomial dependent variables. A (number of statistical units * number of binomial dependent variables) matrix is expected.

weights

weights on individuals (not available for now)

offset

used for the poisson dependent variables. A vector or a matrix of size: number of observations * number of Poisson dependent variables is expected.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set to na.omit.

crit

a list of two elements, maxit and tol, giving respectively the maximum number of iterations and the convergence tolerance for the Fisher scoring algorithm. The defaults are 50 and 10e-6 respectively.

method

structural relevance criterion. Object of class "method.SCGLR" built by methodSR for Structural Relevance.
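
The shapes expected for family, size, offset and crit can be easier to see in code. The following is a minimal base R sketch that only builds the argument objects and does not call scglr. The variable names (ny, fam, size_mat, area, offset_mat) and all values are hypothetical and chosen for illustration.

```r
# Illustrative dimensions; every name and value here is hypothetical.
n  <- 5                                   # number of statistical units
ny <- c("y_pois", "y_binom", "y_gauss")   # three dependent variables

# family: one entry per dependent variable, in the same order as ny
fam <- c("poisson", "binomial", "gaussian")
stopifnot(length(fam) == length(ny),
          all(fam %in% c("bernoulli", "binomial", "poisson", "gaussian")))

# size: n x (number of binomial dependent variables) matrix of trial counts
size_mat <- matrix(rep(10, n * sum(fam == "binomial")), nrow = n)

# offset: a vector of length n, or an
# n x (number of Poisson dependent variables) matrix
area <- c(1.0, 2.5, 0.8, 1.2, 3.0)        # e.g. a sampled surface per unit
offset_mat <- matrix(area, nrow = n, ncol = sum(fam == "poisson"))

# crit: iteration cap and tolerance for the Fisher scoring algorithm
crit <- list(maxit = 100, tol = 1e-8)
```

Objects built this way would then be supplied as family = fam, size = size_mat, offset = offset_mat and crit = crit.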

Value

an object of the SCGLR class.

The function summary (i.e., summary.SCGLR) can be used to obtain or print a summary of the results.

The generic accessor function coef can be used to extract various useful features of the value returned by scglr.

An object of class "SCGLR" is a list containing the following components:

u

matrix of size (number of regressors * number of components), contains the component-loadings, i.e. the coefficients of the regressors in the linear combination giving each component.

comp

matrix of size (number of statistical units * number of components) having the components as column vectors.

compr

matrix of size (number of statistical units * number of components) having the standardized components as column vectors.

gamma

list with one element per dependent variable. Each element is a matrix of coefficients, standard errors, z-values and p-values.

beta

matrix of size ((number of regressors + 1 for the intercept) * number of dependent variables), contains the coefficients of the regression on the original regressors X.

lin.pred

data.frame of size (number of statistical units * number of dependent variables), the fitted linear predictor.

xFactors

data.frame containing the nominal regressors.

xNumeric

data.frame containing the quantitative regressors.

inertia

matrix of size (number of components * 2), contains the percentage and cumulative percentage of the overall regressors' variance, captured by each component.

logLik

vector of length (number of dependent variables), gives the log-likelihood of the model of each \(y_k\)'s GLM on the components.

deviance.null

vector of length (number of dependent variables), gives the deviance of the null model of each \(y_k\)'s GLM on the components.

deviance.residual

vector of length (number of dependent variables), gives the deviance of the model of each \(y_k\)'s GLM on the components.
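
One common way to read deviance.null and deviance.residual together is as the share of null deviance accounted for by the components, computed per dependent variable. The sketch below uses a hypothetical stand-in list rather than a real fit, so the object fit, the names gen1 and gen2, and all numbers are made up for illustration. This ratio is a generic GLM summary, not something the package documentation states it reports.

```r
# Hypothetical stand-in for a fitted SCGLR object; values are invented.
fit <- list(
  deviance.null     = c(gen1 = 120, gen2 = 80),
  deviance.residual = c(gen1 = 90,  gen2 = 20)
)

# Share of the null deviance accounted for by the components,
# one value per dependent variable
dev_explained <- 1 - fit$deviance.residual / fit$deviance.null
print(round(dev_explained, 2))
```

With a real fit, the same expression would be applied to the deviance.null and deviance.residual components of the object returned by scglr.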

References

Bry X., Trottier C., Verron T. and Mortier F. (2013) Supervised Component Generalized Linear Regression using a PLS-extension of the Fisher scoring algorithm. Journal of Multivariate Analysis, 119, 47-60.

Examples

# NOT RUN {
library(SCGLR)

# load sample data
data(genus)

# get variable names from dataset
n <- names(genus)
ny <- n[grep("^gen", n)]    # Y <- names that begin with "gen"
nx <- n[-grep("^gen", n)]   # X <- remaining names

# remove "geology" and "surface" from nx
# as surface is offset and we want to use geology as additional covariate
nx <- nx[!nx %in% c("geology", "surface")]

# build multivariate formula
# we also add "lat*lon" as computed covariate
form <- multivariateFormula(ny,c(nx,"I(lat*lon)"),A=c("geology"))

# define family
fam <- rep("poisson",length(ny))

genus.scglr <- scglr(formula = form, data = genus, family = fam, K = 4,
  offset = genus$surface)

summary(genus.scglr)
# }