polycor (version 0.8-1)

hetcor: Heterogeneous Correlation Matrix

Description

hetcor computes a heterogeneous correlation matrix, consisting of Pearson product-moment correlations between numeric variables, polyserial correlations between numeric and ordinal variables, and polychoric correlations between ordinal variables.

The detectCores function is imported from the parallel package and re-exported.

Usage

hetcor(data, ..., ML = FALSE, std.err = TRUE, 
  use=c("complete.obs", "pairwise.complete.obs"), 
  bins=4, pd=TRUE, parallel=FALSE, ncores=detectCores(logical=FALSE),
  thresholds=FALSE)
# S3 method for data.frame
hetcor(data, ML = FALSE, std.err = TRUE, 
  use = c("complete.obs", "pairwise.complete.obs"), 
  bins=4, pd=TRUE, parallel=FALSE, ncores=detectCores(logical=FALSE), 
  thresholds=FALSE, ...)
# S3 method for default
hetcor(data, ..., ML = FALSE, std.err = TRUE, 
  use=c("complete.obs", "pairwise.complete.obs"), 
  bins=4, pd=TRUE, parallel=FALSE, ncores=detectCores(logical=FALSE),
  thresholds=FALSE)
# S3 method for hetcor
print(x, digits = max(3, getOption("digits") - 3), ...)
# S3 method for hetcor
as.matrix(x, ...)
detectCores(all.tests=FALSE, logical=TRUE)

Arguments

data

a data frame consisting of factors, ordered factors, logical variables, character variables, and/or numeric variables, or the first of several variables.

...

variables and/or arguments to be passed down.

ML

if TRUE, compute maximum-likelihood estimates; if FALSE, compute quick two-step estimates.

std.err

if TRUE, compute standard errors.

bins

number of bins to use for continuous variables in testing bivariate normality; the default is 4.

pd

if TRUE and if the correlation matrix is not positive-definite, an attempt will be made to adjust it to a positive-definite matrix, using the nearPD function in the Matrix package. Note that default arguments to nearPD are used (except corr=TRUE); for more control call nearPD directly.

parallel

if TRUE (the default is FALSE), perform parallel computations on a computer with multiple CPUs/cores.

ncores

the number of cores to use for parallel computations; the default is the number of physical cores detected.

use

if "complete.obs", remove observations with any missing data; if "pairwise.complete.obs", compute each correlation using all observations with valid data for that pair of variables.

thresholds

if TRUE (the default is FALSE), include the estimated thresholds for polyserial and polychoric correlation in the returned object.

x

an object of class "hetcor" to be printed, or from which to extract the correlation matrix.

digits

number of significant digits.

all.tests

logical, apply all known tests; default is FALSE.

logical

if TRUE, detect logical CPUs/cores; if FALSE, detect physical CPUs/cores.
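
The pd argument notes that nearPD can be called directly for more control. A minimal sketch of such a call follows; the matrix R_bad is made up for illustration and does not come from hetcor, and maxit = 200 is an arbitrary choice used only to show that nearPD arguments other than corr can be set.

```r
# Sketch only: R_bad is a hypothetical, non-positive-definite "correlation" matrix.
library(Matrix)

R_bad <- matrix(c( 1.0,  0.9, -0.9,
                   0.9,  1.0,  0.9,
                  -0.9,  0.9,  1.0), nrow = 3, byrow = TRUE)

# A negative smallest eigenvalue means R_bad is not positive-definite.
min(eigen(R_bad, symmetric = TRUE)$values)

# corr = TRUE asks nearPD() for a correlation matrix (unit diagonal);
# other arguments, here maxit, can be tuned in a direct call.
fit <- nearPD(R_bad, corr = TRUE, maxit = 200)
R_pd <- as.matrix(fit$mat)

# The adjusted matrix should have a unit diagonal and positive eigenvalues.
diag(R_pd)
min(eigen(R_pd, symmetric = TRUE)$values)
```

The adjusted matrix can differ noticeably from the input, so it may be worth comparing R_pd with R_bad before using it further.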

Value

hetcor returns an object of class "hetcor" with the following components:

correlations

the correlation matrix.

type

the type of each correlation: "Pearson", "Polychoric", or "Polyserial".

std.errors

the standard errors of the correlations, if requested.

n

the number (or numbers) of observations on which the correlations are based.

tests

p-values for tests of bivariate normality for each pair of variables.

NA.method

the method by which any missing data were handled: "complete.obs" or "pairwise.complete.obs".

ML

TRUE for ML estimates, FALSE for two-step estimates.

thresholds

optionally, according to the thresholds argument, a matrix of mode list with a list of thresholds for each polyserial and polychoric correlation in the elements below the diagonal and the type of each correlation (Pearson, polyserial, or polychoric) above the diagonal.
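
The components listed above can be read off the returned object with the usual list accessors. The sketch below assumes the polycor package is installed; the data frame toy and its variables x and y are invented for illustration.

```r
# Sketch only: toy, x, and y are hypothetical illustration data.
if (requireNamespace("polycor", quietly = TRUE)) {
  set.seed(1)
  z <- rnorm(500)
  toy <- data.frame(
    x = z + rnorm(500),                              # numeric variable
    y = cut(z + rnorm(500), c(-Inf, -0.5, 0.5, Inf), # ordinal variable
            ordered_result = TRUE)
  )

  hc <- polycor::hetcor(toy)   # two-step estimates with standard errors

  print(hc$correlations)  # the correlation matrix
  print(hc$type)          # type of each correlation
  print(hc$std.errors)    # standard errors
  print(hc$n)             # number of observations used
  print(hc$tests)         # p-values for tests of bivariate normality
  print(hc$NA.method)     # how missing data were handled
  print(hc$ML)            # FALSE here, since ML was not requested

  # as.matrix() extracts the correlation matrix from the object.
  print(as.matrix(hc))
}
```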

Warning

Be careful with character variables (as opposed to factors), the values of which are ordered alphabetically. Thus, e.g., the values "disagree", "neutral", "agree" are ordered "agree", "disagree", "neutral".
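
The alphabetical ordering can be checked in base R, and one way to avoid it is to convert the character variable to an ordered factor with explicit levels before calling hetcor. The vector resp below is made up for illustration.

```r
# Sketch only: resp is a hypothetical survey response vector.
resp <- c("disagree", "neutral", "agree", "neutral", "agree")

# Default conversion sorts the values alphabetically.
levels(factor(resp))
# "agree" "disagree" "neutral"

# Stating the levels explicitly preserves the intended order.
resp_ord <- factor(resp,
                   levels = c("disagree", "neutral", "agree"),
                   ordered = TRUE)
levels(resp_ord)
# "disagree" "neutral" "agree"

# The underlying integer codes now follow the substantive ordering.
as.integer(resp_ord)
# 1 2 3 2 3
```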

References

Drasgow, F. (1986) Polychoric and polyserial correlations. Pp. 68-74 in S. Kotz and N. Johnson, eds., The Encyclopedia of Statistics, Volume 7. Wiley.

Olsson, U. (1979) Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika 44, 443-460.

Rodriguez, R.N. (1982) Correlation. Pp. 193-204 in S. Kotz and N. Johnson, eds., The Encyclopedia of Statistics, Volume 2. Wiley.

Ghosh, B.K. (1966) Asymptotic expansion for the moments of the distribution of correlation coefficient. Biometrika 53, 258-262.

Olkin, I., and Pratt, J.W. (1958) Unbiased estimation of certain correlation coefficients. Annals of Mathematical Statistics 29, 201-211.

See Also

polychor, polyserial, nearPD, detectCores

Examples

# NOT RUN {
if(require(mvtnorm)){
    set.seed(12345)
    R <- matrix(0, 4, 4)
    R[upper.tri(R)] <- runif(6)
    diag(R) <- 1
    R <- cov2cor(t(R) %*% R)
    round(R, 4)  # population correlations
    data <- rmvnorm(1000, rep(0, 4), R)
    round(cor(data), 4)   # sample correlations
    }
if(require(mvtnorm)){
    x1 <- data[,1]
    x2 <- data[,2]
    y1 <- cut(data[,3], c(-Inf, .75, Inf))
    y2 <- cut(data[,4], c(-Inf, -1, .5, 1.5, Inf))
    data <- data.frame(x1, x2, y1, y2)
    hetcor(data)  # Pearson, polychoric, and polyserial correlations, 2-step est.
    }
if(require(mvtnorm)){
    hetcor(x1, x2, y1, y2, ML=TRUE) # Pearson, polychoric, polyserial correlations, ML est.
    }

# }
# NOT RUN {
    hc <- hetcor(data, ML=TRUE)
      # parallel computation:
    hc.m <- hetcor(data, ML=TRUE, parallel=TRUE,
                   ncores=min(2, detectCores()))
    hc.m
    all.equal(hc, hc.m)
    
      # error handling:
    data$y1[data$y2 == "(0.5,1.5]"] <- NA
    hetcor(data)
    
# }
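
The difference between the two settings of the use argument can be illustrated with base R's cor, which accepts the same two values. This is only an analogy for numeric data, not hetcor's own code; the data frame d and its columns are invented for illustration.

```r
# Sketch only: d is hypothetical data with missing values in two columns.
set.seed(42)
d <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10))
d$a[1:2]  <- NA
d$c[9:10] <- NA

# "complete.obs": rows with any missing value are dropped first,
# so every correlation is based on the same rows.
n_complete <- sum(complete.cases(d))
cor_complete <- cor(d, use = "complete.obs")

# "pairwise.complete.obs": each correlation uses all rows that are
# complete for that particular pair of variables.
ok <- !is.na(d)
n_pair <- crossprod(ok * 1)   # number of usable rows for each pair
cor_pairwise <- cor(d, use = "pairwise.complete.obs")

n_complete
n_pair
cor_complete
cor_pairwise
```

With pairwise deletion the entries of the matrix can rest on different numbers of observations, which is why the n component may hold several numbers rather than one.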
