analyzer (version 1.0.1)

QQassociation: Association (Correlation) between Categorical Variables

Description

QQassociation computes association measures between all pairs of variables in a data frame that contains only categorical columns.

Usage

QQassociation(factb, use = "everything", methods_used)

Arguments

factb

a data frame in which all columns are categorical. It should have at least two columns.

use

an optional character string giving a method for computing association in the presence of missing values. This must be one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs", or an abbreviation of one of them. If use is "everything", NAs will propagate conceptually, i.e., a resulting value will be NA whenever one of its contributing observations is NA. If use is "all.obs", then the presence of missing observations will produce an error. If use is "complete.obs", then missing values are handled by casewise deletion (and if there are no complete cases, that gives an error). "na.or.complete" is the same as "complete.obs", except that it gives NA when there are no complete cases.
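
As an illustration of what casewise deletion means here, the following is a minimal base R sketch, not the package's own code. The data frame `df` is hypothetical, and the pairwise part assumes that "pairwise.complete.obs" behaves as it does in `stats::cor`, which this page does not confirm.

```r
# Hypothetical categorical data with missing values (illustration only)
df <- data.frame(
  a = factor(c("x", "y", NA, "x", "y")),
  b = factor(c("p", "q", "q", NA, "p")),
  c = factor(c("u", "u", "v", "v", "u"))
)

# Casewise deletion ("complete.obs"): drop every row that has an NA
# in any column before computing any association
complete_rows <- df[complete.cases(df), ]
n_complete <- nrow(complete_rows)

# Pairwise deletion (assumed meaning of "pairwise.complete.obs"):
# for the pair (a, c), drop only rows where a or c is NA
pair_ac <- df[complete.cases(df[, c("a", "c")]), c("a", "c")]
n_pair_ac <- nrow(pair_ac)
```

Under casewise deletion every pair of variables is evaluated on the same rows, whereas pairwise deletion can use a different number of rows for each pair.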

methods_used

a square data.frame that stores the type of association used between each pair of variables. Its dimensions are (number of variables) x (number of variables).

Value

a list of two tables, each with the number of rows and columns equal to the number of columns in factb:

chisq

Table containing p-values of chi-square test

cramers

Table containing Cramer's V

Details

This function measures the association between categorical variables using Pearson's chi-squared test. It also returns Cramer's V, a measure of association between two nominal variables that takes values between 0 and 1 (inclusive). A higher value indicates a stronger association. Note that, unlike the Pearson correlation, Cramer's V is never negative.

Cramer's V is related to the chi-squared statistic by

$$V = \sqrt{\frac{\chi^2}{n \cdot \min(k-1, r-1)}}$$

where:

χ²

is derived from Pearson's chi-squared test

n

is the grand total of observations

k

is the number of columns of the contingency table

r

is the number of rows of the contingency table

The p-value for the significance of Cramer's V is the same as the one calculated by Pearson's chi-squared test.
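
The formula above can be checked by hand for a single pair of variables. The following is a minimal base R sketch using `chisq.test`, not the package's internal implementation. The choice of `mtcars` columns and the setting `correct = FALSE` are assumptions made for illustration, and QQassociation may handle details such as continuity correction differently.

```r
# Illustrative data: two columns of mtcars treated as categorical
x <- factor(mtcars$cyl)
y <- factor(mtcars$am)

# r x k contingency table of the two variables
tab <- table(x, y)

# Pearson's chi-squared test without continuity correction (assumption);
# warnings about small expected counts are suppressed for this sketch
test <- suppressWarnings(chisq.test(tab, correct = FALSE))

n <- sum(tab)   # grand total of observations
r <- nrow(tab)  # number of rows of the table
k <- ncol(tab)  # number of columns of the table

# Cramer's V from the chi-squared statistic
cramers_v <- sqrt(unname(test$statistic) / (n * min(k - 1, r - 1)))

# The p-value attached to V is the p-value of the chi-squared test
p_value <- test$p.value
```

Here `cramers_v` and `p_value` correspond to one off-diagonal cell of the cramers and chisq tables, respectively, under the assumptions stated above.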

See Also

association for association between variables of any type, CCassociation for association between continuous (numeric) variables, CQassociation for association between continuous and categorical variables