analyzer (version 1.0.1)

CQassociation: Association (Correlation) between Continuous-Categorical Variables

Description

CQassociation computes a measure of association between one categorical and one continuous variable.

Usage

CQassociation(
  numtb,
  factb,
  method3 = c("auto", "parametric", "non-parametric"),
  use = "everything",
  normality_test_method = c("ks", "anderson", "shapiro"),
  normality_test_pval,
  methodMat3 = NULL,
  methods_used
)

Arguments

numtb

a data frame with all the numerical columns. This should have at least two columns.

factb

a data frame with all the categorical columns. This should have at least two columns.

method3

method for association between continuous-categorical variables. Values can be "auto", "parametric", "non-parametric". See details for more information. "parametric" performs a t-test, while "non-parametric" performs a Mann-Whitney test.

use

an optional character string giving a method for computing association in the presence of missing values. This must be (complete or an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs". If use is "everything", NAs will propagate conceptually, i.e., a resulting value will be NA whenever one of its contributing observations is NA. If use is "all.obs", then the presence of missing observations will produce an error. If use is "complete.obs", then missing values are handled by casewise deletion (and if there are no complete cases, that gives an error). "na.or.complete" is the same, except that it gives NA instead of an error when there are no complete cases.

normality_test_method

takes values 'ks', 'anderson' or 'shapiro'. This parameter decides which test to perform for the normality check. See details of norm_test_fun for more information.

normality_test_pval

significance level for normality tests. Default is 0.05

methodMat3

a method data frame, like methodMats in the function association

methods_used

a square data.frame which will store the type of association used between the variables. Dimension will be number of variables * number of variables.
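
The option names listed for use are the same ones accepted by stats::cor in base R. As a rough illustration of what each option means, the sketch below applies them with cor to two numeric vectors. That CQassociation treats the options in exactly the same way is an assumption, not something this page confirms.

```r
# Two numeric vectors; x has one missing value.
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, 8, 10)

# "everything": the NA propagates, so the result is NA.
r_everything <- cor(x, y, use = "everything")

# "complete.obs": the incomplete pair (NA, 6) is dropped before computing.
r_complete <- cor(x, y, use = "complete.obs")

# "all.obs": a missing value is an error; capture it instead of stopping.
r_all_obs <- tryCatch(cor(x, y, use = "all.obs"), error = function(e) "error")

# "na.or.complete": like "complete.obs", but NA when no complete case exists.
r_none <- cor(c(NA_real_, NA_real_), c(1, 2), use = "na.or.complete")
```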

Value

a table of p-values from the performed tests, with one row per column of numtb and one column per column of factb.
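
To make the shape of the returned table concrete, here is a minimal base R sketch that builds a comparable p-value matrix by hand. It does not call CQassociation. The data (mtcars), the names num_df, fac_df and p_table, and the choice of wilcox.test for every pair are assumptions made for illustration only.

```r
# Example data: two numeric columns and two two-level factor columns.
num_df <- mtcars[, c("mpg", "disp")]
fac_df <- data.frame(
  am = factor(mtcars$am),
  vs = factor(mtcars$vs)
)

# One p-value per (numeric column, factor column) pair.
# exact = FALSE avoids warnings about ties in the rank-based test.
p_table <- sapply(fac_df, function(g) {
  sapply(num_df, function(x) wilcox.test(x ~ g, exact = FALSE)$p.value)
})

dim(p_table)        # rows follow num_df, columns follow fac_df
dimnames(p_table)
```

Rows are named after the numeric columns and columns after the categorical columns, which matches the layout described above.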

Details

This function measures the association between one categorical variable and one continuous variable that are held in different datasets. Two datasets are provided as input: one has only numerical columns and the other has only categorical columns. The function performs a t-test in the parametric case and a Mann-Whitney test in the non-parametric case. If method3 is passed as 'auto', the function chooses the method itself, using tests for equal variance and normality to check the assumptions of the t-test. If the assumptions are satisfied, the t-test (parametric) is performed; otherwise the Mann-Whitney (non-parametric) test is performed.
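
The 'auto' logic described above can be sketched in base R as follows. This is not the package's implementation. The helper name choose_test, the Shapiro-Wilk test for normality and the F test (var.test) for equal variance are all assumptions chosen for illustration; the package may use different checks (see norm_test_fun).

```r
# Hypothetical helper: choose between a t-test and a Mann-Whitney test
# for a numeric vector x and a two-level factor g.
choose_test <- function(x, g, alpha = 0.05) {
  g <- droplevels(as.factor(g))
  stopifnot(nlevels(g) == 2)

  # Normality check within each group (Shapiro-Wilk).
  normal_ok <- all(tapply(x, g, function(v) shapiro.test(v)$p.value) > alpha)

  # Equal-variance check between the two groups (F test).
  var_ok <- var.test(x ~ g)$p.value > alpha

  if (normal_ok && var_ok) {
    # Assumptions hold: parametric test.
    list(method = "parametric",
         p.value = t.test(x ~ g, var.equal = TRUE)$p.value)
  } else {
    # Assumptions violated: rank-based test.
    list(method = "non-parametric",
         p.value = wilcox.test(x ~ g, exact = FALSE)$p.value)
  }
}

res <- choose_test(mtcars$mpg, mtcars$am)
res$method
res$p.value
```

The helper returns both the chosen method and the p-value, so a caller can record which test was used for each pair, much as methods_used does for CQassociation.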

See Also

norm_test_fun for normality test, association for association between any type of variables, CCassociation for association between continuous (numeric) variables, QQassociation for association between categorical variables