Statistical hypothesis testing methods for model-free functional dependency using asymptotic chi-squared or exact distributions. Functional chi-squared test statistics (zhang2013deciphering; zhang2014nonparametric; nguyen2018modelfree; zhong2019modelfree; zhong2019eft; Nguyen2020EFT) are asymmetric, functionally optimal, and model-free, which sets them apart from other related statistical measures.
Tests in this package reveal evidence for causality based on the causality-by-functionality principle (Simon1966). The tests require that data from two or more variables be formatted as a contingency table. Continuous variables must be discretized first, for example, using R packages Ckmeans.1d.dp or GridOnClusters.
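As a rough sketch of the data preparation described above, the following base R snippet discretizes two simulated continuous variables and cross-tabulates them. Equal-width binning with `cut()` is only a stand-in for the clustering-based discretization offered by Ckmeans.1d.dp or GridOnClusters. The simulated data, the sample size, and the choice of three bins are illustrative assumptions, as is the layout with the candidate cause in rows and the candidate effect in columns.

```r
# Illustrative data: y depends on x through a non-monotonic function plus noise.
set.seed(1)
x <- runif(200, min = -1, max = 1)
y <- x^2 + rnorm(200, sd = 0.05)

# Equal-width binning with cut() is a simple stand-in for the clustering-based
# discretization that Ckmeans.1d.dp or GridOnClusters would provide.
x_bin <- cut(x, breaks = 3, labels = c("low", "mid", "high"))
y_bin <- cut(y, breaks = 3, labels = c("low", "mid", "high"))

# Cross-tabulate: candidate cause in rows, candidate effect in columns.
tab <- table(x = x_bin, y = y_bin)
print(tab)
```

The resulting 3 x 3 table of counts is the kind of input the tests expect. With real data, the number of bins per variable would normally be chosen by the discretization method rather than fixed in advance.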
The package implements an asymptotic functional chi-squared test (zhang2013deciphering; zhang2014nonparametric), an adapted functional chi-squared test (Kumar2022AFT), and an exact functional test (nguyen2018modelfree; zhong2019modelfree; zhong2019eft; Nguyen2020EFT). The normalized functional chi-squared test was used by the best performer, NMSUSongLab, in the HPN-DREAM (DREAM8) Breast Cancer Network Inference Challenges (Hill:2016fk).
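A minimal usage sketch for the asymptotic and exact tests follows. It assumes that the package's `fun.chisq.test()` accepts a count matrix with the candidate cause in rows and the candidate effect in columns, and that `"fchisq"` and `"exact"` are valid values of its `method` argument; check the function's own help page before relying on either. The counts are made up for illustration, and the calls are skipped when FunChisq is not installed.

```r
# A small 3 x 3 contingency table of made-up counts.
# Rows index the candidate cause X; columns index the candidate effect Y.
x <- rbind(c(20,  0, 5),
           c( 0, 20, 0),
           c(20,  0, 5))

# Run the tests only if FunChisq is available.
if (requireNamespace("FunChisq", quietly = TRUE)) {
  # Asymptotic functional chi-squared test (method name assumed).
  asymptotic <- FunChisq::fun.chisq.test(x, method = "fchisq")
  # Exact functional test (method name assumed).
  exact <- FunChisq::fun.chisq.test(x, method = "exact")
  print(asymptotic$p.value)
  print(exact$p.value)
}
```

Passing `t(x)` instead tests the reverse direction, which is how the two candidate causal directions can be compared.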
A function index derived from the functional chi-squared statistic offers a new effect size measure for the strength of functional dependency. It is asymmetric and functionally optimal, unlike the symmetric Cramér's V, and is in many respects a better alternative to conditional entropy.
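To make the idea of an effect size concrete, here is a base R sketch, not the package's implementation. It assumes the commonly cited form of the functional chi-squared statistic (the sum of within-row chi-squares of Y against a uniform expectation, minus the chi-square of the Y marginal against a uniform expectation) and a Cramér's-V-style normalization by n(s - 1), where s is the number of columns. The helper names `fun_chisq_stat` and `function_index` are hypothetical, and the package may define its index differently.

```r
# Functional chi-squared statistic for a table with X in rows and Y in columns:
#   sum_i sum_j (n_ij - n_i./s)^2 / (n_i./s)  -  sum_j (n_.j - n/s)^2 / (n/s)
# (assumed form of the statistic; hypothetical helper).
fun_chisq_stat <- function(tab) {
  s <- ncol(tab)
  n <- sum(tab)
  row_sums <- rowSums(tab)
  keep <- row_sums > 0                    # skip empty rows to avoid 0/0
  expected_row <- row_sums[keep] / s      # uniform expectation within each row
  row_term <- sum((tab[keep, , drop = FALSE] - expected_row)^2 / expected_row)
  col_term <- sum((colSums(tab) - n / s)^2 / (n / s))
  row_term - col_term
}

# Hypothetical index: square root of the statistic divided by n * (s - 1),
# a normalization assumed here for illustration only.
function_index <- function(tab) {
  sqrt(max(fun_chisq_stat(tab), 0) / (sum(tab) * (ncol(tab) - 1)))
}

perfect <- rbind(c(10, 0), c(0, 10))  # Y is a one-to-one function of X
flat    <- rbind(c(5, 5), c(5, 5))    # Y is independent of X
print(function_index(perfect))
print(function_index(flat))
```

Under these assumptions the index is 1 for the perfectly functional table and 0 for the independent one, which is the range one would expect of an effect size for functional dependency.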
A simulator is provided to generate functional, dependent non-functional, and independent patterns (sharma2017simulating).
For continuous data, these tests offer an advantage over regression analysis when a parametric form cannot be reliably assumed for the underlying function. For categorical data, they provide a novel means to assess directional dependency, which is not possible with the symmetric Pearson's chi-squared test, G-test, or Fisher's exact test.
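The directional property can be illustrated with a small base R comparison. The sketch below reuses the assumed form of the functional chi-squared statistic (within-row chi-squares against a uniform expectation, minus the chi-square of the column marginal) in a hypothetical helper `fun_chisq_stat`, and contrasts it with `chisq.test()` from base R on a made-up table and its transpose. It is an illustration under those assumptions, not the package's code.

```r
# Assumed form of the functional chi-squared statistic (hypothetical helper).
# Rows index X, columns index Y; this compact version assumes no empty rows.
fun_chisq_stat <- function(tab) {
  s <- ncol(tab)
  n <- sum(tab)
  e_row <- rowSums(tab) / s
  sum((tab - e_row)^2 / e_row) - sum((colSums(tab) - n / s)^2 / (n / s))
}

# Made-up counts: Y is close to a many-to-one function of X.
tab <- rbind(c(20,  0, 5),
             c( 0, 20, 0),
             c(20,  0, 5))

forward  <- fun_chisq_stat(tab)     # X (rows) -> Y (columns)
backward <- fun_chisq_stat(t(tab))  # Y -> X

# Pearson's chi-squared statistic is unchanged by transposing the table.
# Warnings about small expected counts are suppressed for this toy example.
pearson_fwd <- unname(suppressWarnings(chisq.test(tab))$statistic)
pearson_bwd <- unname(suppressWarnings(chisq.test(t(tab)))$statistic)

print(c(forward = forward, backward = backward))
print(c(pearson_fwd = pearson_fwd, pearson_bwd = pearson_bwd))
```

For this table the functional statistic is 72 in the row-to-column direction and about 64.3 in the reverse direction, whereas Pearson's statistic is identical both ways and therefore cannot distinguish the two directions.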
Yang Zhang, Hua Zhong, Hien Nguyen, Ruby Sharma, Sajal Kumar, Yiyi Li, and Joe Song
Package: | FunChisq |
Type: | Package |
Current version: | 2.5.3 |
Initial release version: | 1.0 |
Initial release date: | 2014-03-08 |
License: | LGPL (>= 3) |
For data discretization, an option is optimal univariate clustering via package Ckmeans.1d.dp. A second option is joint multivariate discretization via package GridOnClusters.
For symmetric dependency tests on discrete data, see Pearson's chi-squared test (chisq.test), Fisher's exact test (fisher.test), mutual information (package entropy), and the G-test, implemented in packages DescTools and RVAideMemoire.