descriptor: compute descriptor

Description

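Computes the vector of descriptors for a pair of candidate nodes (a putative cause and a putative effect) from the observed data.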

Usage

descriptor(D, ca, ef, ns = min(4, NCOL(D) - 2), lin = FALSE, acc = TRUE, struct = TRUE, pq = c(0.1, 0.25, 0.5, 0.75, 0.9), bivariate = FALSE)

Arguments

D

: the observed data matrix of size [N,n], where N is the number of samples and n is the number of nodes

ca

: node index ($1 \le ca \le n$) of the putative cause

ef

: node index ($1 \le ef \le n$) of the putative effect

ns

: size of the Markov Blanket

lin

: TRUE or FALSE. If TRUE, a linear model is used to assess a dependency; otherwise a local learning algorithm is used

acc

: TRUE or FALSE. If TRUE, the accuracy of the regression is used as a descriptor

struct

: TRUE or FALSE. If TRUE, the ranking in the Markov Blanket is used as a descriptor

pq

: a vector of quantiles used to compute the descriptors

bivariate

: TRUE or FALSE. If TRUE, the descriptors of the bivariate dependency are included
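The arguments above can be exercised with a call such as the following. This is a minimal sketch, not an official example: the synthetic matrix, the variable names `N`, `n` and `d`, and the assumption that the function is exported by an installed package named D2C are all illustrative. The call is guarded so the snippet still runs when the package is absent.

```r
# Sketch only: assumes a package named D2C is installed and exports
# descriptor() with the signature shown under Usage.
set.seed(1)
N <- 200                                  # number of samples
n <- 6                                    # number of nodes
D <- matrix(rnorm(N * n), nrow = N, ncol = n)
D[, 2] <- 0.8 * D[, 1] + 0.2 * rnorm(N)   # node 2 depends linearly on node 1

if (requireNamespace("D2C", quietly = TRUE)) {
  # ns is left at its default, min(4, NCOL(D) - 2), which is 4 for 6 columns
  d <- D2C::descriptor(D, ca = 1, ef = 2, lin = TRUE)
  str(d)                                  # inspect the returned descriptors
}
```

Swapping `ca` and `ef` in the call gives the descriptors for the reverse direction, which is useful when comparing the two candidate orientations of the same pair.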

Details

This function is the core of the D2C algorithm. Given two candidate nodes (ca, the putative cause, and ef, the putative effect), it first infers from the dataset D the Markov Blankets of the variables indexed by ca and ef (MBca and MBef) by using the mimr algorithm (Bontempi, Meyer, ICML10). Then it computes a set of (conditional) mutual information terms describing the dependency between the variables ca and ef. These terms are used to create a vector of descriptors. If acc=TRUE, the vector contains the descriptors related to the asymmetric information-theoretic terms described in Bontempi and Flauder (2014). If struct=TRUE, the vector contains descriptors related to the positions of the members of MBef in MBca and vice versa. Estimating the information-theoretic terms requires estimating the dependency between nodes. If lin=TRUE, a linear assumption is made. Otherwise the local learning estimator implemented by the R package lazy is used.
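To make the idea of a (conditional) mutual information term under a linear assumption concrete, the sketch below uses the standard closed form for jointly Gaussian variables, $I(x;y) = -\frac{1}{2}\log(1-\rho^2)$, and conditions on a third variable by correlating regression residuals. It is an illustration only: the helper names `gauss_mi` and `gauss_cmi` are hypothetical, and this is not claimed to be the estimator the package uses internally.

```r
# Illustration only: Gaussian (linear) mutual information, not the D2C internals.
# For jointly Gaussian x and y, I(x; y) = -0.5 * log(1 - rho^2).
gauss_mi <- function(x, y) {
  rho <- cor(x, y)
  -0.5 * log(1 - rho^2)
}

# Conditional version I(x; y | z): correlate the residuals of x and y
# after regressing each of them linearly on the conditioning variable(s) z.
gauss_cmi <- function(x, y, z) {
  rx <- resid(lm(x ~ z))
  ry <- resid(lm(y ~ z))
  gauss_mi(rx, ry)
}

set.seed(1)
N <- 500
z <- rnorm(N)
x <- z + rnorm(N)      # x and y share the common cause z
y <- z + rnorm(N)
gauss_mi(x, y)         # clearly positive: x and y are marginally dependent
gauss_cmi(x, y, z)     # close to zero: z accounts for the dependence
```

In this toy setting, conditioning on the common cause removes the dependence between `x` and `y`. Terms of this kind, computed between ca, ef and the members of their Markov Blankets, are what the descriptors summarise.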

References

G. Bontempi, M. Flauder (2014) From dependency to causality: a machine learning approach. Under submission.

G. Bontempi, P.E. Meyer (2010) Causal filter selection in microarray data. ICML'10.

M. Birattari, G. Bontempi, and H. Bersini (1999) Lazy learning meets the recursive least squares algorithm. Advances in Neural Information Processing Systems 11, pp. 375-381. MIT Press.

G. Bontempi, M. Birattari, and H. Bersini (1999) Lazy learning for modeling and control design. International Journal of Control, 72(7/8), pp. 643-658.