Screen and transform the data to make them more suitable for structure and parameter learning.
# discretize continuous data into factors.
discretize(data, method, breaks = 3, ordered = FALSE, ..., debug = FALSE)
# screen continuous data for highly correlated pairs of variables.
dedup(data, threshold, debug = FALSE)
data: a data frame containing numeric columns (for dedup) or a
    combination of numeric or factor columns (for discretize).
threshold: a numeric value between zero and one, the absolute correlation
    used as a threshold in screening highly correlated pairs.
method: a character string, either interval for interval
    discretization, quantile for quantile discretization
    (the default) or hartemink for Hartemink's pairwise mutual
    information method.
breaks: if method is set to hartemink, an integer number,
    the number of levels the variables are to be discretized into. Otherwise,
    a vector of integer numbers, one for each column of the data set, specifying
    the number of levels for each variable.
ordered: a boolean value. If TRUE the discretized variables are
    returned as ordered factors instead of unordered ones.
...: additional tuning parameters, see below.
debug: a boolean value. If TRUE a lot of debugging output is
    printed; otherwise the function is completely silent.
discretize returns a data frame with the same structure (number of
  columns, column names, etc.) as data, containing the discretized
  variables.
dedup returns a data frame with a subset of the columns of data.
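As a minimal sketch of the discretize return value described above, the following call uses a different number of levels for each of the seven columns of gaussian.test and then inspects the result. The particular breaks values are arbitrary and chosen only for illustration.

```r
library(bnlearn)
data(gaussian.test)

# one number of levels per column (values picked arbitrarily for this sketch).
d = discretize(gaussian.test, method = "interval",
      breaks = c(2, 3, 3, 4, 3, 3, 2), ordered = TRUE)

# same dimensions and column names as the input data frame.
dim(d)
names(d)
# each column is now an ordered factor; check the number of levels.
sapply(d, is.ordered)
sapply(d, nlevels)
```

Setting ordered = TRUE keeps the ordering of the intervals in the factor levels, which can be convenient when the discretized variables are later summarised or plotted.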
discretize takes a data frame of continuous variables as its first
  argument and returns a second data frame of discrete variables, transformed
  using one of three methods: interval, quantile or hartemink.
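The difference between the interval and quantile methods can be illustrated with base R alone. The sketch below is not bnlearn's implementation. It assumes that interval means equal-width bins over the observed range and that quantile means bins holding roughly the same number of observations. The names x, k, interval.cuts and quantile.cuts are made up for the illustration.

```r
# Illustration only: base R, not bnlearn's own code.
set.seed(42)
x = rexp(1000)   # a skewed continuous variable
k = 3            # number of levels, playing the role of breaks

# interval: k bins of equal width between min(x) and max(x).
interval.cuts = seq(min(x), max(x), length.out = k + 1)
x.interval = cut(x, breaks = interval.cuts, include.lowest = TRUE)

# quantile: k bins delimited by the empirical quantiles of x.
quantile.cuts = quantile(x, probs = seq(0, 1, length.out = k + 1))
x.quantile = cut(x, breaks = quantile.cuts, include.lowest = TRUE)

# compare how the observations are spread over the levels.
table(x.interval)
table(x.quantile)
```

With skewed data such as these, equal-width bins tend to leave the upper levels sparsely populated, while quantile bins are balanced by construction.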
dedup screens the data for pairs of highly correlated variables, and
   discards one in each pair.
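A short sketch of how the threshold argument might be explored: the absolute correlation matrix shows which pairs are candidates for screening, and dedup is then called with two thresholds for comparison. The values 0.5 and 0.95 are arbitrary choices for illustration, not recommendations.

```r
library(bnlearn)
data(gaussian.test)

# absolute pairwise correlations, the quantity compared against threshold.
round(abs(cor(gaussian.test)), 2)

# a lower threshold flags more pairs as highly correlated.
d.low = dedup(gaussian.test, threshold = 0.5)
d.high = dedup(gaussian.test, threshold = 0.95)

# compare which columns survive the screening in each case.
names(d.low)
names(d.high)
```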
Hartemink A (2001). Principled Computational Methods for the Validation and Discovery of Genetic Regulatory Networks. Ph.D. thesis, School of Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
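library(bnlearn)
data(gaussian.test)
# Hartemink's method: 20 initial levels per variable (ibreaks), reduced to 4.
d = discretize(gaussian.test, method = 'hartemink', breaks = 4, ibreaks = 20)
# learn a network structure from the discretized data and plot it.
plot(hc(d))
# screen for highly correlated pairs and drop one variable from each.
d2 = dedup(gaussian.test)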