Learn R Programming

rpart.utils (version 0.5)

rpart.rules.push: Writes rule tables required to process rpart rules in SQL to an open RODBC connection.

Description

This function handles the process of pushing tabular versions of rpart rules to an RODBC connection. The entire process of generation and writing is completed with a single call, with all necessary subcalls handled within this function.

Usage

rpart.rules.push(object, connection, rulePrefix = NULL, tablePrefix = NULL)

Arguments

object
an rpart object
connection
and open RODBC connection
rulePrefix
A character string to prepend to each rule name to allow for multiple rule sets
tablePrefix
A character string to prepend to each table name to allow for multiple rule sets

Details

Once the tables have been pushed to the database, unpivoted source data can be processed using the rpart model with SQL code similar to the following:

WITH SOURCE AS
(
    SELECT
    ID,
    TYPE,
    VALUE
    FROM DATA
    UNPIVOT
    (
        VALUE FOR TYPE IN (FIELD1, FIELD2, FIELD3)
    )UNPVT
),
MATCHES AS
(
    SELECT
    ID
    ,Subrule
    ,Variable
    ,SR.Value
    ,Less
    ,Greater
    FROM
    SOURCE S
    LEFT JOIN SUBRULES SR
    ON
    TYPE = VARIABLE
    AND (
        S.value = SR.value
        OR S.value < SR.Less
        OR S.value > SR.Greater
        )
    ),
MATCHED_SUBRULES
AS (
    SELECT
    Subrule
    ,ID
    FROM
    MATCHES M
    GROUP BY
    Subrule
    ,ID
    ),
MATCHED_RULES
AS (
    SELECT
    R.[Rule]
    ,MS.*
    FROM
    RULES AS R
    LEFT JOIN MATCHED_SUBRULES MS
    ON R.SUBRULE=MS.SUBRULE AND Leaf='TRUE'
    )
,
COUNTS AS
(
    SELECT
    [RULE]
    ,ID
    ,MATCH_COUNT=COUNT(DISTINCT SUBRULE)
    ,NEEDED_COUNT=(SELECT COUNT(DISTINCT SUBRULE) FROM RULES R WHERE R.[RULE]=MR.[RULE])
    FROM
    MATCHED_RULES MR
    GROUP BY
    [RULE]
    ,ID
    )
SELECT
RULE
,ID
FROM COUNTS
WHERE
MATCH_COUNT=NEEDED_COUNT

The frame is also passed to the database which allows extracting the estimates generated by the rpart model.