nbc4va (version 1.2)

summary.nbc: Summarize an NBC model with metrics

Description

Summarizes the results of an nbc object, either for a single case or across all cases in the test data.

Usage

# S3 method for nbc
summary(object, top = 5, id = NULL, csmfa.obs = NULL, ...)

Arguments

object

The nbc object returned by nbc().

top

The number of top causes to return. How the causes are ranked depends on id:

  • If (id is char): provide the top causes of the case by probability

  • If (id is NULL): provide the top causes by predicted Cause Specific Mortality Fractions (CSMF)

id

A character string giving a case id in the test data. If NULL (the default), the summary covers all test cases.

csmfa.obs

A character vector of the true causes for calculating the CSMF accuracy.

...

Additional arguments to be passed, if applicable.
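The two rankings controlled by top can be illustrated in base R without the package. The following is a minimal sketch on a made-up probability matrix. The helpers top_by_case and top_by_csmf are hypothetical names for illustration only and are not part of nbc4va, and the package's own tie handling may differ.

```r
# Toy class probabilities: rows are cases, columns are causes (made-up values)
prob <- matrix(
  c(0.6, 0.3, 0.1,
    0.2, 0.5, 0.3,
    0.1, 0.2, 0.7,
    0.5, 0.4, 0.1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("v1", "v2", "v3", "v4"), c("HIV", "Stroke", "TB"))
)

# id is given: rank the causes of one case by probability (hypothetical helper)
top_by_case <- function(prob, id, top = 5) {
  head(sort(prob[id, ], decreasing = TRUE), top)
}

# id is NULL: rank causes by predicted CSMF, taken here as the share of
# cases whose most probable cause is that cause (hypothetical helper)
top_by_csmf <- function(prob, top = 5) {
  pred <- colnames(prob)[max.col(prob, ties.method = "first")]
  csmf <- table(factor(pred, levels = colnames(prob))) / nrow(prob)
  head(sort(csmf, decreasing = TRUE), top)
}

case_top <- top_by_case(prob, "v2", top = 2)  # two most probable causes for "v2"
csmf_top <- top_by_csmf(prob, top = 2)        # two largest predicted CSMFs
print(case_top)
print(csmf_top)
```

In both cases top only truncates the ranked result, so asking for more causes than exist simply returns all of them.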

Value

out: a summary object built from an nbc object, with the following modifications and additions:

  • If (id is char):

    • Additions to a nbc object:

      • $id (char): the case id chosen by the user

      • $top (numeric): the input number of top causes for id

      • $top.prob (vectorof double): the top probabilities for id

    • The following are modified from a nbc object to be id specific: $test, $test.ids, $test.causes, $obs.causes, $prob, $prob.causes, $pred, $pred.causes

  • If (id is NULL):

    • Additions to the nbc object:

      • * indicates that the item is only available if test causes are known

      • ** indicates that the item ignores * if csmfa.obs is given

      • $top.csmf.pred (vectorof double): the top predicted CSMFs by cause

      • $top.csmf.obs* (vectorof double): the top observed CSMFs by cause

      • $metrics.all** (vectorof double): a numeric vector of overall metrics.

        • Names: TruePositives, TrueNegatives, FalsePositives, FalseNegatives, Accuracy, Sensitivity, PCCC, CSMFMaxError, CSMFaccuracy

        • TruePositives* (double): total number of true positives

        • TrueNegatives* (double): total number of true negatives

        • FalsePositives* (double): total number of false positives

        • FalseNegatives* (double): total number of false negatives

        • Sensitivity* (double): the overall sensitivity

        • PCCC* (double): the partial chance corrected concordance

        • CSMFMaxError** (double): the maximum Cause Specific Mortality Fraction Error

        • CSMFaccuracy** (double): the Cause Specific Mortality Fraction accuracy

      • $metrics.causes (dataframe): a performance table of metrics by cause.

        • Columns: Cause, Sensitivity, CSMFpredicted, CSMFobserved

        • Cause (vectorof char): The unique causes from both the obs and pred inputs

        • Sensitivity* (vectorof double): the sensitivity for a cause

        • CSMFpredicted (vectorof double): the cause specific mortality fraction for a cause given the predicted deaths

        • CSMFobserved* (vectorof double): the cause specific mortality fraction for a cause given the observed deaths

        • TruePositives (vectorof double): The total number of true positives per cause

        • TrueNegatives (vectorof double): The total number of true negatives per cause

        • FalsePositives (vectorof double): The total number of false positives per cause

        • FalseNegatives (vectorof double): The total number of false negatives per cause

        • PredictedFrequency (vectorof double): The occurrence of a cause in the pred input

        • ObservedFrequency (vectorof double): The occurrence of a cause in the obs input

        • Example:

          Cause  Sensitivity  Metric-n..
          HIV    0.5          #..
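The per-cause columns of $metrics.causes can be reproduced by hand on a small example. The sketch below uses base R and made-up obs and pred vectors. The function cause_metrics is a hypothetical helper, not the package's implementation, and it assumes the usual one-versus-rest definitions of true and false positives and negatives.

```r
# Made-up observed and predicted causes for six cases
obs  <- c("HIV", "HIV", "Stroke", "TB", "TB", "TB")
pred <- c("HIV", "Stroke", "Stroke", "TB", "TB", "HIV")

# Hypothetical helper: one-versus-rest counts and fractions for each cause
cause_metrics <- function(obs, pred) {
  causes <- sort(unique(c(obs, pred)))  # unique causes from both inputs
  rows <- lapply(causes, function(cause) {
    tp <- sum(pred == cause & obs == cause)
    fp <- sum(pred == cause & obs != cause)
    fn <- sum(pred != cause & obs == cause)
    tn <- sum(pred != cause & obs != cause)
    data.frame(
      Cause = cause,
      TruePositives = tp, TrueNegatives = tn,
      FalsePositives = fp, FalseNegatives = fn,
      # NaN when a cause never occurs in obs (tp + fn is zero)
      Sensitivity = tp / (tp + fn),
      CSMFpredicted = mean(pred == cause),
      CSMFobserved = mean(obs == cause),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

metrics <- cause_metrics(obs, pred)
print(metrics)
```

The items marked with * need the observed causes, which is why they are unavailable when the test causes are unknown.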

Details

See Methods documentation for details on calculations and metrics.
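As a rough guide to the overall metrics, the sketch below computes them in base R using the commonly cited definitions: CSMFMaxError as 2 * (1 - min(observed CSMF)), CSMF accuracy as 1 minus the summed absolute CSMF error divided by that maximum, and PCCC as (C - k/N) / (1 - k/N) with k = 1. These formulas, and the helper name overall_metrics, are assumptions made for illustration. The package's Methods documentation remains the authority on what it actually computes.

```r
# Made-up observed and predicted causes for six cases
obs  <- c("HIV", "HIV", "Stroke", "TB", "TB", "TB")
pred <- c("HIV", "Stroke", "Stroke", "TB", "TB", "HIV")

# Hypothetical helper: overall metrics under the assumed definitions
overall_metrics <- function(obs, pred) {
  causes <- sort(unique(c(obs, pred)))
  csmf_obs  <- as.numeric(table(factor(obs,  levels = causes))) / length(obs)
  csmf_pred <- as.numeric(table(factor(pred, levels = causes))) / length(pred)

  # Assumption: the minimum is taken over causes that occur in obs
  max_error <- 2 * (1 - min(csmf_obs[csmf_obs > 0]))
  csmf_acc  <- 1 - sum(abs(csmf_obs - csmf_pred)) / max_error

  # Assumption: PCCC with k = 1, so concordance equals plain accuracy
  accuracy <- mean(obs == pred)
  k <- 1
  n_causes <- length(causes)
  pccc <- (accuracy - k / n_causes) / (1 - k / n_causes)

  c(Accuracy = accuracy, PCCC = pccc,
    CSMFMaxError = max_error, CSMFaccuracy = csmf_acc)
}

overall <- overall_metrics(obs, pred)
print(overall)
```

CSMF accuracy only compares cause fractions, so it can be high even when individual cases are misclassified. That is also why it can be computed from csmfa.obs alone.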

See Also

Other main functions: nbc(), plot.nbc(), print.nbc_summary()

Examples

# NOT RUN {
library(nbc4va)
data(nbc4vaData)

# Run naive bayes classifier on random train and test data
train <- nbc4vaData[1:50, ]
test <- nbc4vaData[51:100, ]
results <- nbc(train, test)

# Obtain a summary for the results
brief <- summary(results, top=2)  # top 2 causes by CSMF for all test data
briefID <- summary(results, id="v48")  # top 5 causes by probability for case "v48"

# }