summary.nbc: Summarize an NBC model with metrics

Description

Summarizes the results from an nbc object. The summary can cover either a single case or all cases.

Usage

# S3 method for nbc
summary(object, top = 5, id = NULL, csmfa.obs = NULL, ...)

Arguments

object

The nbc result object, as returned by nbc().

top

The number of top causes to return. Its meaning depends on id:

If (id is char): provide the top causes of the case by probability
If (id is NULL): provide the top causes by predicted Cause Specific Mortality Fractions (CSMF)

id

A character representing a case id in the test data.
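
The two meanings of top can be illustrated in base R. This is a minimal sketch rather than the package's implementation; the vectors pred and case_prob and all of their values are hypothetical.

```r
# Hypothetical predicted causes for six test cases (illustrative values only)
pred <- c("HIV", "Stroke", "HIV", "Malaria", "HIV", "Stroke")

# id is NULL: rank causes by predicted CSMF (share of predicted deaths per cause)
tab <- table(pred)
csmf_pred <- setNames(as.vector(tab) / length(pred), names(tab))
top_csmf <- head(sort(csmf_pred, decreasing = TRUE), 2)

# id is a character: rank causes by the probabilities of that one case
# (hypothetical probabilities for a single case)
case_prob <- c(Malaria = 0.1, HIV = 0.6, Stroke = 0.3)
top_prob <- head(sort(case_prob, decreasing = TRUE), 2)

print(top_csmf)
print(top_prob)
```

In both branches the result is a named numeric vector of length top, ordered from largest to smallest.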

csmfa.obs

A character vector of the true causes for calculating the CSMF accuracy.

...

Additional arguments to be passed if applicable

Value

out: A summary object built from an nbc object, with the following modifications and additions:

If (id is char):
- Additions to an nbc object:
  - $id (char): the case id chosen by the user
  - $top (numeric): the input number of top causes for id
  - $top.prob (vectorof double): the top probabilities for id
- The following are modified from an nbc object to be id specific: $test, $test.ids, $test.causes, $obs.causes, $prob, $prob.causes, $pred, $pred.causes
If (id is NULL):
- Additions to the nbc object:
  - * indicates that the item is only available if test causes are known
  - ** indicates that the item ignores * if csmfa.obs is given
  - $top.csmf.pred (vectorof double): the top predicted CSMFs by cause
  - $top.csmf.obs* (vectorof double): the top observed CSMFs by cause
  - $metrics.all** (vectorof double): a numeric vector of overall metrics.
    - Names: TruePositives, TrueNegatives, FalsePositives, FalseNegatives, Accuracy, Sensitivity, PCCC, CSMFMaxError, CSMFaccuracy
    - TruePositives* (double): total number of true positives
    - TrueNegatives* (double): total number of true negatives
    - FalsePositives* (double): total number of false positives
    - FalseNegatives* (double): total number of false negatives
    - Sensitivity* (double): the overall sensitivity
    - PCCC* (double): the partial chance corrected concordance
    - CSMFMaxError** (double): the maximum Cause Specific Mortality Fraction Error
    - CSMFaccuracy** (double): the Cause Specific Mortality Fraction accuracy
  - $metrics.causes (dataframe): a performance table of metrics by cause.
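    - Columns: Cause, Sensitivity, CSMFpredicted, CSMFobserved, TruePositives, TrueNegatives, FalsePositives, FalseNegatives, PredictedFrequency, ObservedFrequency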
    - Cause (vectorof char): The unique causes from both the obs and pred inputs
    - Sensitivity* (vectorof double): the sensitivity for a cause
    - CSMFpredicted (vectorof double): the cause specific mortality fraction for a cause given the predicted deaths
    - CSMFobserved* (vectorof double): the cause specific mortality fraction for a cause given the observed deaths
    - TruePositives (vectorof double): The total number of true positives per cause
    - TrueNegatives (vectorof double): The total number of true negatives per cause
    - FalsePositives (vectorof double): The total number of false positives per cause
    - FalseNegatives (vectorof double): The total number of false negatives per cause
    - PredictedFrequency (vectorof double): The occurrence of a cause in the pred input
    - ObservedFrequency (vectorof double): The occurrence of a cause in the obs input
    - Example:
      Cause  Sensitivity  Metric-n..
      HIV    0.5          #..
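
The per-cause columns listed above can be reproduced by hand with one-vs-rest counts. The following base R sketch assumes the standard definition of sensitivity, TP / (TP + FN); the obs and pred vectors are hypothetical, and this is not the package's own code.

```r
# Hypothetical observed and predicted causes (illustrative values only)
obs  <- c("HIV", "HIV", "Stroke", "Malaria", "Stroke", "HIV")
pred <- c("HIV", "Stroke", "Stroke", "HIV", "Stroke", "HIV")

# Unique causes from both inputs, as in the Cause column
causes <- sort(unique(c(obs, pred)))

# One-vs-rest counts and fractions for each cause
per_cause <- do.call(rbind, lapply(causes, function(cause) {
  tp <- sum(pred == cause & obs == cause)
  tn <- sum(pred != cause & obs != cause)
  fp <- sum(pred == cause & obs != cause)
  fn <- sum(pred != cause & obs == cause)
  data.frame(
    Cause = cause,
    Sensitivity = tp / (tp + fn),          # NaN if the cause is never observed
    CSMFpredicted = mean(pred == cause),   # share of predicted deaths
    CSMFobserved = mean(obs == cause),     # share of observed deaths
    TruePositives = tp, TrueNegatives = tn,
    FalsePositives = fp, FalseNegatives = fn,
    stringsAsFactors = FALSE
  )
}))

print(per_cause)
```

For every cause the four counts sum to the number of test cases, which is a quick sanity check on any table of this shape.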

Details

See Methods documentation for details on calculations and metrics.
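
As a rough orientation, the overall CSMF metrics and PCCC can be sketched in base R using the definitions commonly used in the verbal autopsy literature: CSMFMaxError = 2 * (1 - min(CSMFobserved)), CSMFaccuracy = 1 - sum(|CSMFobserved - CSMFpredicted|) / CSMFMaxError, and PCCC(k) = (C - k/N) / (1 - k/N), where C is the fraction of cases whose true cause is among the top k predictions and N is the number of causes. These formulas are an assumption about what the package computes, so the Methods documentation remains the authority; the data below are hypothetical.

```r
# Hypothetical observed and predicted causes (illustrative values only)
obs  <- c("HIV", "HIV", "Stroke", "Malaria", "Stroke", "HIV")
pred <- c("HIV", "Stroke", "Stroke", "HIV", "Stroke", "HIV")
causes <- sort(unique(c(obs, pred)))

# Cause specific mortality fractions per cause
csmf_obs  <- vapply(causes, function(cause) mean(obs == cause), numeric(1))
csmf_pred <- vapply(causes, function(cause) mean(pred == cause), numeric(1))

# Largest possible total absolute CSMF error, then accuracy scaled by it
csmf_max_error <- 2 * (1 - min(csmf_obs))
csmf_accuracy  <- 1 - sum(abs(csmf_obs - csmf_pred)) / csmf_max_error

# PCCC with k = 1: concordance of the single top prediction, corrected for chance
k <- 1
n_causes <- length(causes)
concordance <- mean(pred == obs)
pccc <- (concordance - k / n_causes) / (1 - k / n_causes)

print(c(CSMFMaxError = csmf_max_error, CSMFaccuracy = csmf_accuracy, PCCC = pccc))
```

With these inputs the CSMF accuracy is 0.8 and the PCCC is 0.5. Both metrics equal 1 for a perfect prediction, and PCCC equals 0 when the concordance is no better than picking k of the N causes at random.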

Examples

# NOT RUN {
library(nbc4va)
data(nbc4vaData)

# Run naive bayes classifier on random train and test data
train <- nbc4vaData[1:50, ]
test <- nbc4vaData[51:100, ]
results <- nbc(train, test)

# Obtain a summary for the results
brief <- summary(results, top=2)  # top 2 causes by CSMF for all test data
briefID <- summary(results, id="v48")  # top 5 causes by probability for case "v48"

# }
