setNode: Set Node Relationships

Description

The relationship between a node and its parents must be defined before the appropriate JAGS model statement can be constructed. setNode is the utility by which a user can define the distribution of the node and its relationship to its parents (usually through a model of some sort).

Usage

setNode(
  network,
  node,
  nodeType,
  nodeFitter,
  nodeFormula,
  fitterArgs = list(),
  decision = "current",
  utility = "current",
  fromData = !is.null(network$data),
  ...,
  nodeData = NULL,
  factorLevels = NULL,
  validate = TRUE,
  fitModel = getOption("Hyde_fitModel"),
  policyValues = factorLevels
)
fromData()
fromFormula()

Arguments

network

A HydeNetwork.

node

A node within network. This does not have to be quoted.

nodeType

A valid distribution. See the jagsDists data set, loaded with data(jagsDists), for a complete list of available distributions. See "Choosing a Node Type".

nodeFitter

The fitting function, such as lm or glm. This will probably only be needed when fromData = TRUE.

nodeFormula

A formula object specifying the relationship between a node and its parents. It must use as a term every parent of node. This formula will be pushed through the unexported function factorFormula. See "Coding Factor Levels" for more details.

fitterArgs

Additional arguments to be passed to nodeFitter.

decision

A value of either "current" or a logical value. If "current", the current value of the setting is retained. This allows decision nodes set by setDecisionNode to retain the classification as a decision node if setNode is run after setDecisionNode. If TRUE, the node will be considered a decision node in compileDecisionModel. This is only a valid option when the node is of type "dbern" or "dcat". Note: if any character value other than "current" is given, setNode will assume you intended "current".

utility

A value of either "current" or a logical value. If "current", the current value of the setting is retained. This allows utility nodes set by setUtilityNode to retain the classification as a utility node if setNode is run after setUtilityNode. If TRUE, the node will be considered a utility node. This is only a valid option when the node is of type "determ" and it has no children. Note: if any character value other than "current" is given, setNode will assume you intended "current".

fromData

Logical. Determines if a node's relationship is calculated from the data object in network. Defaults to TRUE whenever network has a data object.

...

Parameters to be passed to the JAGS distribution function. Each parameter in the distribution function must be named. For example, the parameters to pass to dnorm would be mean='', sd=''. The required parameters can be looked up using the expectedParameters function. If parameters are to be estimated from the data, the functions fromData and fromFormula may be used as placeholders.

nodeData

A data frame with the appropriate data to fit the model for the node. Data passed in this argument are applied only to this specific node. No checks are performed to ensure that all of the appropriate variables (the node and its parents) are included.

factorLevels

A character vector used to specify the levels of factors when data are not provided for a node. The order of factors follows the order provided by the user. This argument is only used when the node type is either dcat or dbern, nodeFitter is not cpt, nodeData is NULL, and no variable for the node exists in the network's data element. If any of those conditions is not met, factorLevels is ignored. Ignoring it when data are specified prevents a user specification from conflicting with the factor levels expected across nodes.

validate

Logical. Toggles validation of parameters given in .... When passing raw JAGS code (i.e., character strings), this will be ignored (with a message), as the validation is applicable only to numerical/formula values.

fitModel

Logical. Toggles if the model is fit within the function call. This may be set globally using options('Hyde_fitModel'). See Details for more about when to use this option.

policyValues

A vector of values to be used in the policy matrix when the node is a decision node. This may be left NULL for factor variables, which will then draw on the factor levels. For numerical variables, it can be more important: if left NULL and data are available for the node, the first, second, and third quartiles will be used to populate the policy values. If no data are available and no values are provided, policyMatrix and compileDecisionModel are likely to return errors when they are called. Policy values may also be set with setPolicyValues after a network has been defined.
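
The quartile fallback described for numeric decision nodes can be pictured with base R alone. The sketch below is not HydeNet code: the vector dose and the name policy_values are hypothetical, and it assumes the default quantile() method, which may differ from what HydeNet uses internally.

```r
# Hypothetical observed values for a numeric decision node
dose <- c(2, 4, 4, 5, 7, 9, 10, 12)

# First, second, and third quartiles, as candidate policy values
# (assumption: default quantile() type; HydeNet's method may differ)
policy_values <- unname(quantile(dose, probs = c(0.25, 0.5, 0.75)))

print(policy_values)
```

Supplying policyValues explicitly avoids depending on how the quartiles are computed.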

Value

Returns the modified HydeNetwork object.

Choosing a Node Type

Many of the distribution functions defined in JAGS have an equivalent distribution function in R. You may inspect the jagsDists data frame to see the function names in each language. You may specify the distribution function using the R name and it will be translated to the equivalent JAGS function.

You may still use the JAGS names, which allows you to specify a distribution in JAGS that does not have an R equivalent listed. Note, however, that where R functions are supported, HydeNet expects the parameter names to follow R conventions (see the RParameter column of jagsDists).

Of particular interest are dbern and dcat, which are functions in JAGS that have no immediate equivalent in R. They provide Bernoulli and categorical (multinomial with a single trial) distributions, respectively.
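
Although R has no functions named dbern or dcat, the distributions themselves can be simulated in base R. The sketch below is only an illustration of what the two node types represent; the variable names and probabilities are hypothetical, and nothing here calls HydeNet or JAGS.

```r
set.seed(42)

# Bernoulli: a binomial with a single trial, values in {0, 1}
bern_draws <- rbinom(n = 10, size = 1, prob = 0.3)

# Categorical: one of k categories, drawn with the given probabilities
cat_probs <- c(0.2, 0.5, 0.3)
cat_draws <- sample(seq_along(cat_probs), size = 10,
                    replace = TRUE, prob = cat_probs)

print(bern_draws)
print(cat_draws)
```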

Coding Factor Levels

The nodeFormula argument will accept any valid R formula. If desired, you may use a specific formulation to indicate the presence of factor levels in the formula. For instance, consider the case of a variable y with a binary categorical parent x coded as 0 = No, and 1 = Yes. JAGS expects the formula y ~ c * x == 1 (where c is a constant). However, in factor variables with a large number of levels, it can be difficult to remember what value corresponds to what level.

HydeNet uses an internal (unexported) function within setNode to allow an alternate specification: y ~ c * (x == "Yes"). So long as the factors in the formula are previously defined within the network structure, HydeNet will translate the level into its numeric code.

Note that it is required to write x == "Yes". "Yes" == x will not translate correctly.
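
To make the translation concrete, here is a minimal base R sketch of the idea. It is not HydeNet's internal implementation: translate_levels and x_codes are hypothetical names, and the 0/1 coding is taken from the binary example above.

```r
# Hypothetical level-to-code lookup for the binary parent x
x_codes <- c(No = 0, Yes = 1)

# Hypothetical helper: rewrite  var == "<level>"  as  var == <code>
translate_levels <- function(formula_text, var, codes) {
  for (lev in names(codes)) {
    pattern <- sprintf('%s == "%s"', var, lev)
    replacement <- sprintf("%s == %s", var, codes[[lev]])
    formula_text <- gsub(pattern, replacement, formula_text, fixed = TRUE)
  }
  formula_text
}

translated <- translate_levels('y ~ c * (x == "Yes")', "x", x_codes)
reversed   <- translate_levels('y ~ c * ("Yes" == x)', "x", x_codes)

print(translated)
print(reversed)
```

In this sketch the reversed form is left untranslated because the pattern only matches the variable on the left-hand side, which mirrors the ordering requirement noted above.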

Validation

The validation of parameters is performed by comparing the values provided with the limits defined in the jagsDists$paramLogic variable (see data(jagsDists, package='HydeNet')). For most node types, validation will be performed for numeric variables. For deterministic variables, the validation will only check that the parameter definition is a formula.

It is possible to pass character strings as definitions, but when this is done, HydeNet assumes you are passing JAGS code. Unfortunately, HydeNet doesn't have the capability to validate JAGS code, so if there is an error in the character string definition, it won't show up until you try to compile the network. If you pass a character string as a parameter and leave validate = TRUE, a message will be printed to indicate that validation is being ignored. This message can be avoided by using validate = FALSE.

The two exceptions to this rule are when you pass fromFormula() and fromData() as the parameter definition. These will skip the validation without warning, since the definition will be built by HydeNet and be proper JAGS code (barring any bugs, of course).

Details

The functions fromFormula() and fromData() help to control how HydeNet determines the values of parameters passed to JAGS. If the parameters passed in ... are to be calculated from the data or inferred from the formula, these functions may be used as placeholders instead of writing JAGS code for those parameters.

By default, options(Hyde_fitModel=FALSE). This prevents setNode from fitting any models. Instead, the fitting is delayed until the user calls writeJagsModel and all of the models are fit at the same time. When using large data sets that may require time to run, it may be better to leave this option FALSE so that the models can all be compiled together (especially if you are working interactively). Using fitModel=TRUE will cause the model to be fit and the JAGS code for the parameters to be stored in the nodeParams attribute.
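
Because the fitModel argument defaults to getOption("Hyde_fitModel"), the behaviour can be switched globally with base R's options(). The sketch below only sets and reads the option; it does not call setNode, and the names old_opts, delayed, and eager are hypothetical.

```r
# Record the current setting so it can be restored afterwards
old_opts <- options(Hyde_fitModel = FALSE)
delayed <- getOption("Hyde_fitModel")   # FALSE: fitting is delayed

options(Hyde_fitModel = TRUE)
eager <- getOption("Hyde_fitModel")     # TRUE: setNode would fit when called

# Restore whatever was set before
options(old_opts)
```

Passing fitModel = TRUE in a single setNode call overrides the global option for that node only.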

Examples

# NOT RUN {
data(PE, package="HydeNet")
Net <- HydeNetwork(~ wells + 
                     pe | wells + 
                     d.dimer | pregnant*pe + 
                     angio | pe + 
                     treat | d.dimer*angio + 
                     death | pe*treat,
                     data = PE) 
print(Net, d.dimer)

#* Manually set the standard deviation (variance of 2.65)
Net <- setNode(Net, d.dimer, nodeType='dnorm', mean=fromFormula(), sd=sqrt(2.65), 
               nodeFormula = d.dimer ~ pregnant * pe,
               nodeFitter='lm')
print(Net, d.dimer)

# }
