The relationship between a node and its parents must be defined
before the appropriate JAGS model statement can be constructed.
setNode is the utility by which a user can define the distribution
of the node and its relationship to its parents (usually through a model
of some sort).
setNode(
  network,
  node,
  nodeType,
  nodeFitter,
  nodeFormula,
  fitterArgs = list(),
  decision = "current",
  utility = "current",
  fromData = !is.null(network$data),
  ...,
  nodeData = NULL,
  factorLevels = NULL,
  validate = TRUE,
  fitModel = getOption("Hyde_fitModel"),
  policyValues = factorLevels
)

fromData()

fromFormula()
network: A HydeNetwork.
node: A node within network. This does not have to be quoted.
nodeType: A valid distribution. See the data set in data(jagsDists)
for a complete list of available distributions.
See "Choosing a Node Type".
nodeFitter: The fitting function, such as lm or glm. This
will probably only be needed when fromData = TRUE.
nodeFormula: A formula object specifying the relationship between a
node and its parents. It must use as a term every parent of node. This formula
will be pushed through the unexported function factorFormula. See
"Coding Factor Levels" for more details.
fitterArgs: Additional arguments to be passed to the fitting function
named in nodeFitter.
decision: A value of either "current" or a logical value.
If "current", the current value of the setting is retained. This allows
decision nodes set by setDecisionNode to retain the classification as a
decision node if setNode is run after setDecisionNode.
If TRUE, the node will be considered a
decision node in compileDecisionModel. This is only a valid
option when the node is of type "dbern" or "dcat". Note: if any
character value other than "current" is given, setNode will assume
you intended "current".
utility: A value of either "current" or a logical value.
If "current", the current value of the setting is retained. This allows
utility nodes set by setUtilityNode to retain the classification as a
utility node if setNode is run after setUtilityNode.
If TRUE, the node will be considered a
utility node. This is only a valid option when the node is of type
"determ" and it has no children.
Note: if any character value other than "current" is given, setNode
will assume you intended "current".
fromData: Logical. Determines if a node's relationship is calculated
from the data object in network. Defaults to TRUE whenever
network has a data object.
...: Parameters to be passed to the JAGS distribution function. Each parameter
in the distribution function must be named. For
example, the parameters to pass to dnorm would be mean='', sd=''.
The required parameters can be looked up using the
expectedParameters function. If parameters are to be estimated
from the data, the functions fromData and fromFormula may
be used as placeholders.
nodeData: A data frame with the appropriate data to fit the model for the node. Data passed in this argument are applied only to this specific node. No checks are performed to ensure that all of the appropriate variables (the node and its parents) are included.
factorLevels: A character vector used to specify the levels of factors
when data are not provided for a node. The order of factors follows the
order provided by the user. This argument is only used when the node type
is either dcat or dbern, nodeFitter is not cpt,
nodeData is NULL, and no variable for the node exists in
the network's data element. If any of those conditions is not met,
factorLevels is ignored. Ignoring it when data are available
prevents a user specification from conflicting
with the factor levels expected across nodes.
validate: Logical. Toggles validation of parameters given in ... .
When passing raw JAGS code (i.e., character strings), this will be ignored
(with a message),
as the validation is applicable only to numerical/formula values.
fitModel: Logical. Toggles if the model is fit within the function call.
This may be set globally using options('Hyde_fitModel'). See Details
for more about when to use this option.
policyValues: A vector of values to be used in the policy matrix when
the node is a decision node. This may be left NULL for factor
variables, which will then draw on the factor levels. For numerical
variables, it can be more important: if left NULL and data are
available for the node, the first, second, and third quartiles will
be used to populate the policy values. If no data are available and no
values are provided, policyMatrix and compileDecisionModel
are likely to return errors when they are called. Policy values may
also be set with setPolicyValues after a network has been defined.
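The quartile default described for numeric decision nodes can be pictured with base R alone. The sketch below is only an illustration: node_values is a made-up vector standing in for a node's data, and the use of quantile() with its default settings is an assumption, not a statement of how HydeNet computes the values internally.

```r
# Hypothetical numeric node values; in practice these would come from the
# network's data or from nodeData.
set.seed(1)
node_values <- rnorm(200, mean = 50, sd = 10)

# First, second, and third quartiles: the default described in the text
# (quantile()'s default algorithm is an assumption here).
default_policy <- unname(quantile(node_values, probs = c(0.25, 0.5, 0.75)))

# Explicit alternative: hand-picked values one might supply as policyValues.
explicit_policy <- c(30, 50, 70)

print(default_policy)
```

Supplying explicit values is the safer route when a node has no data, since the text warns that policyMatrix and compileDecisionModel are then likely to fail.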
Returns the modified HydeNetwork object.
Many of the distribution functions defined in JAGS have an equivalent
distribution function in R. You may inspect the jagsDists data
frame to see the function names in each language. You may specify
the distribution function using the R name and it will be translated
to the equivalent JAGS function.
You may still use the JAGS names, which allows you to specify a
distribution in JAGS that does not have an R equivalent listed. Note,
however, that where R functions are supported, HydeNet expects
the parameter names to be given following R conventions (see
the RParameter column of jagsDists).
Of particular interest are dbern and dcat, which are
functions in JAGS that have no immediate equivalent in R. They provide
Bernoulli and categorical (single-trial multinomial) distributions, respectively.
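Although R has no functions named dbern or dcat, draws from the same distributions can be produced with base R. The sketch below is an illustration of what the two JAGS distributions represent, not HydeNet code; the probabilities and the names bern_draws and cat_draws are made up for the example.

```r
set.seed(42)
n <- 1000

# Bernoulli(p): a binomial draw with a single trial (what JAGS calls dbern).
p <- 0.3
bern_draws <- rbinom(n, size = 1, prob = p)

# Categorical: one of K categories per draw (what JAGS calls dcat).
# JAGS numbers the categories 1..K, which seq_along() mirrors here.
probs <- c(0.2, 0.5, 0.3)
cat_draws <- sample(seq_along(probs), size = n, replace = TRUE, prob = probs)

# Observed frequencies should sit near the probabilities above.
print(table(bern_draws) / n)
print(table(cat_draws) / n)
```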
The nodeFormula argument will accept any valid R formula. If desired, you
may use a specific formulation to indicate the presence of factor levels in the
formula. For instance, consider the case of a variable y with a binary
categorical parent x coded as 0 = No, and 1 = Yes. JAGS expects the
formula y ~ c * (x == 1) (where c is a constant). However, in
factor variables with a large number of levels, it can be difficult to remember
what value corresponds to what level.
HydeNet uses an internal (unexported) function within setNode to allow
an alternate specification: y ~ c * (x == "Yes"). So long as the factors in
the formula are previously defined within the network structure, HydeNet
will translate the level into its numeric code.
Note that it is required to write x == "Yes". "Yes" == x will not
translate correctly.
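The idea behind the label-to-code translation can be sketched in base R. The helper level_code below is hypothetical and is not HydeNet's internal function; it assumes the 0-based coding (0 = No, 1 = Yes) used in the example above, which may not match the coding HydeNet applies to every node type.

```r
# Hypothetical helper (not part of HydeNet): look up the 0-based numeric
# code of a factor level, matching the "0 = No, 1 = Yes" coding in the text.
level_code <- function(f, level) {
  idx <- match(level, levels(f))
  if (is.na(idx)) stop("unknown level: ", level)
  idx - 1L
}

x <- factor(c("No", "Yes", "Yes", "No"), levels = c("No", "Yes"))
yes_code <- level_code(x, "Yes")

# The indicator written with the label and the one written with the
# numeric code select the same observations.
by_label <- as.integer(x == "Yes")
by_code  <- as.integer((as.integer(x) - 1L) == yes_code)

print(identical(by_label, by_code))
```

Writing the label in the formula keeps the specification readable when a factor has many levels, while the numeric code is what ends up in the model.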
The validation of parameters is performed by comparing the values provided with
the limits defined in the jagsDists$paramLogic variable (see
data(jagsDists, package='HydeNet')). For most node types, validation will
be performed for numeric variables. For deterministic variables, the validation
will only check that the parameter definition is a formula.
It is possible to pass character strings as definitions, but when this is done,
HydeNet assumes you are passing JAGS code. Unfortunately, HydeNet
doesn't have the capability to validate JAGS code, so if there is an error in
the character string definition, it won't show up until you try to compile the
network. If you pass a character string as a parameter and leave
validate = TRUE, a message will be printed to indicate that validation
is being ignored. This message can be avoided by using validate = FALSE.
The two exceptions to this rule are when you pass fromFormula() and
fromData() as the parameter definition. These will skip the validation
without warning, since the definition will be built by HydeNet and be
proper JAGS code (barring any bugs, of course).
The functions fromFormula() and fromData() help to control
how HydeNet determines the values of parameters passed to JAGS. If the
parameters passed in the ... argument are to be calculated from the
data or inferred from the formula, these functions may be used as placeholders
instead of writing JAGS code for those parameters.
By default, options(Hyde_fitModel=FALSE). This prevents setNode
from fitting any models. Instead, the fitting is delayed until the user
calls writeJagsModel and all of the models are fit at the same time.
When using large data sets that may require time to run, it may be better to
leave this option FALSE so that the models can all be compiled together
(especially if you are working interactively). Using fitModel=TRUE
will cause the model to be fit and the JAGS code for the parameters to be
stored in the nodeParams attribute.
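Because the fitModel default is read with getOption("Hyde_fitModel") (see the usage), the global behavior can be switched with base R's options(). The sketch below uses only base R and touches nothing in HydeNet itself; the variable names old and current are illustrative.

```r
# Remember the current setting so it can be restored afterwards.
old <- options(Hyde_fitModel = TRUE)   # ask for models to be fit immediately

# This is the value setNode() would pick up as its fitModel default.
current <- getOption("Hyde_fitModel")
print(current)

# Restore the previous value (the option is removed if it was never set).
options(old)
```

For a single node, passing fitModel = TRUE directly to setNode achieves the same thing without changing the global option.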
# NOT RUN {
library(HydeNet)
data(PE, package="HydeNet")

#* Build the network from the PE data
Net <- HydeNetwork(~ wells +
                     pe | wells +
                     d.dimer | pregnant*pe +
                     angio | pe +
                     treat | d.dimer*angio +
                     death | pe*treat,
                   data = PE)

#* Show the current definition of the d.dimer node
print(Net, d.dimer)

#* Manually change the precision
Net <- setNode(Net, d.dimer, nodeType='dnorm', mean=fromFormula(), sd=sqrt(2.65),
               nodeFormula = d.dimer ~ pregnant * pe,
               nodeFitter='lm')

#* Show the updated definition
print(Net, d.dimer)
# }