Process a model to organize its nodes for marginalization (integration over latent nodes or random effects), as done by Laplace approximation.
setupMargNodes(
model,
paramNodes,
randomEffectsNodes,
calcNodes,
calcNodesOther,
split = TRUE,
check = TRUE,
allowDiscreteLatent = FALSE
)
A list is returned with elements:
paramNodes
: final processed version of paramNodes
randomEffectsNodes
: final processed version of randomEffectsNodes
calcNodes
: final processed version of calcNodes
calcNodesOther
: final processed version of calcNodesOther
givenNodes
: Input to model$getConditionallyIndependentSets, if split=TRUE.
randomEffectsSets
: Output from model$getConditionallyIndependentSets, if split=TRUE. This will be a list of vectors of node names. The node names in one list element can be marginalized independently from those in other list elements. The union of the list elements should be all of randomEffectsNodes. If split=FALSE, randomEffectsSets will be a list with one element, simply containing randomEffectsNodes. If split is a numeric vector, randomEffectsSets will be the result of split(randomEffectsNodes, control$split).
model
: A nimble model such as returned by nimbleModel.
paramNodes
: A character vector of names of stochastic nodes that are parameters of nodes to be marginalized over (randomEffectsNodes). See details for default.
randomEffectsNodes
: A character vector of nodes to be marginalized over (or "integrated out"). In the case of calculating the likelihood of a model with continuous random effects, the nodes to be marginalized over are the random effects, hence the name of this argument. However, one can marginalize over any nodes desired as long as they are continuous. See details for default.
calcNodes
: A character vector of nodes to be calculated as the integrand for marginalization. Typically this will include randomEffectsNodes and some data nodes. See details for default.
calcNodesOther
: A character vector of nodes to be calculated as part of the log likelihood that are not connected to the randomEffectsNodes and so are not actually part of the marginalization. These are somewhat extraneous to the purpose of this function, but it is convenient to handle them here because often the purpose of marginalization is to calculate log likelihoods, including from "other" parts of the model.
split
: A logical indicating whether to split randomEffectsNodes into conditionally independent sets that can be marginalized separately (TRUE) or to keep them all in one set for a single marginalization calculation (FALSE). May also be a numeric vector; see details.
check
: A logical indicating whether to try to give reasonable warnings of badly formed inputs that might be missing important nodes or include unnecessary nodes.
allowDiscreteLatent
: A logical indicating whether to allow discrete latent states (default = FALSE).
Wei Zhang, Perry de Valpine, Paul van Dam-Bates
This function is used by buildLaplace to organize model nodes into roles needed for setting up the (approximate) marginalization done by Laplace approximation. It is also possible to call this function directly and pass the resulting list (possibly modified for your needs) to buildLaplace.
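A direct call can be sketched as follows. The toy Poisson random-effects model and the names mu, sigma, u, and y are hypothetical, chosen only for illustration. The sketch assumes an installed nimble version that exports setupMargNodes and relies entirely on the defaults described below.

```r
library(nimble)

# Hypothetical toy model: two parameters, three random effects, three data nodes.
code <- nimbleCode({
  mu ~ dnorm(0, sd = 10)
  sigma ~ dunif(0, 10)
  for (i in 1:3) {
    u[i] ~ dnorm(mu, sd = sigma)
    y[i] ~ dpois(exp(u[i]))
  }
})
model <- nimbleModel(code,
                     data  = list(y = c(2, 0, 5)),
                     inits = list(mu = 0, sigma = 1, u = rep(0, 3)))

# Let setupMargNodes assign all node roles from its defaults.
margNodes <- setupMargNodes(model)

# Inspect the roles. The list can be edited here before it is passed on
# to buildLaplace.
str(margNodes)
```

Inspecting the list before passing it on is a cheap way to confirm that the default roles match what you intended for your model.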
Any of the input node vectors, when provided, will be processed using nodes <- model$expandNodeNames(nodes), where nodes may be paramNodes, randomEffectsNodes, and so on. This step allows any of the inputs to include node-name-like syntax that might contain multiple nodes. For example, paramNodes = 'beta[1:10]' can be provided if there are actually 10 scalar parameters, 'beta[1]' through 'beta[10]'. The actual node names in the model will be determined by the expandNodeNames step.
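The expansion step can be tried on its own. This is a minimal sketch with a hypothetical model containing ten scalar beta nodes; the model exists only to give expandNodeNames something to expand.

```r
library(nimble)

# Hypothetical model with ten scalar parameters beta[1], ..., beta[10].
code <- nimbleCode({
  for (k in 1:10) {
    beta[k] ~ dnorm(0, sd = 1)
  }
})
model <- nimbleModel(code)

# A single range expression expands to the individual scalar node names.
expanded <- model$expandNodeNames("beta[1:10]")
print(expanded)
```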
This function does not do any of the marginalization calculations. It only organizes nodes into roles of parameters, random effects, integrand calculations, and other log likelihood calculations.
The checking done if `check=TRUE` tries to be reasonable, but it can't cover all cases perfectly. If it gives an unnecessary warning, simply set `check=FALSE`.
If paramNodes is not provided, its default depends on what other arguments were provided. If neither randomEffectsNodes nor calcNodes was provided, paramNodes defaults to all top-level, stochastic nodes, excluding any posterior predictive nodes (those with no data anywhere downstream). These are determined by model$getNodeNames(topOnly = TRUE, stochOnly = TRUE, includePredictive = FALSE). If randomEffectsNodes was provided, paramNodes defaults to stochastic parents of randomEffectsNodes. In these cases, any provided calcNodes or calcNodesOther are excluded from default paramNodes. If calcNodes but not randomEffectsNodes was provided, then the default for randomEffectsNodes is determined first, and then paramNodes defaults to stochastic parents of randomEffectsNodes. Finally, any stochastic parents of calcNodes (whether provided or default) that are not in calcNodes are added to the default for paramNodes, but only after paramNodes has been used to determine the defaults for randomEffectsNodes, if necessary.
Note that to obtain sensible defaults, some nodes must have been marked as data, either by the data argument in nimbleModel or by model$setData. Otherwise, all nodes will appear to be posterior predictive nodes, and the default paramNodes may be empty.
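The effect of marking data can be seen by running the same getNodeNames query before and after model$setData. This is a sketch on a hypothetical toy model (mu, sigma, u, and y are illustrative names) and assumes nimble's default setting in which predictive nodes are determined within the model.

```r
library(nimble)

# Hypothetical toy model, built WITHOUT marking y as data.
code <- nimbleCode({
  mu ~ dnorm(0, sd = 10)
  sigma ~ dunif(0, 10)
  for (i in 1:3) {
    u[i] ~ dnorm(mu, sd = sigma)
    y[i] ~ dpois(exp(u[i]))
  }
})
model <- nimbleModel(code, inits = list(mu = 0, sigma = 1, u = rep(0, 3)))

# Query used for the default paramNodes, before any data are marked.
topBefore <- model$getNodeNames(topOnly = TRUE, stochOnly = TRUE,
                                includePredictive = FALSE)

# Mark y as data, then repeat the same query.
model$setData(list(y = c(2, 0, 5)))
topAfter <- model$getNodeNames(topOnly = TRUE, stochOnly = TRUE,
                               includePredictive = FALSE)

print(topBefore)
print(topAfter)
```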
For purposes of buildLaplace, paramNodes does not need to (but may) include deterministic nodes between the parameters and any calcNodes. Such deterministic nodes will be included in calculations automatically when needed.
If randomEffectsNodes is missing, the default is a bit complicated: it includes all latent nodes that are descendants (or "downstream") of paramNodes (if provided) and are either (i) ancestors (or "upstream") of data nodes (if calcNodes is missing), or (ii) ancestors or elements of calcNodes (if calcNodes and paramNodes are provided), or (iii) elements of calcNodes (if calcNodes is provided but paramNodes is missing). In all cases, discrete nodes (with warning if check=TRUE), posterior predictive nodes and paramNodes are excluded.
randomEffectsNodes should only include stochastic nodes.
If calcNodes is missing, the default is randomEffectsNodes and their descendants to the next stochastic nodes, excluding posterior predictive nodes. These are determined by model$getDependencies(randomEffectsNodes, includePredictive=FALSE).
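The getDependencies call behind the calcNodes default can be run directly. This sketch uses a hypothetical toy model in which u plays the role of randomEffectsNodes; the deterministic node created for exp(u[i]) is generated by nimble and is expected to appear between each u[i] and y[i].

```r
library(nimble)

# Hypothetical toy model with y marked as data.
code <- nimbleCode({
  mu ~ dnorm(0, sd = 10)
  sigma ~ dunif(0, 10)
  for (i in 1:3) {
    u[i] ~ dnorm(mu, sd = sigma)
    y[i] ~ dpois(exp(u[i]))
  }
})
model <- nimbleModel(code,
                     data  = list(y = c(2, 0, 5)),
                     inits = list(mu = 0, sigma = 1, u = rep(0, 3)))

# Random effects plus everything down to the next stochastic nodes.
randomEffectsNodes <- model$expandNodeNames("u")
calcNodes <- model$getDependencies(randomEffectsNodes, includePredictive = FALSE)
print(calcNodes)
```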
If calcNodesOther is missing, the default is all stochastic descendants of paramNodes, excluding posterior predictive nodes (from model$getDependencies(paramNodes, stochOnly=TRUE, self=FALSE, includePredictive=FALSE)) that are not part of calcNodes.
For purposes of buildLaplace, neither calcNodes nor calcNodesOther needs to (but may) contain deterministic nodes between paramNodes and calcNodes or calcNodesOther, respectively. These will be included in calculations automatically when needed.
If split is TRUE, model$getConditionallyIndependentSets is used to determine sets of the randomEffectsNodes that can be independently marginalized. The givenNodes are the paramNodes and calcNodes excluding any randomEffectsNodes and their deterministic descendants. The nodes (to be split into sets) are the randomEffectsNodes.
If split is a numeric vector, randomEffectsNodes will be split by split(randomEffectsNodes, control$split). The last option allows arbitrary control over how randomEffectsNodes are blocked.
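The numeric-vector case uses the grouping behavior of base R's split function, which can be illustrated without a model. The node names and the grouping vector below are hypothetical.

```r
# Hypothetical random-effects node names and a grouping vector of equal length.
randomEffectsNodes <- c("u[1]", "u[2]", "u[3]", "u[4]")
groups <- c(1, 1, 2, 2)

# Base R split() collects the names that share a group value into one element.
randomEffectsSets <- split(randomEffectsNodes, groups)
print(randomEffectsSets)
```

Here the first two nodes form one block and the last two form another, so each block would be marginalized as a unit.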
If check=TRUE, then defaults for each of the four categories of nodes are created even if the corresponding argument was provided. Then warnings are emitted if there are any extra (potentially unnecessary) nodes provided compared to the default or if there are any nodes in the default that were not provided (potentially necessary). These checks are not perfect and may be simply turned off if you are confident in your inputs.
(If randomEffectsNodes was provided but calcNodes was not provided, the default (for purposes of check=TRUE only) for randomEffectsNodes differs from the above description. It uses stochastic descendants of randomEffectsNodes in place of the "data nodes" when determining ancestors of data nodes. And it uses item (ii) instead of (iii) in the list above.)