Bayesian inference is often tied to decision theory (Bernardo and
Smith, 2000), and decision theory has long been considered the
foundation of statistics (Savage, 1954).
Before using the LossMatrix function, the user should have already
considered all possible actions (choices) and states of the world
(outcomes unknown at the time of decision-making), chosen a loss
function \(L(\theta, \alpha)\), estimated the loss, and
elicited prior probabilities \(p(\theta | x)\).
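For discrete actions and states, these ingredients combine in the usual way: each action receives an expected loss, and the Bayes action is the one with the smallest expected loss. A sketch in the notation used here, where \(\alpha^{*}\) is a symbol introduced only for this illustration:

\[
\alpha^{*} = \arg\min_{\alpha} \sum_{\theta} L(\theta, \alpha) \, p(\theta | x)
\]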
Although possible actions (choices) for the decision-maker and
possible states (outcomes) may be continuous or discrete, the loss
matrix is used for discrete actions and states. An example of a
continuous action may be that a decision-maker has already decided to
invest, and the remaining, current decision is how much to invest.
Likewise, an example of continuous states of the world (outcomes) may
be how much profit or loss may occur after a given continuous unit of
time.
The coded example provided below is taken from Berger (1985, pp. 6-7)
and described here. The set of possible actions for a decision-maker
is to invest in bond ZZZ or alternatively in bond XXX, as it is called
here. A real-world decision should include an exhaustive list
of actions, such as investing in neither, but perhaps the
decision-maker has already decided to invest and narrowed the options
down to these two bonds.
The possible states of the world (outcomes unknown at the time of
decision-making) are considered to be two states: either the chosen
bond will not default or it will default. Here, the loss function is
the negative of the monetary return, and hence a loss in element
L[1,1] of -500 is a profit of 500, while a loss in L[2,1] of 1,000 is
a loss of 1,000.
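A minimal base R sketch of such a loss matrix may help fix the sign convention. Rows are taken to be states and columns actions, which matches the elements L[1,1] and L[2,1] described above. The -300 entries for bond XXX and the row and column labels are assumptions made for illustration.

```r
# Loss matrix: rows are states of the world, columns are actions.
# matrix() fills column by column, so the first two values belong
# to bond ZZZ. The -300 entries for bond XXX are assumed values.
L <- matrix(c(-500, 1000, -300, -300), nrow = 2, ncol = 2)
rownames(L) <- c("s[1]: No default", "s[2]: Default")
colnames(L) <- c("a[1]: Buy ZZZ", "a[2]: Buy XXX")

# Loss is the negative of money gained, so profit is simply -L.
profit <- -L

print(L)
print(profit)
```

Reading the first column, a loss of -500 corresponds to a profit of 500 if bond ZZZ does not default, and a loss of 1,000 corresponds to a profit of -1,000 if it does.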
The decision-maker's dilemma is that bond ZZZ may return a higher
profit than bond XXX; however, there is an estimated 10% chance (the
prior probability) that bond ZZZ will default and return a substantial
loss. In contrast, bond XXX is considered to be a sure thing and to
return a steady but smaller profit. The Bayes action is to choose the
first action and invest in bond ZZZ, because it minimizes expected
loss, even though there is a chance of default.
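The expected-loss comparison can be sketched in base R as follows. This is an illustration of the arithmetic, not the LossMatrix function itself. The -300 loss for bond XXX is an assumed value, and the object names (p.theta, expected.loss, bayes.action) are chosen here for readability.

```r
# Loss matrix: rows are states, columns are actions.
# The -300 entries for bond XXX are assumed for illustration.
L <- matrix(c(-500, 1000, -300, -300), 2, 2,
            dimnames = list(c("No default", "Default"),
                            c("Buy ZZZ", "Buy XXX")))

# Probability of each state under each action: bond ZZZ has a 10%
# chance of default, while bond XXX is treated as a sure thing.
p.theta <- matrix(c(0.9, 0.1, 1, 0), 2, 2, dimnames = dimnames(L))

# Expected loss of each action: sum over states of loss * probability.
expected.loss <- colSums(L * p.theta)

# The Bayes action minimizes expected loss.
bayes.action <- names(which.min(expected.loss))

print(expected.loss)
print(bayes.action)
```

Under these assumed values, bond ZZZ has an expected loss of 0.9(-500) + 0.1(1000) = -350, which is lower than the -300 of bond XXX, so the first action is selected.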
A more realistic application of a loss matrix may be to replace the
point-estimates of loss with samples given uncertainty around the
estimated loss, and replace the point-estimates of the prior
probability of each state with samples given the uncertainty of the
probability of each state. The loss function used in the example is
intuitive, but a more popular monetary loss function may be
\(-\log(E(W | R))\), the negative log of the
expectation of wealth, given the return. There are many alternative
loss functions.
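One way to picture the replacement of point-estimates with samples is a small Monte Carlo sketch in base R. Every distribution and parameter below is an assumption chosen for illustration (a Beta(2, 18) distribution with mean 0.1 for the default probability, normal distributions around the point-estimated losses, and an assumed loss of -300 for bond XXX). It is not a description of how LossMatrix handles samples.

```r
set.seed(1)
S <- 10000  # number of samples (arbitrary)

# Samples of the probability that bond ZZZ defaults.
# Beta(2, 18) has mean 0.1; the distribution is an assumption.
p.default <- rbeta(S, 2, 18)

# Samples of loss for each state and action.
# Means follow the point-estimates; standard deviations are assumed.
loss.zzz.ok      <- rnorm(S, mean = -500, sd = 50)
loss.zzz.default <- rnorm(S, mean = 1000, sd = 100)
loss.xxx         <- rnorm(S, mean = -300, sd = 10)

# Expected loss of each action, computed once per sample.
el.zzz <- (1 - p.default) * loss.zzz.ok + p.default * loss.zzz.default
el.xxx <- loss.xxx

# Average over samples and select the action with the lowest value.
expected.loss <- c("Buy ZZZ" = mean(el.zzz), "Buy XXX" = mean(el.xxx))
bayes.action <- names(which.min(expected.loss))

# Proportion of samples in which bond ZZZ has the lower expected loss.
prob.zzz.better <- mean(el.zzz < el.xxx)

print(expected.loss)
print(bayes.action)
print(prob.zzz.better)
```

Besides the averaged expected loss, the samples show how often the preferred action would change, which a point-estimate alone cannot reveal.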
Although isolated decision-theoretic problems exist such as the
provided example, decision theory may also be applied to the results
of a probability model (such as from IterativeQuadrature,
LaplaceApproximation, LaplacesDemon, PMC, or VariationalBayes),
contingent on how a decision-maker intends to use the information
from the model. The statistician may pass the results of a model to a
client, who then considers choosing possible actions, given this
information. The statistician should further assist the client with
considering actions, states of the world, then loss functions, and
finally eliciting the client's prior probabilities (such as with the
elicit function).
When the outcome is finally observed, the information from this
outcome may be used to refine the priors of the next such decision. In
this way, Bayesian learning occurs.
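As a hedged illustration of this refinement, suppose the 10% default probability were represented by a Beta distribution. The conjugate Beta form, the Beta(1, 9) parameters, and the observed outcome below are all assumptions made for this sketch; other representations of the prior are equally possible.

```r
# Prior on the default probability: Beta(a, b) with mean a / (a + b).
# Beta(1, 9) has mean 0.1; both the form and parameters are assumed.
a <- 1
b <- 9
prior.mean <- a / (a + b)

# Hypothetical observed outcome: 1 if the bond defaulted, 0 otherwise.
defaulted <- 0

# Conjugate update: a counts defaults, b counts non-defaults.
a.post <- a + defaulted
b.post <- b + (1 - defaulted)
post.mean <- a.post / (a.post + b.post)

print(prior.mean)
print(post.mean)
```

Observing no default lowers the estimated default probability slightly, from 1/10 to 1/11, and this updated value could serve as the prior probability in the next such decision.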