After fitting a mix or depmix model, one is often interested in determining the most probable mixture components or hidden states at each time point t. This is also called decoding the hidden states from the observed data. There are at least two general ways to consider state classification: 'global' decoding means determining the most likely state sequence, whilst 'local' decoding means determining the most likely state at each time point without explicitly considering the identity of the hidden states at other time points. For mixture models, both forms of decoding are identical.
Global decoding is based on the conditional probability
\(p(S_1, \ldots, S_T \mid Y_1, \ldots, Y_T)\), and consists of determining,
at each time point \(t = 1, \ldots, T\):
$$s^*_t = \arg \max_{i=1}^{N} p(S_1 = s^*_1, \ldots, S_{t-1} = s^*_{t-1}, S_t = i, S_{t+1} = s^*_{t+1}, \ldots, S_T = s^*_{T} \mid Y_1, \ldots, Y_T)$$
where N is the total number of states. These probabilities and the resulting classifications are computed through the viterbi algorithm.
Setting type = "viterbi" returns a data.frame with the Viterbi-decoded global state sequence in the first column, and the normalized "delta" probabilities in the remaining columns. These "delta" probabilities are defined as the joint probability of the most likely state sequence ending in state i at time t, and all the observations up to time t. These joint probabilities are normalized per time point, i.e., each delta probability is divided by the sum of the delta probabilities at that time point over all possible states j (including state i). These probabilities are not straightforward to interpret. Setting type = "global" returns just a vector with the Viterbi-decoded global state sequence.
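To illustrate how the Viterbi recursion produces both the global state sequence and the normalized delta probabilities, here is a minimal NumPy sketch for a generic discrete-state HMM. This is not the package's implementation; the function name and argument layout are hypothetical, chosen only for the example.

```python
import numpy as np

def viterbi_decode(init, trans, obs_lik):
    """Viterbi decoding for a discrete-state HMM (illustrative sketch).

    init:    (N,) initial state probabilities
    trans:   (N, N) transition matrix, trans[i, j] = p(S_t = j | S_{t-1} = i)
    obs_lik: (T, N) observation likelihoods, obs_lik[t, i] = p(Y_t | S_t = i)

    Returns the most likely state sequence and the per-time-point
    normalized "delta" probabilities.
    """
    T, N = obs_lik.shape
    delta = np.zeros((T, N))           # normalized joint probabilities
    psi = np.zeros((T, N), dtype=int)  # backpointers to the best predecessor

    delta[0] = init * obs_lik[0]
    delta[0] /= delta[0].sum()         # normalize per time point
    for t in range(1, T):
        # scores[i, j]: best path ending in state i at t-1, moving to j at t
        scores = delta[t - 1][:, None] * trans * obs_lik[t][None, :]
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0)
        delta[t] /= delta[t].sum()

    # backtrack the globally most likely state sequence
    states = np.zeros(T, dtype=int)
    states[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        states[t] = psi[t + 1, states[t + 1]]
    return states, delta
```

Because each row of delta is renormalized, the returned probabilities sum to one at every time point, mirroring the normalization described above.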
Local decoding is based on the smoothing probabilities \(p(S_t \mid Y_1, \ldots, Y_T)\), which are the "gamma" probabilities computed with the forwardbackward algorithm. Local decoding then consists of determining, at each time point \(t = 1, \ldots, T\):
$$s^*_t = \arg \max_{i=1}^{N} p(S_t = i \mid Y_1, \ldots, Y_T)$$
where N is the total number of states. Setting type = "local" returns a vector with the locally decoded states. Setting type = "smoothing" returns the smoothing probabilities which underlie this classification. When considering the posterior probability of each state, the values returned by type = "smoothing" are most likely what the user wants.
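The smoothing probabilities can be sketched with a scaled forward-backward pass; again this is generic HMM code with a hypothetical interface (init of shape (N,), trans of shape (N, N), obs_lik of shape (T, N)), not the package's forwardbackward function.

```python
import numpy as np

def forward_backward(init, trans, obs_lik):
    """Scaled forward-backward pass for a discrete-state HMM (sketch).

    Returns the smoothing ("gamma") probabilities
    gamma[t, i] = p(S_t = i | Y_1, ..., Y_T).
    """
    T, N = obs_lik.shape
    alpha = np.zeros((T, N))  # scaled forward probabilities
    beta = np.ones((T, N))    # scaled backward probabilities

    alpha[0] = init * obs_lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * obs_lik[t]
        alpha[t] /= alpha[t].sum()

    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (obs_lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)  # p(S_t = i | Y_1..Y_T)
    return gamma

# Local decoding: most likely state at each time point
# states_local = gamma.argmax(axis=1)
```

The per-time-point rescaling of alpha and beta only changes each row by a constant factor, so the final renormalization of gamma recovers the exact smoothing probabilities while avoiding numerical underflow on long series.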
The option type = "filtering" returns a matrix with the so-called filtering probabilities, defined as \(p(S_t \mid Y_1, \ldots, Y_t)\), i.e. the probability of a hidden state at time t considering the observations up to and including time t.
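The filtering probabilities are exactly the normalized forward variables of the forward-backward algorithm, so they can be sketched with the forward recursion alone (same hypothetical argument layout as before, generic HMM code only):

```python
import numpy as np

def filtering_probs(init, trans, obs_lik):
    """Filtering probabilities p(S_t = i | Y_1, ..., Y_t) for an HMM (sketch).

    init:    (N,) initial state probabilities
    trans:   (N, N) transition matrix
    obs_lik: (T, N) observation likelihoods
    """
    T, N = obs_lik.shape
    alpha = np.zeros((T, N))
    alpha[0] = init * obs_lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        # predict one step ahead, then correct with the new observation
        alpha[t] = (alpha[t - 1] @ trans) * obs_lik[t]
        alpha[t] /= alpha[t].sum()
    return alpha
```

Unlike smoothing, each row here conditions only on the data observed so far, which is the natural quantity for online or real-time classification.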
See the fit help page for an example.