The HiddenMarkov package calls the functions associated with the specified probability distribution in a generic way. For example, if the argument distn in dthmm is "xyz", it will expect to find the functions pxyz, dxyz and Mstep.xyz; and if simulations are to be performed, it will also require rxyz. In this section we describe the required format for the distribution related functions pxyz, dxyz and rxyz, and for the function Mstep.xyz required for the M-step of the EM algorithm.
Consider the examples below of distribution related functions and their arguments. Note that the probability functions all have q as their first argument, and their last two arguments are the same, with the same default values. Similarly, the density functions have x as their first argument, and their last argument is the same, with the same default. The arguments in between are specific to the given distribution, one argument for each distribution parameter. Note that the observed process x is univariate.
pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
pbeta(q, shape1, shape2, ncp = 0, lower.tail = TRUE, log.p = FALSE)
ppois(q, lambda, lower.tail = TRUE, log.p = FALSE)
pbinom(q, size, prob, lower.tail = TRUE, log.p = FALSE)
dnorm(x, mean = 0, sd = 1, log = FALSE)
dbeta(x, shape1, shape2, ncp = 0, log = FALSE)
dpois(x, lambda, log = FALSE)
dbinom(x, size, prob, log = FALSE)
rnorm(n, mean = 0, sd = 1)
rbeta(n, shape1, shape2, ncp = 0)
rpois(n, lambda)
rbinom(n, size, prob)
The functions pxyz (distribution function), dxyz (density) and rxyz (random number generator) must be consistent with the conventions used in the examples above. The software deduces the distribution parameter names from what is specified in pm and pn, and it calls these functions assuming that their argument lists are consistent with those described above. The functions pxyz and dxyz are used in the forward and backward equations.
The functions dxyz, pxyz and rxyz must also behave in the same vectorised way as dnorm. For example, if x is a vector while mean and sd are scalars, then dnorm(x, mean, sd) calculates the density for each element of x using the scalar values of mean and sd; the returned value is thus of the same length as x. Alternatively, if x is a scalar and mean and sd are vectors of the same length, then the returned value has the same length as mean and is the density of x evaluated at the corresponding pairs of values of mean and sd. The third possibility is that x and one of the distribution parameters, say sd, are vectors of the same length, and mean is a scalar; then the returned vector has the same length as x, where the \(i\)th value is the density at x[i] with mean mean and standard deviation sd[i]. Note that the functions for the Multinomial distribution do not have this behaviour. There the vector x contains the counts for one multinomial experiment, so the vector characterises the multivariate nature of the random variable rather than multiple univariate realisations. Further, the distribution parameters (i.e. the category probabilities) are given as one vector rather than as a sequence of separate function arguments.
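The three vectorisation patterns described above can be checked directly with dnorm (the values here are illustrative, not from the package):

```r
# 1. Vector x, scalar mean and sd: one density per element of x.
d1 <- dnorm(c(-1, 0, 1), mean = 0, sd = 1)

# 2. Scalar x, vector mean and sd (same length): density of x at each
#    (mean, sd) pair.
d2 <- dnorm(0, mean = c(-1, 0, 1), sd = c(1, 2, 3))

# 3. Vector x and vector sd of the same length, scalar mean: the ith
#    value is the density at x[i] with standard deviation sd[i].
d3 <- dnorm(c(-1, 0, 1), mean = 0, sd = c(1, 2, 3))

length(d1)   # 3
length(d2)   # 3
length(d3)   # 3
```

In each case the returned vector has length 3; in the third case, for instance, the second element equals dnorm(0, mean = 0, sd = 2).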
The other calculation that is specific to the chosen distribution is the maximisation in the M-step. If we have distn="xyz", then there should be a function called Mstep.xyz. Further, it should have arguments (x, cond, pm, pn); see for example Mstep.norm. The parameters that are estimated within this function are named consistently with those defined within the dthmm arguments pm and pn. Notice that the estimates of mean and sd in Mstep.norm are weighted by cond$u. The calculations for cond$u are performed in the E-step, and utilise the distribution related functions "dxyz" and "pxyz". The values of cond$u are essentially the probabilities that the process belongs to the given Markov state; hence, when we calculate the distribution parameters (like mean and sd in Mstep.norm) we calculate weighted sums using the probabilities cond$u. This procedure can be shown to give the maximum likelihood estimates of mean and sd, and hence a similar weighting should be used for the distribution "xyz" (see Harte, 2006, for further mathematical detail). One needs to take a little more care when dealing with distributions like the beta, where the cross derivatives of the log-likelihood between the parameters, i.e. \(\partial^2 \log L /(\partial \alpha_1 \partial \alpha_2)\), are non-zero. See Mstep.beta for further details.
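The weighting by cond$u can be illustrated as follows. This is a sketch with assumed data, not the package source: cond$u is taken to be an n x m matrix whose column j holds the probabilities that each observation belongs to Markov state j.

```r
# Assumed data: six observations and hard (0/1) state memberships,
# so the weighted estimates reduce to per-state sample statistics.
x <- c(1, 2, 3, 4, 5, 6)
u <- cbind(c(1, 1, 1, 0, 0, 0),   # state 1 membership probabilities
           c(0, 0, 0, 1, 1, 1))   # state 2 membership probabilities

# Weighted mean for each state j: sum(u[, j] * x) / sum(u[, j]).
mean <- as.numeric(x %*% u) / colSums(u)

# Weighted standard deviation for each state.
n <- length(x); m <- ncol(u)
sd <- sqrt(colSums((matrix(x, nrow = n, ncol = m) -
                    matrix(mean, nrow = n, ncol = m, byrow = TRUE))^2 * u) /
           colSums(u))

mean   # 2 and 5
sd     # about 0.816 for each state
```

With soft (fractional) memberships the same formulae apply; each observation simply contributes to every state in proportion to its membership probability.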
Now consider a situation where we want to modify the way in which a normal distribution is fitted: say we know the Markov dependent means, and we only want to estimate the standard deviations. Since both parameters are Markov dependent, they both still need to be specified in the pm argument of dthmm. The estimation of the distribution specific parameters takes place in the M-step, in this case Mstep.norm, so to achieve what we want we need to modify that function. In this case it is relatively easy (see the code in “Examples” below). From the function Mstep.norm, take the code under the section if (all(nms == c("mean", "sd"))), i.e. where both parameters are Markov dependent, but replace the line where the mean is estimated with mean <- pm$mean, i.e. leave it as initially specified. Unfortunately, one cannot easily modify the functions in a package namespace. The simple work-around is to define a new distribution, say "xyz", and then define a new function with the above changes called Mstep.xyz. The distribution related functions are otherwise just the same as those for the normal distribution, hence define them as follows:
rxyz <- rnorm
dxyz <- dnorm
pxyz <- pnorm
qxyz <- qnorm
See the 2nd example below for full details.
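The modified M-step might then look like the following sketch (adapted from the structure of Mstep.norm as described above; not the package source). The means are taken directly from pm, and only the standard deviations are re-estimated using the weights cond$u:

```r
Mstep.xyz <- function(x, cond, pm, pn) {
    # The means are known: leave them as initially specified in pm.
    mean <- pm$mean
    n <- length(x)
    m <- ncol(cond$u)
    # Weighted estimate of the standard deviation for each state,
    # using the fixed means above.
    sd <- sqrt(colSums((matrix(x, nrow = n, ncol = m) -
                        matrix(mean, nrow = n, ncol = m, byrow = TRUE))^2 *
                       cond$u) / colSums(cond$u))
    return(list(mean = mean, sd = sd))
}
```

The returned list names (mean and sd) match the parameter names in pm, as required by the naming convention described earlier.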