baumWelch: Inferring the parameters of a tree Hidden Markov Model via the Baum-Welch algorithm

Description

For an initial Hidden Markov Model (HMM) with assumed starting parameters and a given set of observations at all nodes of the tree, the Baum-Welch algorithm infers optimal parameters for the HMM. Because Baum-Welch is a variant of the Expectation-Maximisation algorithm, it converges to a local optimum, which may not be the global optimum. If both training and validation data are supplied, the function reports AUC and AUPR values as messages after every iteration. The validation data must contain more than one instance of either of the possible states.

Usage

baumWelch(hmm, observation, kn_states = NULL, kn_verify = NULL,
  maxIterations = 50, delta = 1e-05, pseudoCount = 0)

Arguments

hmm

An HMM object of class list, as returned by initHMM

observation

A list consisting of "k" vectors for "k" features, each vector being a character series of discrete emission values at the different nodes, sorted serially by node number
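As a rough illustration of the shape described above, the sketch below builds an observation list for two features on a 5-node tree. The feature values ("L"/"R" and "A"/"B") and the name obs_two_features are made up for this example.

```r
# Hypothetical observations for k = 2 features on a tree with 5 nodes.
# Each vector holds one feature's discrete emissions, ordered by node number 1..5.
obs_two_features <- list(
  c("L", "L", "R", "R", "L"),  # feature 1 at nodes 1, 2, 3, 4, 5
  c("A", "B", "A", "A", "B")   # feature 2 at nodes 1, 2, 3, 4, 5
)

# Every feature vector should cover the same set of nodes.
n_nodes <- unique(lengths(obs_two_features))
stopifnot(length(n_nodes) == 1)
```

The final check only confirms that all feature vectors have the same length; it does not validate the emission symbols against the HMM.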

kn_states

(Optional) An (L * 2) data frame, where L is the number of training nodes whose state values are known. The first column should be the node number and the second column the corresponding known state value of that node

kn_verify

(Optional) An (L * 2) data frame, where L is the number of validation nodes whose state values are known. The first column should be the node number and the second column the corresponding known state value of that node

maxIterations

(Optional) The maximum number of iterations in the Baum-Welch algorithm. Default is 50

delta

(Optional) Additional termination condition: if the transition and emission matrices converge before the maximum number of iterations (maxIterations) is reached, the algorithm stops early. The difference between the transition and emission parameters in consecutive iterations must be smaller than delta for the algorithm to terminate. Default is 1e-05
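The stopping rule can be sketched as follows. This is only an illustration: the helper param_change and the choice of a Euclidean (Frobenius) distance are assumptions, not necessarily the measure the package uses, and a single emission matrix stands in for what may be one matrix per feature.

```r
# Hypothetical helper: total change in parameters between two iterations,
# measured here as the sum of Frobenius distances (an assumed metric).
param_change <- function(old_trans, new_trans, old_emis, new_emis) {
  sqrt(sum((old_trans - new_trans)^2)) + sqrt(sum((old_emis - new_emis)^2))
}

delta <- 1e-05

old_trans <- matrix(c(0.9, 0.1, 0.2, 0.8), 2, byrow = TRUE)
new_trans <- old_trans                          # transitions unchanged
old_emis  <- matrix(c(0.7, 0.3, 0.4, 0.6), 2, byrow = TRUE)
new_emis  <- old_emis +
  matrix(c(1e-7, -1e-7, -1e-7, 1e-7), 2, byrow = TRUE)  # tiny update

d <- param_change(old_trans, new_trans, old_emis, new_emis)
converged <- d < delta  # stop iterating once the change falls below delta
```

A smaller delta demands a tighter match between consecutive iterations, so the loop is more likely to run until maxIterations.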

pseudoCount

(Optional) The amount of pseudo counts added in the estimation step of the Baum-Welch algorithm. Default is zero
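To show what a pseudo count does in general, the sketch below adds a constant to a matrix of expected counts before normalising each row into probabilities. The helper normalise_rows and the count values are hypothetical; the package's internal estimation step may differ in detail.

```r
# Hypothetical helper: turn expected counts into row-wise probabilities,
# optionally adding a pseudo count to every cell first.
normalise_rows <- function(counts, pseudoCount = 0) {
  counts <- counts + pseudoCount
  counts / rowSums(counts)  # each row is divided by its own total
}

# Made-up expected emission counts: states "P", "N" by symbols "L", "R".
expected_counts <- matrix(c(3, 0,
                            1, 4),
                          nrow = 2, byrow = TRUE,
                          dimnames = list(c("P", "N"), c("L", "R")))

unsmoothed <- normalise_rows(expected_counts)                   # P emits "R" with probability 0
smoothed   <- normalise_rows(expected_counts, pseudoCount = 1)  # zero entry is lifted above 0
```

With pseudoCount = 0 an event that was never observed keeps probability zero in later iterations; a positive pseudo count avoids that at the cost of a small bias.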

Value

A list of three elements: the first is the inferred HMM, whose representation is equivalent to the one returned by initHMM; the second is a list of statistics of the algorithm; and the third is the final state probability distribution at all nodes.

Examples

# NOT RUN {
library(treeHMM)

# Adjacency matrix for an "X"-shaped tree with 5 nodes
tmat <- matrix(c(0,0,1,0,0,
                 0,0,1,0,0,
                 0,0,0,1,1,
                 0,0,0,0,0,
                 0,0,0,0,0),
               5, 5, byrow = TRUE)
# One feature with two discrete levels, "L" and "R"
hmmA <- initHMM(c("P","N"), list(c("L","R")), tmat)
# Emissions of that feature for the 5 nodes, in order 1:5
obsv <- list(c("L","L","R","R","L"))
# Training data: the state at node 2 is known to be "P"
kn_st <- data.frame(node = c(2), state = c("P"), stringsAsFactors = FALSE)
# Validation data: the states at nodes 3, 4, 5 are "P", "N", "P" respectively
kn_vr <- data.frame(node = c(3,4,5), state = c("P","N","P"), stringsAsFactors = FALSE)
learntHMM <- baumWelch(hmmA, obsv, kn_st, kn_vr)
# }
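The tree matrix in the example above can also be built from an edge list, which may be easier to read for larger trees. This sketch assumes that entry [i, j] = 1 denotes a directed edge from node i to node j, which is how the "X"-shaped matrix in the example reads (nodes 1 and 2 feed node 3, which feeds nodes 4 and 5). The names edges and tmat_from_edges are illustrative.

```r
# Edge list for the "X"-shaped tree: 1 -> 3, 2 -> 3, 3 -> 4, 3 -> 5.
edges <- rbind(c(1, 3),
               c(2, 3),
               c(3, 4),
               c(3, 5))
n_nodes <- 5

# Start from an all-zero matrix and mark each (from, to) pair with 1.
tmat_from_edges <- matrix(0, n_nodes, n_nodes)
tmat_from_edges[edges] <- 1

# The same matrix written out row by row, as in the example.
tmat_literal <- matrix(c(0,0,1,0,0,
                         0,0,1,0,0,
                         0,0,0,1,1,
                         0,0,0,0,0,
                         0,0,0,0,0),
                       5, 5, byrow = TRUE)
same <- identical(tmat_from_edges, tmat_literal)
```

Indexing a matrix with a two-column matrix of (row, column) pairs is standard base R, so no extra packages are needed for this step.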
