HMM_based_method: Hidden Markov Method for Predicting Physical Activity Patterns

Description

This function assigns a physical activity range to each observation of a time-series (such as a sequence of impulse counts recorded by an accelerometer) using hidden Markov models (HMM). The activity ranges are defined by thresholds called cut-off points. The function combines HMM_training, HMM_decoding and cut_off_point_method. See Details for further information.

Usage

HMM_based_method(
  x,
  cut_points,
  distribution_class,
  min_m = 2,
  max_m = 6,
  n = 100,
  max_scaled_x = NA,
  names_activity_ranges = NA,
  discr_logL = FALSE,
  discr_logL_eps = 0.5,
  dynamical_selection = TRUE,
  training_method = "EM",
  Mstep_numerical = FALSE,
  BW_max_iter = 50,
  BW_limit_accuracy = 0.001,
  BW_print = TRUE,
  DNM_max_iter = 50,
  DNM_limit_accuracy = 0.001,
  DNM_print = 2,
  decoding_method = "global",
  bout_lengths = NULL,
  plotting = 0
)

Value

HMM_based_method returns a list containing the output of the trained hidden Markov model, including the selected number of states m (i.e., the number of physical activities), and plots key figures. The list has the following elements:

trained_HMM_with_selected_m: a list object containing the trained hidden Markov model including the selected number of states m (see HMM_training for further details).
decoding: a list object containing the output of the decoding (see HMM_decoding for further details).
extendend_cut_off_point_method: a list object containing the output of the cut-off point method. The counts x are classified into the activity ranges by the corresponding sequence of hidden PA-levels, which were decoded by the HMM (see cut_off_point_method for further details).

Arguments

x: a vector object of length T containing non-negative observations of a time-series, such as a sequence of accelerometer impulse counts, which are assumed to be realizations of the (hidden Markov state dependent) observation process of a HMM.
cut_points: a vector object containing cut-off points to separate activity ranges. For instance, the vector c(7,15,23) separates the four activity ranges [0,7), [7,15), [15,23) and [23,Inf).
distribution_class: a single character string object with the abbreviated name of the m observation distributions of the Markov dependent observation process. The following distributions are supported: Poisson (pois); generalized Poisson (genpois); normal (norm).
min_m: minimum number of hidden states in the hidden Markov chain. Default value is 2.
max_m: maximum number of hidden states in the hidden Markov chain. Default value is 6.
n: a single numerical value specifying the number of samples. Default value is 100.
max_scaled_x: an optional numerical value, to be used to scale the observations of the time-series x before the hidden Markov model is trained and decoded (see Details). Default value is NA.
names_activity_ranges: an optional character string vector to name the activity ranges induced by the cut-points. This vector must contain one element more than the vector cut_points.
discr_logL: a logical object indicating whether the discrete log-likelihood should be used (for "norm") for estimating the model specific parameters instead of the general log-likelihood. See MacDonald & Zucchini (2009, Paragraph 1.2.3) for further details. Default is FALSE.
discr_logL_eps: a single numerical value to approximate the discrete log-likelihood for a hidden Markov model based on normal distributions (for distribution_class="norm"). The default value is 0.5.
dynamical_selection: a logical value indicating whether the method of dynamical initial parameter selection should be applied (see HMM_training for details). Default is TRUE.
training_method: a character string indicating whether the Baum-Welch algorithm ("EM") or the method of direct numerical maximization ("numerical") should be applied for estimating the model specific parameters of the HMM. See Baum_Welch_algorithm and direct_numerical_maximization for further details. Default is training_method = "EM".
Mstep_numerical: a logical object indicating whether the Maximization Step of the Baum-Welch algorithm shall be performed by numerical maximization. Default is FALSE.
BW_max_iter: a single numerical value representing the maximum number of iterations in the Baum-Welch algorithm. Default value is 50.
BW_limit_accuracy: a single numerical value representing the convergence criterion of the Baum-Welch algorithm. Default value is 0.001.
BW_print: a logical object indicating whether the log-likelihood at each iteration-step shall be printed. Default is TRUE.
DNM_max_iter: a single numerical value representing the maximum number of iterations of the numerical maximization using the nlm-function (used to perform the M-step of the Baum-Welch-algorithm). Default value is 50.
DNM_limit_accuracy: a single numerical value representing the convergence criterion of the numerical maximization algorithm using the nlm function (used to perform the M-step of the Baum-Welch-algorithm). Default value is 0.001.
DNM_print: a single numerical value determining the level of printing of the nlm function. See nlm for further information. The value 0 suppresses all printing. Default value is 2 for full printing.
decoding_method: a string object to choose the applied decoding-method to decode the HMM given the time-series of observations x. Possible values are "global" (for the use of the Viterbi_algorithm) and "local" (for the use of the local_decoding_algorithm). Default value is "global".
bout_lengths: a vector object (with an even number of elements) defining the ranges of the bout intervals (see Details for the definition of bouts). For instance, bout_lengths = c(1,1,2,2,3,10,11,20,1,20) defines the five bout intervals [1,1] (1 count); [2,2] (2 counts); [3,10] (3-10 counts); [11,20] (11-20 counts); [1,20] (1-20 counts; overlapping with other bout intervals is possible). Default value is bout_lengths=NULL.
plotting: a numeric value between 0 and 5 selecting the graphical output listed below. NA suppresses graphical output. Default value is 0.
0: output 1-5
1: summary of all results
2: time series of activity counts, classified into activity ranges
3: time series of bouts (and, if available, the sequence of the estimated hidden physical activity levels, extracted by decoding a trained HMM, in green colour)
4: barplots of absolute and relative frequencies of time spent in different activity ranges
5: barplots of relative frequencies of the lengths of bout intervals (overall and by activity ranges)
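
The arguments cut_points, names_activity_ranges and bout_lengths can be illustrated with a minimal base R sketch. It only shows how the values are interpreted according to the descriptions above and does not call the package. The labels "SED", "LIG", "MOD", "VIG" and the vector counts are hypothetical and chosen purely for illustration.

```r
# Cut-off points from the argument description; they induce four activity ranges.
cut_points <- c(7, 15, 23)

# Hypothetical labels: one element more than cut_points, as required.
names_activity_ranges <- c("SED", "LIG", "MOD", "VIG")

# A few made-up counts, including values that sit exactly on a cut-off point.
counts <- c(0, 6, 7, 14, 15, 22, 23, 40)

# findInterval() returns 0 for values below the first cut-off point, so add 1
# to obtain an index into the labels. Intervals are closed on the left,
# matching [0,7), [7,15), [15,23) and [23,Inf).
range_index <- findInterval(counts, cut_points) + 1
activity_range <- names_activity_ranges[range_index]
print(data.frame(counts, activity_range))

# bout_lengths lists lower and upper bounds pairwise; reshape the vector
# into one row per bout interval.
bout_lengths <- c(1, 1, 2, 2, 3, 10, 11, 20, 1, 20)
stopifnot(length(bout_lengths) %% 2 == 0)  # an even number of elements is required
bout_intervals <- matrix(bout_lengths, ncol = 2, byrow = TRUE,
                         dimnames = list(NULL, c("lower", "upper")))
print(bout_intervals)
```

A count equal to a cut-off point falls into the higher range, which is why 7 is labelled with the second range rather than the first.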

Author

Vitali Witowski (2013).

Details

The function combines HMM_training, HMM_decoding and cut_off_point_method as follows:

Step 1: HMM_training trains the most likely HMM for a given time-series of accelerometer counts.
Step 2: HMM_decoding decodes the trained HMM (Step 1) into the most likely sequence of hidden states corresponding to the given time-series of observations (respectively the most likely sequence of physical activity levels corresponding to the time-series of accelerometer counts).
Step 3: cut_off_point_method assigns an activity range to each accelerometer count by its hidden physical activity level (extracted in Step 2).
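
Steps 2 and 3 can be sketched in base R for a Poisson HMM whose parameters are assumed to be known. This is not the package's implementation: viterbi_pois is a hypothetical helper written for this illustration, the parameter values are borrowed from the Examples section instead of being trained (Step 1 is skipped), and the short series x is made up. The sketch also assumes that the hidden physical activity level of a count is the mean of its decoded state.

```r
# Assumed 4-state Poisson HMM (not trained; values taken from the Examples section).
m      <- 4
delta  <- rep(1 / m, m)               # initial state probabilities
gamma  <- 0.7 * diag(m) + 0.3 / m     # transition matrix, each row sums to 1
lambda <- c(4, 9, 17, 25)             # state-dependent Poisson means

# Short made-up series of counts.
x <- c(3, 5, 4, 8, 10, 9, 16, 18, 17, 26, 24, 27, 6, 4)

# Hypothetical helper: global decoding (Viterbi algorithm) on the log scale.
viterbi_pois <- function(x, delta, gamma, lambda) {
  n <- length(x)
  m <- length(lambda)
  # Log state-dependent probabilities: rows are time points, columns are states.
  log_p <- sapply(lambda, function(l) dpois(x, l, log = TRUE))
  xi   <- matrix(-Inf, n, m)  # best log-probability of a path ending in each state
  back <- matrix(0L, n, m)    # back-pointers to the best predecessor state
  xi[1, ] <- log(delta) + log_p[1, ]
  for (t in 2:n) {
    for (j in 1:m) {
      cand <- xi[t - 1, ] + log(gamma[, j])
      back[t, j] <- which.max(cand)
      xi[t, j]   <- max(cand) + log_p[t, j]
    }
  }
  # Trace the most likely state sequence backwards from the last time point.
  states <- integer(n)
  states[n] <- which.max(xi[n, ])
  for (t in (n - 1):1) states[t] <- back[t + 1, states[t + 1]]
  states
}

# Step 2: most likely sequence of hidden states.
states <- viterbi_pois(x, delta, gamma, lambda)

# Hidden physical activity level of each count (assumption: mean of its state).
hidden_PA_levels <- lambda[states]

# Step 3: apply the cut-off points to the hidden levels, not to the raw counts.
cut_points <- c(7, 15, 23)
activity_range <- findInterval(hidden_PA_levels, cut_points) + 1
print(data.frame(x, states, hidden_PA_levels, activity_range))
```

Because the cut-off points are applied to the hidden levels, a single noisy count does not change the assigned activity range unless the decoded state changes as well.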

References

Brachmann, B. (2011). Hidden-Markov-Modelle fuer Akzelerometerdaten. Diploma Thesis, University Bremen - Bremen Institute for Prevention Research and Social Medicine (BIPS).

MacDonald, I. L., Zucchini, W. (2009) Hidden Markov Models for Time Series: An Introduction Using R, Boca Raton: Chapman & Hall.

Witowski, V., Foraita, R., Pitsiladis, Y., Pigeot, I., Wirsik, N. (2014) Using hidden Markov models to improve quantifying physical activity in accelerometer data - A simulation study. PLOS ONE. 9(12), e114089. doi:10.1371/journal.pone.0114089

Examples


x <- c(1,16,19,34,22,6,3,5,6,3,4,1,4,3,5,7,9,8,11,11,
  14,16,13,11,11,10,12,19,23,25,24,23,20,21,22,22,18,7,
  5,3,4,3,2,3,4,5,4,2,1,3,4,5,4,5,3,5,6,4,3,6,4,8,9,12,
  9,14,17,15,25,23,25,35,29,36,34,36,29,41,42,39,40,43,
  37,36,20,20,21,22,23,26,27,28,25,28,24,21,25,21,20,21,
  11,18,19,20,21,13,19,18,20,7,18,8,15,17,16,13,10,4,9,
  7,8,10,9,11,9,11,10,12,12,5,13,4,6,6,13,8,9,10,13,13,
  11,10,5,3,3,4,9,6,8,3,5,3,2,2,1,3,5,11,2,3,5,6,9,8,5,
  2,5,3,4,6,4,8,15,12,16,20,18,23,18,19,24,23,24,21,26,
  36,38,37,39,45,42,41,37,38,38,35,37,35,31,32,30,20,39,
  40,33,32,35,34,36,34,32,33,27,28,25,22,17,18,16,10,9,
  5,12,7,8,8,9,19,21,24,20,23,19,17,18,17,22,11,12,3,9,
  10,4,5,13,3,5,6,3,5,4,2,5,1,2,4,4,3,2,1) 

# Assumptions (number of states, probability vector,
# transition matrix, and distribution parameters)

m <- 4
delta <- c(0.25, 0.25, 0.25, 0.25)
gamma <- 0.7 * diag(m) + 0.3 / m   # transition matrix; each row sums to 1
distribution_class <- "pois"
distribution_theta <- list(lambda = c(4, 9, 17, 25))
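
The example as shown defines its assumptions but does not call the function. A call might look like the following sketch. It assumes the function is provided by the HMMpa package and is skipped when that package is not installed. The data x_demo are simulated for this illustration and the argument values are illustrative choices; the argument names come from the Usage section and the element names from the Value section.

```r
# Simulated counts with four activity regimes (illustrative data only).
set.seed(1)
x_demo <- c(rpois(60, 4), rpois(60, 17), rpois(60, 9), rpois(60, 25))

# Run only if the HMMpa package (assumed source of the function) is available.
if (requireNamespace("HMMpa", quietly = TRUE)) {
  res <- HMMpa::HMM_based_method(
    x = x_demo,
    cut_points = c(7, 15, 23),      # four activity ranges
    distribution_class = "pois",
    min_m = 2,
    max_m = 4,                      # smaller than the default to keep the run short
    BW_print = FALSE,               # do not print the log-likelihood per iteration
    plotting = NA                   # suppress graphical output
  )
  # Top-level structure of the three documented list elements.
  str(res$trained_HMM_with_selected_m, max.level = 1)
  str(res$decoding, max.level = 1)
  str(res$extendend_cut_off_point_method, max.level = 1)
}
```

The vectors m, delta, gamma and distribution_theta defined above are not passed to this call, because the function trains its own model and selects m between min_m and max_m.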
