
LongituRF (version 0.9)

MERT: (S)MERT algorithm

Description

(S)MERT is an adaptation of the regression tree method to longitudinal data, introduced by Hajjem et al. (2011) <doi:10.1016/j.spl.2010.12.003>. The model was extended by Capitaine et al. (2020) <doi:10.1177/0962280220946080> with the addition of a stochastic process. The algorithm estimates the parameters of the following semi-parametric stochastic mixed-effects model:

$$Y_i(t)=f(X_i(t))+Z_i(t)\beta_i + \omega_i(t)+\epsilon_i$$

where \(Y_i(t)\) is the output at time \(t\) for the \(i\)th individual; \(X_i(t)\) are the input predictors (fixed effects) at time \(t\) for the \(i\)th individual; \(Z_i(t)\) are the predictors of the random effects at time \(t\) for the \(i\)th individual; \(\omega_i(t)\) is the stochastic process at time \(t\) for the \(i\)th individual, which models the serial correlations of the output measurements; and \(\epsilon_i\) is the residual error.
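To make the terms of the model concrete, the following base R sketch simulates a small data set from it. Everything numeric here is an assumption chosen for illustration: the function f, the variances, the measurement grid, and the choice of a random intercept and slope for Z. The Brownian motion covariance used is the textbook form, the volatility squared times min(s, t), and may not match the package's internal parameterization.

```r
set.seed(1)

n_id  <- 5                      # number of individuals (arbitrary)
times <- seq(0.1, 1, by = 0.1)  # common measurement grid (arbitrary)
n_t   <- length(times)

# Hypothetical fixed-effect function f; in (S)MERT this is the part
# estimated by the tree.
f <- function(x) 2 * x[, 1] + sin(3 * x[, 2])

# Brownian motion covariance on the grid:
# Cov(omega(s), omega(t)) = sigma_sto^2 * min(s, t)
sigma_sto <- 0.5
K <- sigma_sto^2 * outer(times, times, pmin)

sim <- do.call(rbind, lapply(seq_len(n_id), function(i) {
  X     <- cbind(x1 = runif(n_t), x2 = runif(n_t))  # predictors X_i(t)
  Z     <- cbind(1, times)                          # design Z_i(t): intercept and slope
  beta  <- rnorm(2, sd = c(1, 0.5))                 # random effects beta_i
  omega <- drop(t(chol(K)) %*% rnorm(n_t))          # one path of omega_i(t)
  eps   <- rnorm(n_t, sd = 0.2)                     # residual error
  data.frame(id = i, time = times, X,
             Y = f(X) + drop(Z %*% beta) + omega + eps)
}))

head(sim)
```

Each row of `sim` is one measurement, so the columns map directly onto the terms of the equation: `x1` and `x2` feed f, `time` and the constant 1 form Z, and the serially correlated part comes from `omega`.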

Usage

MERT(X, Y, id, Z, iter = 100, time, sto, delta = 0.001)

Arguments

X

[matrix]: An N x p matrix containing the p predictors of the fixed effects; each column codes for one predictor.

Y

[vector]: A vector containing the output trajectories.

id

[vector]: The vector of identifiers for the different trajectories.

Z

[matrix]: An N x q matrix containing the q predictors of the random effects.

iter

[numeric]: Maximum number of iterations of the algorithm. The default is iter=100.

time

[vector]: The vector of measurement times associated with the trajectories in Y, Z and X.

sto

[character]: Defines the covariance function of the stochastic process. Can be "none" for no stochastic process, "BM" for Brownian motion, "OrnUhl" for a standard Ornstein-Uhlenbeck process, "BBridge" for a Brownian bridge, or "fbm" for fractional Brownian motion; it can also be a function defined by the user.

delta

[numeric]: The algorithm stops when the difference in log-likelihood between two iterations is smaller than delta. The default value is 0.001.
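The interplay between iter and delta can be illustrated with a toy loop in base R. This is only a sketch of the stopping rule as described above, assuming an absolute difference between successive log-likelihood values; the package's exact criterion may differ. The helper run_until_converged and the sequence toy_loglik are hypothetical names introduced for this illustration and are not part of LongituRF.

```r
# Toy illustration of the stopping rule: iterate until the log-likelihood
# changes by less than delta, or until iter iterations have been run.
run_until_converged <- function(update_loglik, iter = 100, delta = 0.001) {
  loglik <- numeric(0)
  for (k in seq_len(iter)) {
    loglik[k] <- update_loglik(k)
    if (k > 1 && abs(loglik[k] - loglik[k - 1]) < delta) break
  }
  loglik
}

# Hypothetical log-likelihood sequence that increases towards -100.
toy_loglik <- function(k) -100 - 50 * 0.5^k

trace <- run_until_converged(toy_loglik)
length(trace)  # number of iterations actually run

# A small iter caps the run before the delta criterion is met.
capped <- run_until_converged(toy_loglik, iter = 5)
length(capped)
```

A smaller delta demands a flatter log-likelihood before stopping, while iter bounds the run time when the criterion is never met.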

Value

A fitted (S)MERT model, which is a list of the following elements:

  • forest: Tree obtained at the last iteration.

  • random_effects: Predictions of random effects for different trajectories.

  • id_btilde: Identifiers of individuals associated with the predictions in random_effects.

  • var_random_effects: Estimation of the variance-covariance matrix of random effects.

  • sigma_sto: Estimation of the volatility parameter of the stochastic process.

  • sigma: Estimation of the residual variance parameter.

  • time: The vector of the measurement times associated with the trajectories in Y, Z and X.

  • sto: Stochastic process used in the model.

  • Vraisemblance: Log-likelihood of the different iterations.

  • id: Vector of the identifiers for the different trajectories.

Examples

# NOT RUN {
set.seed(123)
data <- DataLongGenerator(n=20) # Generate the data composed by n=20 individuals.
# Train a SMERT model on the generated data. Should take ~ 50 seconds
# The data are generated with a Brownian motion,
# so we use the parameter sto="BM" to specify a Brownian motion as stochastic process
smert <- MERT(X=data$X,Y=data$Y,Z=data$Z,id=data$id,time=data$time,sto="BM")
smert$forest # is the fitted tree (obtained at the last iteration).
smert$random_effects # are the predicted random effects for each individual.
smert$omega # are the predicted stochastic processes.
plot(smert$Vraisemblance) #evolution of the log-likelihood.


# }
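The example above relies on simulated data. For data stored in long format (one row per individual and measurement time), the inputs could be assembled roughly as follows. The data frame and its column names are hypothetical, and the choice of a random intercept plus a slope in time for Z is just one possible design; the MERT call itself is left commented out because it needs LongituRF installed.

```r
# Hypothetical long-format data: one row per individual and measurement time.
long <- data.frame(
  subject = rep(c("a", "b", "c"), each = 4),
  t       = rep(c(0.25, 0.5, 0.75, 1), times = 3),
  x1      = (1:12) / 10,
  x2      = rep(c(1, 0), length.out = 12),
  y       = seq(2, 13)
)

X    <- as.matrix(long[, c("x1", "x2")])  # N x p fixed-effect predictors
Y    <- long$y                            # length-N output vector
id   <- long$subject                      # length-N trajectory identifiers
time <- long$t                            # length-N measurement times
Z    <- cbind(1, time)                    # N x q: random intercept and slope in time

# All inputs must describe the same N rows, in the same order.
stopifnot(nrow(X) == length(Y), length(id) == length(Y),
          length(time) == length(Y), nrow(Z) == length(Y))

# With LongituRF installed, the fit would then be (not run here):
# fit <- MERT(X = X, Y = Y, id = id, Z = Z, time = time, sto = "BM")
```

Checking the dimensions before fitting catches the most common mistake, which is building X, Z, id and time from differently sorted or filtered copies of the data.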
