Qlearning: Q-learning

Description

This function implements multiple-stage Q-learning.
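As a rough illustration of what multiple-stage Q-learning does, the sketch below runs backward induction over two stages with ordinary least squares in base R. It is not the package's implementation: the data, the helper names (`fit_stage`, `q_value`), the treatment coding of -1/1, and the regression form with treatment-by-feature interactions are all assumptions made for the example.

```r
# Two-stage Q-learning by backward induction (base R only, illustrative).
# All object and function names below are hypothetical.
set.seed(1)
n  <- 200
X  <- matrix(rnorm(n * 3), n, 3)             # feature matrix shared by both stages
A1 <- sample(c(-1, 1), n, replace = TRUE)    # stage-1 treatments (assumed coding)
A2 <- sample(c(-1, 1), n, replace = TRUE)    # stage-2 treatments
R1 <- X[, 1] * A1 + rnorm(n)                 # stage-1 outcome
R2 <- X[, 2] * A2 + rnorm(n)                 # stage-2 outcome

# Least-squares fit of Q(x, a) = b0 + x'b + a * (c0 + x'c).
fit_stage <- function(X, A, Y) {
  design <- cbind(1, X, A, X * A)
  qr.solve(design, Y)                        # least-squares coefficients
}

# Predicted Q-value for every subject under a fixed treatment a.
q_value <- function(coefs, X, a) {
  drop(cbind(1, X, a, X * a) %*% coefs)
}

# Stage 2 (last stage): regress the final outcome on features and treatment.
fit2  <- fit_stage(X, A2, R2)
q2_p  <- q_value(fit2, X, 1)
q2_m  <- q_value(fit2, X, -1)
rule2 <- ifelse(q2_p >= q2_m, 1, -1)         # estimated stage-2 rule

# Stage 1: the pseudo-outcome adds the best achievable stage-2 value.
Y1    <- R1 + pmax(q2_p, q2_m)
fit1  <- fit_stage(X, A1, Y1)
rule1 <- ifelse(q_value(fit1, X, 1) >= q_value(fit1, X, -1), 1, -1)
```

Working backward from the last stage is what lets the stage-1 regression account for the value of acting well at stage 2.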

Usage

Qlearning(X,AA,RR,K,pentype="lasso",m=4)

Arguments

X

either a single feature matrix shared among all stages, or a list of feature matrices, one per stage; matrices from different stages may have different dimensions.

AA

a list of length K; each element AA[[i]] is the vector of treatment assignments for stage i.

RR

a list of length K; each element RR[[i]] is the outcome vector for stage i.

K

number of stages.

pentype

the type of regression used in Q-learning; the default is 'lasso', the other choice is 'LSE'.

m

number of cross-validation folds passed to cv.glmnet in the regression model when 'lasso' is selected.
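To make the two regression choices concrete, the sketch below fits one stage-style regression both ways: ordinary least squares as an analogue of 'LSE', and a cross-validated lasso via cv.glmnet with 4 folds as an analogue of 'lasso' with m=4. The data, the design matrix with treatment interactions, and the use of lambda.min are assumptions for illustration; how Qlearning builds its model internally is not shown in this page. The lasso part runs only if the glmnet package is installed.

```r
# Illustrative comparison of least squares and cross-validated lasso.
# Data and object names are hypothetical.
set.seed(2)
n <- 200
X <- matrix(rnorm(n * 5), n, 5)
A <- sample(c(-1, 1), n, replace = TRUE)     # assumed -1/1 treatment coding
R <- X[, 1] * A + rnorm(n)

# Features, treatment, and treatment-by-feature interactions.
design <- cbind(X, A, X * A)
colnames(design) <- c(paste0("x", 1:5), "a", paste0("x", 1:5, ":a"))

# Analogue of pentype = "LSE": ordinary least squares.
lse_coef <- coef(lm(R ~ design))

# Analogue of pentype = "lasso" with m = 4 cross-validation folds.
if (requireNamespace("glmnet", quietly = TRUE)) {
  cvfit <- glmnet::cv.glmnet(design, R, nfolds = 4)
  # Coefficients at the penalty minimizing CV error (assumed choice).
  lasso_coef <- as.matrix(coef(cvfit, s = "lambda.min"))
}
```

The lasso fit can shrink some coefficients exactly to zero, which may help when many features are noise, as in the simulated example at the end of this page.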

Value

Returns a list of K models with class 'qlearn'.

References

Watkins, C. J. C. H. (1989). Learning from delayed rewards (Doctoral dissertation, University of Cambridge).

Murphy, S. A., Oslin, D. W., Rush, A. J., & Zhu, J. (2007). Methodological challenges in constructing effective treatment sequences for chronic psychiatric disorders. Neuropsychopharmacology, 32(2), 257-262.

Zhao, Y., Kosorok, M. R., & Zeng, D. (2009). Reinforcement learning design for cancer clinical trials. Statistics in medicine, 28(26), 3294.

Examples

# NOT RUN {
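library(DTRlearn)  # assumed source of Qlearning and make_2classification
# Simulation settings: clusters, informative features, noise features
n_cluster=10
pinfo=10
pnoise=20
# Simulate a two-stage training set with 200 subjects
example2=make_2classification(n_cluster,pinfo,pnoise,200)
# Simulate a test set that reuses the training centroids
test=make_2classification(n_cluster,pinfo,pnoise,200,example2$centroids)
# Stage-wise vectors of ones; not passed to Qlearning below
pi=list()
pi[[2]]=pi[[1]]=rep(1,200)
# Fit two-stage Q-learning with the default lasso regression
modelQ=Qlearning(example2$X,example2$A,example2$R,2)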
# }