SSL (version 0.1)

sslSelfTrain: Self-Training

Description

Semi-supervised classification by self-training, using extreme gradient boosting as the base classifier.

Usage

sslSelfTrain(xl, yl, xu, n = 10, nrounds, ...)

Arguments

xl
an n * p matrix or data.frame of labeled data
yl
an n * 1 integer vector of labels (starting from 1)
xu
an m * p matrix or data.frame of unlabeled data
n
number of unlabeled examples to add to the labeled data in each iteration
nrounds
the maximum number of iterations; see xgb.train for details
...
other parameters

Value

an m * 1 integer vector containing the predicted labels of the unlabeled data.

Details

In self-training, a classifier is first trained on the small amount of labeled data using extreme gradient boosting. The classifier is then used to classify the unlabeled data. In each iteration, the n most confident unlabeled points, together with their predicted labels, are added to the training set. The classifier is then re-trained and the procedure repeats.
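The loop described above can be sketched in base R. This is only an illustration, not the package's implementation: the function name `self_train_sketch` and the argument `max_iter` are hypothetical, and a simple nearest-centroid classifier stands in for extreme gradient boosting so that the example runs without extra packages. How the package scores confidence, when it stops, and how it labels leftover points are assumptions here.

```r
# Hypothetical helper: self-training with a nearest-centroid base classifier.
# sslSelfTrain uses extreme gradient boosting; this stand-in only shows the loop.
self_train_sketch <- function(xl, yl, xu, n = 10, max_iter = 50) {
  xl <- as.matrix(xl)
  xu <- as.matrix(xu)
  m <- nrow(xu)
  pred <- integer(m)        # predicted label for every unlabeled row
  remaining <- seq_len(m)   # rows of xu not yet added to the training set
  classes <- sort(unique(yl))

  classify <- function(xtrain, ytrain, xnew) {
    # One centroid per class (rows = classes, columns = features)
    centroids <- do.call(rbind, lapply(classes, function(k)
      colMeans(xtrain[ytrain == k, , drop = FALSE])))
    # Squared Euclidean distance from each new point to each centroid
    d <- sapply(seq_along(classes), function(j)
      rowSums(sweep(xnew, 2, centroids[j, ])^2))
    d <- matrix(d, nrow = nrow(xnew))
    # Softmax over negative distances as a rough confidence score (assumption)
    p <- exp(-(d - apply(d, 1, min)))
    p <- p / rowSums(p)
    list(label = classes[max.col(p, ties.method = "first")],
         conf = apply(p, 1, max))
  }

  for (iter in seq_len(max_iter)) {
    if (length(remaining) == 0) break
    fit <- classify(xl, yl, xu[remaining, , drop = FALSE])
    # Select the n most confident unlabeled points
    top <- order(fit$conf, decreasing = TRUE)[seq_len(min(n, length(remaining)))]
    picked <- remaining[top]
    pred[picked] <- fit$label[top]
    # Move them, with their predicted labels, into the training set
    xl <- rbind(xl, xu[picked, , drop = FALSE])
    yl <- c(yl, fit$label[top])
    remaining <- setdiff(remaining, picked)
  }

  # Label any points left over with the final classifier (assumption)
  if (length(remaining) > 0) {
    pred[remaining] <- classify(xl, yl, xu[remaining, , drop = FALSE])$label
  }
  pred
}

# Usage on iris, mirroring the setup in the Examples section
known <- c(1:20, 51:70, 101:120)
x <- iris[, 1:4]
yu <- self_train_sketch(x[known, ], rep(1:3, each = 20), x[-known, ], n = 30)
table(predicted = yu, actual = as.integer(iris$Species)[-known])
```

With n = 30 and 90 unlabeled rows, the sketch finishes in three iterations. A smaller n adds points more cautiously, at the cost of more re-training rounds.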

References

Rosenberg, C., Hebert, M., & Schneiderman, H. (2005). Semi-supervised self-training of object detection models. Seventh IEEE Workshop on Applications of Computer Vision.

See Also

xgb.train

Examples

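library(SSL)
data(iris)
xl <- iris[, 1:4]
# Suppose we know the first twenty observations of each class
# and want to predict the remaining ones with self-training.
# Labels: 1 = setosa, 2 = versicolor, 3 = virginica
yl <- rep(1:3, each = 20)
known.label <- c(1:20, 51:70, 101:120)
xu <- xl[-known.label, ]  # unlabeled data: the remaining 90 rows
xl <- xl[known.label, ]   # labeled data: the 60 known rows
# Add 30 unlabeled points to the labeled data per iteration
yu <- sslSelfTrain(xl, yl, xu, nrounds = 100, n = 30)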
