Learn R Programming

probsvm (version 1.00)

probsvm: Main function that provides models for multiclass conditional probability estimation and label prediction

Description

The function uses x and y to build a multiclass prediction model. The multiclass method can be either one-versus-one (ovo), or one-versus-rest (ovr, default). The probability estimation method for a binary classifier is proposed in Wang et al. (2008), and then we use the method proposed in Wu et al. (2004) to combine the results from binary classifiers in the ovo method. For the ovr method, we rescale those binary probabilities for multiclass problems. The function automatically chooses the best penalty parameter via 5-fold (default) Cross Validations (CV). Linear kernel, polynomial kernel and radial(Gaussian) kernel are available. A solution path is provided by Shin et al. (2012), which boosts the computational speed.

Usage

probsvm(x, y, fold=5, kernel=c("linear","polynomial","radial"), kparam=NULL, Inum=20, type="ovr", lambdas=2^(-10:10))

Arguments

x
The x matrix/data.frame for the training dataset. Columns represent the covariates, and rows represent the instances. There should be no NA/NaN values in x.
y
The labels for the training dataset.
fold
Number of folds in CV. Default 5.
kernel
Type of kernel used for learning. kernel="linear" for linear kernel (default), kernel="polynomial" for polynomial kernel, and kernel="radial" for radial(gaussian) kernel.
kparam
The parameter for the kernel. For linear kernel, this argument is not needed. In polynomial kernel, it represents the degree of the polynomials, and in radial(gaussian) kernel, it represents the usual sigma value.
Inum
Number of knots on [0,1] to estimate the class conditional probability. The larger Inum is, the more accurate the final result is, yet the more time it takes to compute. Default 20.
type
The type of multiclass method. The option ovo is for the one-versus-one method, and ovr is for the one-versus-rest method (default).
lambdas
The user-specified lambda value vector. Each element should be positive. Default 2^(-10:10).

Value

All arguments
All arguments are returned.
lambdas
The lambda values used for selecting the best model. Sorted in an increasing order.
best.lambda
The best lambda values selected from CV. Used for model prediction.
call
The call of probsvm.

References

Shin, S.J., Y. Wu, and H.H. Zhang (2012). Two-Dimensional Solution Surface for Weighted Support Vector Machines, Journal of Computational and Graphical Statistics, in press.

Wang, J., X. Shen, and Y. Liu (2008). Probability estimation for large margin classifiers. Biometrika 95(1), 149-167.

Wu, T.-F., C.-J. Lin, and R. C. Weng (2004). Probability estimates for multi-class classification by pairwise coupling. Journal of Machine Learning Research 5, 975-1005.

See Also

predict.probsvm

Examples

Run this code
# iris data #

data(iris)

iris.x=iris[c(1:20,51:70,101:120),-5]  
 
iris.y=iris[c(1:20,51:70,101:120),5]

iris.test=iris[c(21:50,71:100,121:150),-5]  
 
a = probsvm(iris.x,iris.y,type="ovo",
	Inum=10,fold=2,lambdas=2^seq(-10,10,by=3))
predict(a, iris.test)

Run the code above in your browser using DataLab