Utilities: Utility functions

Description

Mixed utility functions to compute accuracy, norms, labels from scores and to perform stratified cross-validation.

Usage

compute.acc(pred, labels)
compute.F(pred, labels)
norm1(x)
Unit.sphere.norm(K)
do.stratified.cv.data(examples, positives, k = 5, seed = NULL)
do.cv.data(examples, positives, k = 5, seed = NULL)
labelsfromscores(scores, thresh)
Multiple.labels.from.scores(S, thresh.vect)
selection.test(pos.scores, av.scores, ind.positives, alpha = 0.05, thresh.pos = 0)

Value

compute.acc returns the accuracy

compute.F returns the F-score

norm1 returns the L1-norm value

Unit.sphere.norm returns the kernel normalized according to the unit sphere

do.stratified.cv.data returns a list with 2 two components:

fold.non.positives: a list with k components. Each component is a vector with the indices of the non positive elements of the fold
fold.positives: a list with k components. Each component is a vector with the indices of the positive elements of the fold

Indices refer to row numbers of the data matrix

do.cv.data returns a list with 2 two components:

fold.non.positives: a list with k components. Each component is a vector with the indices of the non positive elements of the fold
fold.positives: a list with k components. Each component is a vector with the indices of the positive elements of the fold

Indices refer to row numbers of the data matrix

labelsfromscores returns a numeric vector res with 0 or 1 values. The label res[i]=1 if scores[i]>thresh, otherwise res[i]=0

Multiple.labels.from.scores returns a binary matrix with the labels of the predictions. Rows represent examples, columns classes. Element L[i,j] is the label of example i w.r.t. class j. L[i,j]=1 if i belongs to j, 0 otherwise.

selection.test returns a list with 5 components:

selected: a named vector with the components of av.scores selected by the test
selected.labeled: a named vector with the labeled components of av.scores selected by the test
selected.unlabeled: a named vector with the unlabeled components of av.scores selected by the test
thresh: the score threshold selected by the test
alpha: significance level (the same value of the input)

Arguments

pred: vector of the predicted labels
labels: vector of the true labels. Note that 0 stands for negative and 1 for positive. In general the first level is negative and the second positive
x: numeric vector
K: a kernel matrix
examples: indices of the examples (a vector of integer)
positives: vector of integer. Indices of the positive examples. The indices refer to the indices of examples
k: number of folds (def = 5)
seed: seed of the random generator (def=NULL). If is set to NULL no initiazitation is performed
scores: numeric. Vector of scores: each element correspond to the score of an example
thresh: real value. Threshold for the classification
S: numeric matrix. Matrix of scores: rows represent examples, columns classes
thresh.vect: numeric vector. Vector of the thresholds for multiple classes (one threshold for each class)
pos.scores: vector with scores of positive examples. It is returned from multiple.ker.score.cv.
av.scores: a vector with the average scores computed by multiple.ker.score.cv. It may be a named vector. If not, the names attributes corresponding to the indices of the vector are added.
ind.positives: indices of the positive examples. They are the indices of av.scores corresponding to positive examples.
alpha: quantile level (def. 0.05)
thresh.pos: only values larger than thresh.pos are retained in pos.scores (def.: 0)

Details

compute.acc computes the accuracy for a single class

compute.F computes the F-score for a single class

norm1 computes the L1-norm of a numeric vector

Unit.sphere.norm normalize a kernel according to the unit sphere

do.stratified.cv.data generates data for the stratified cross-validation. In particular subdivides the indices that refer to the rows of the data matrix in different folds (separated for positive and negative examples)

do.cv.data generates data for the vanilla not stratified cross-validation.

labelsfromscores computes the labels of a single class from the corresponding scores

Multiple.labels.from.scores computes the labels of multiple classes from the corresponding scores

selection.test is a non parametric test to select the most significant unlabeled examples

Examples

Run this code

# L1-norm of a vector
norm1(rnorm(10));
# generation of 5 stratified folds;
do.stratified.cv.data(1:100, 1:10, k = 5, seed = NULL);
# generation of labels form scores.
labelsfromscores(runif(20), thresh=0.3);

Run the code above in your browser using DataLab