propOverlap (version 1.0)

GMask: Producing Gene Masks.

Description

GMask produces the masks of features (genes). Each gene mask reports the samples that can unambiguously be assigned to their correct target classes by this gene.

Usage

GMask(ES, Core, Y)

Arguments

ES
gene (feature) matrix of dimension P by N, where P is the number of genes and N is the number of samples (observations).
Core
a data.frame of the core interval boundaries for both classes. It should have the same number of rows as ES and 4 columns: the minimum and the maximum of the first class's core interval, followed by the minimum and the maximum of the second class's core interval. See the value returned by CI.emprical.
Y
a vector of length N giving the samples' class labels.

Value

A P by N matrix of zeros and ones, in which each row is the mask of one gene.
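As a rough illustration of how such a 0/1 matrix can be read, the sketch below uses a small hand-made mask (toy values, not output of GMask) and base R only. Row sums count the samples each gene assigns unambiguously, and column sums reveal samples that no gene covers.

```r
# Toy mask with 2 genes and 4 samples (illustrative values, not GMask output).
mask <- rbind(geneA = c(1, 1, 0, 1),
              geneB = c(0, 0, 0, 1))

# Number of samples each gene assigns unambiguously.
covered_per_gene <- rowSums(mask)

# Indices of samples that no gene assigns unambiguously.
uncovered_samples <- which(colSums(mask) == 0)

print(covered_per_gene)
print(uncovered_samples)
```

The same two calls should apply unchanged to a matrix returned by GMask, since it has the same P by N layout.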

Details

GMask gives the gene masks, which represent the capability of genes to correctly classify each sample. Such a mask represents a gene's classification power. Each element of a mask is set to 1 if the corresponding sample (observation) can be unambiguously assigned to its correct target class by the considered gene, and to 0 otherwise.
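The rule described above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes that "unambiguously assigned" means the expression value lies inside the core interval of the sample's own class and outside the core interval of the other class. The helper name sketch_mask and the toy data are hypothetical.

```r
# Hypothetical helper (not part of propOverlap): builds a 0/1 mask from
# core intervals, ASSUMING a sample counts as unambiguous when its value
# is inside its own class's core interval and outside the other class's.
sketch_mask <- function(ES, Core, Y) {
  ES   <- as.matrix(ES)
  Core <- as.matrix(Core)
  classes <- sort(unique(Y))  # assumes exactly two class labels
  stopifnot(length(classes) == 2,
            nrow(Core) == nrow(ES),
            length(Y) == ncol(ES))
  # Length-P bound vectors recycle down each column, so row i of ES
  # is always compared with the bounds of gene i.
  in1 <- ES >= Core[, 1] & ES <= Core[, 2]
  in2 <- ES >= Core[, 3] & ES <= Core[, 4]
  # TRUE in every row of column j when sample j belongs to the first class.
  is1 <- matrix(Y == classes[1], nrow = nrow(ES), ncol = ncol(ES), byrow = TRUE)
  mask <- ifelse(is1, in1 & !in2, in2 & !in1)
  storage.mode(mask) <- "integer"
  mask
}

# Toy data: 2 genes, 6 samples, 3 samples per class.
ES   <- rbind(c(1, 2, 3, 8, 9, 10),
              c(1, 5, 6, 5, 6, 10))
Y    <- c(1, 1, 1, 2, 2, 2)
Core <- data.frame(min1 = c(1, 1), max1 = c(3, 6),
                   min2 = c(8, 5), max2 = c(10, 10))

mask <- sketch_mask(ES, Core, Y)
print(mask)
```

In this toy example the first gene's core intervals do not overlap, so every sample gets a 1. For the second gene the intervals overlap on [5, 6], so only the samples with values 1 and 10 get a 1.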

References

Mahmoud O., Harrison A., Perperoglou A., Gul A., Khan Z., Metodiev M. and Lausen B. (2014) A feature selection method for classification within functional genomics experiments based on the proportional overlapping score. BMC Bioinformatics, 15:274.

See Also

CI.emprical for the core interval boundaries.

Examples

data(leukaemia)
GenesExpression <- leukaemia[1:7129,] #define the features matrix
Class           <- leukaemia[7130,]   #define the observations' class labels
Gene.Masks      <- GMask(GenesExpression, CI.emprical(GenesExpression, Class), Class)
Gene.Masks[1:100,]                    #show the masks of the first 100 features