sc3: SC3 main function

Description

Runs the SC3 clustering pipeline and starts an interactive session in a web browser.

Usage

sc3(filename, ks = 3:7, cell.filter = FALSE, cell.filter.genes = 2000, gene.filter = TRUE, gene.filter.fraction = 0.06, log.scale = TRUE, d.region.min = 0.04, d.region.max = 0.07, interactivity = TRUE, show.original.labels = FALSE, svm = FALSE, svm.num.cells = NA, n.cores = NA, seed = 1)

Arguments

filename

either an R matrix / data.frame object OR a path to your input file containing an input expression matrix. The expression matrix must contain both colnames (cell IDs) and rownames (gene IDs).
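
As a minimal sketch of the input requirement above, the following base R snippet builds a toy expression matrix with gene IDs as rownames and cell IDs as colnames and checks that both are present. The helper has_cell_and_gene_ids is hypothetical and is not part of SC3.

```r
# Hypothetical helper (not an SC3 function): TRUE if the matrix carries
# both rownames (gene IDs) and colnames (cell IDs).
# Note: a data.frame always has automatic rownames, so this check is
# only informative for matrices.
has_cell_and_gene_ids <- function(expr) {
  !is.null(rownames(expr)) && !is.null(colnames(expr))
}

# Toy expression matrix: 3 genes (rows) x 2 cells (columns)
expr <- matrix(c(0, 3, 1, 7, 0, 2), nrow = 3,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               c("cell1", "cell2")))

named.ok <- has_cell_and_gene_ids(expr)            # matrix with IDs
unnamed.ok <- has_cell_and_gene_ids(unname(expr))  # same matrix, IDs stripped
```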

ks

a range of the numbers of clusters to be tested. The default is 3:7, i.e. a minimum of 3 and a maximum of 7 clusters.

cell.filter

defines whether to filter out cells that express fewer than cell.filter.genes genes (lowly expressed cells). By default it is FALSE. The cell filter should be used if the quality of the data is low, i.e. if one suspects that some of the cells may be technical outliers with poor coverage. Filtering of lowly expressed cells usually improves clustering.

cell.filter.genes

if cell.filter is used then this parameter defines the minimum number of genes that have to be expressed in each cell (expression value > 1e-2). If there are fewer, the cell will be removed from the analysis. The default is 2000.
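
The cell filter described above can be sketched in base R as follows. This is an illustration of the stated rule only, not SC3's internal code; the function name filter_cells and its argument names are assumptions made for the example.

```r
# Hypothetical helper (not an SC3 function): keep only the cells (columns)
# that express at least `min.genes` genes, where "expressed" means an
# expression value above `threshold`.
filter_cells <- function(expr, min.genes = 2000, threshold = 1e-2) {
  # Number of expressed genes in each cell
  genes.per.cell <- colSums(expr > threshold)
  expr[, genes.per.cell >= min.genes, drop = FALSE]
}

# Toy matrix: 4 genes (rows) x 3 cells (columns), filled column by column
expr <- matrix(c(5, 0, 3, 2,
                 0, 0, 0, 1,
                 4, 2, 0, 6),
               nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

# With a toy cutoff of 2 genes, cell2 (one expressed gene) is removed
filtered <- filter_cells(expr, min.genes = 2)
```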

gene.filter

defines whether to perform gene filtering or not. Boolean, default is TRUE.

gene.filter.fraction

fraction of cells (1 - X/100), default is 0.06. The gene filter removes genes that are either expressed or absent (expression value is less than 2) in at least X% of cells. The motivation for the gene filter is that ubiquitous and rare genes are usually not informative for the clustering.
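
A minimal base R sketch of this rule is shown below. It assumes that a gene counts as expressed in a cell when its value is at least 2 and as absent otherwise, and that "at least X% of cells" is an inclusive bound; SC3's internal implementation may differ in these details. The name filter_genes is hypothetical.

```r
# Hypothetical helper (not an SC3 function): drop genes that are expressed,
# or absent, in at least (1 - fraction) of all cells.
filter_genes <- function(expr, fraction = 0.06, threshold = 2) {
  # Assumption: "expressed" means value >= threshold, "absent" means below it
  frac.expressed <- rowMeans(expr >= threshold)
  frac.absent <- 1 - frac.expressed
  keep <- frac.expressed < 1 - fraction & frac.absent < 1 - fraction
  expr[keep, , drop = FALSE]
}

# Toy matrix: 3 genes x 10 cells
expr <- rbind(ubiquitous  = rep(10, 10),       # expressed in every cell
              rare        = rep(0, 10),        # absent in every cell
              informative = rep(c(0, 8), 5))   # expressed in half of the cells
colnames(expr) <- paste0("cell", 1:10)

# Only the gene with intermediate expression frequency survives
kept <- rownames(filter_genes(expr))
```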

log.scale

defines whether to perform log2 scaling or not. Boolean, default is TRUE.

d.region.min

the lower boundary of the optimum region of d, default is 0.04.

d.region.max

the upper boundary of the optimum region of d, default is 0.07.

interactivity

defines whether an interactive browser window should be opened after all computation is done. By default it is TRUE. This option can be used to separate clustering calculations from visualisation, e.g. long and time-consuming clustering of very large datasets can be run on a computing cluster and the visualisations can be done on a personal laptop afterwards. If interactivity is FALSE then all clustering results will be saved as the "sc3.interactive.arg" list. To run the interactive visualisation with the precomputed clustering results, use sc3_interactive(sc3.interactive.arg).

show.original.labels

if cell labels in the dataset are not unique, but represent clusters expected from the experiment, they can be visualised by setting this parameter to TRUE. The default is FALSE.

svm

if TRUE then an SVM prediction will be used. The default is FALSE.

svm.num.cells

number of training cells to be used for SVM prediction. The default is NA. If the svm parameter is TRUE and svm.num.cells is not provided, then the defaults of SC3 will be used: if the number of cells is more than 5000, then svm.num.cells = 1000, otherwise svm.num.cells = 20 percent of the total number of cells.
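
The default rule above can be written as a short base R function. This is a sketch of the documented rule only; the function name default_svm_num_cells is hypothetical, and rounding the 20 percent value to a whole number of cells is an assumption.

```r
# Hypothetical helper (not an SC3 function): default number of SVM training
# cells for a dataset with `n.cells` cells, following the documented rule.
default_svm_num_cells <- function(n.cells) {
  if (n.cells > 5000) {
    1000
  } else {
    # Assumption: 20 percent of the cells, rounded to a whole number
    round(0.2 * n.cells)
  }
}

large <- default_svm_num_cells(6000)  # more than 5000 cells
small <- default_svm_num_cells(500)   # 20 percent of 500 cells
```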

n.cores

defines the number of cores to be used on the user's machine. Default is NA.

seed

sets the seed for the random number generator, default is 1. Can be used to check the stability of clustering results: if the results are the same after changing the seed several times, then the clustering solution is stable.
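
The stability check described above can be illustrated with a small base R sketch. Here stats::kmeans is only a stand-in for the SC3 pipeline, and same_partition is a hypothetical helper that compares two label vectors while ignoring how the clusters are numbered.

```r
# Hypothetical helper: TRUE if two label vectors define the same partition,
# regardless of the cluster numbering.
same_partition <- function(a, b) {
  tab <- table(a, b)
  # Each cluster in `a` must map to exactly one cluster in `b`, and vice versa
  all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
}

# Toy data: two well-separated groups of 10 points each
x <- matrix(c(seq(0, 1, length.out = 10),
              seq(100, 101, length.out = 10)), ncol = 1)

# Cluster under several seeds (kmeans stands in for sc3 in this sketch)
labels.by.seed <- lapply(1:3, function(s) {
  set.seed(s)
  kmeans(x, centers = 2, nstart = 10)$cluster
})

# The solution is considered stable if every run agrees with the first one
stable <- all(vapply(labels.by.seed[-1], same_partition, logical(1),
                     b = labels.by.seed[[1]]))
```

With real data the same comparison would be applied to the cluster labels obtained from runs with different seed values.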

Value

Opens a browser window with an interactive Shiny app and visualizes all precomputed clusterings.

Examples

sc3(treutlein, 3:7, interactivity = FALSE, n.cores = 2)
