buildindex(basename,reference,gappedIndex=TRUE,indexSplit=TRUE,memory=8000,
TH_subread=100,colorspace=FALSE)
If gappedIndex is FALSE, 16mers (subreads) will be extracted from every chromosomal location of the reference genome and then used to build a hash table index. By default (TRUE), subreads are extracted every three bases from the genome.
If indexSplit is TRUE, the built index is allowed to be split into multiple segments. The number of such segments is determined by the memory value, the genome size and whether gaps are permitted between subreads (gappedIndex). If indexSplit is set to FALSE, a single-segment index (no splitting) will be generated regardless of the value chosen for memory.
If colorspace is TRUE, a color space index will be built. Otherwise, a base space index will be built.
If gappedIndex is set to FALSE, then subreads will be extracted from every chromosomal location of the genome for index building.
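The difference between extracting subreads at every location and extracting them every three bases can be illustrated with a small base R sketch. This is only a toy illustration of the sampling pattern, not the way Rsubread builds its index internally; the helper extract_subreads() and the toy sequence are hypothetical names introduced for this example.

```r
# Illustrative only: enumerate 16mers ("subreads") from a toy sequence.
# extract_subreads() is a hypothetical helper, not part of Rsubread.
extract_subreads <- function(genome, k = 16L, step = 1L) {
  n <- nchar(genome)
  if (n < k) {
    return(data.frame(pos = integer(0), subread = character(0),
                      stringsAsFactors = FALSE))
  }
  # Start positions of the extracted subreads (1-based).
  starts <- seq.int(1L, n - k + 1L, by = step)
  data.frame(pos = starts,
             subread = substring(genome, starts, starts + k - 1L),
             stringsAsFactors = FALSE)
}

toy <- "ACGTACGTTAGCCGATAGGCTTAACGT"

# One subread per position, analogous to gappedIndex=FALSE.
full <- extract_subreads(toy, step = 1L)

# One subread every three bases, analogous to gappedIndex=TRUE.
gapped <- extract_subreads(toy, step = 3L)

# The gapped set is a subset of the full set and roughly a third of its size.
nrow(full)
nrow(gapped)
```

Under these assumptions, the gapped extraction keeps fewer keys, which is why the choice of gappedIndex affects the size of the index.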
The built index can then be used by the Subread (align) and subjunc aligners to map reads (Liao et al. 2013). Highly repetitive subreads (or uninformative subreads) are excluded from the hash table so as to reduce mapping ambiguity.
TH_subread specifies the maximal number of times a subread is allowed to occur in the reference genome to be included in the hash table.
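The effect of such an occurrence threshold can be sketched in base R. This is a conceptual illustration only: filter_repetitive() is a hypothetical helper, the toy data use 4mers rather than 16mers for brevity, and the exact comparison Rsubread applies internally is assumed here to be "at most threshold occurrences", following the description above.

```r
# Illustrative only: drop k-mers that occur more than `threshold` times.
# filter_repetitive() is a hypothetical helper, not an Rsubread function.
filter_repetitive <- function(kmers, threshold) {
  counts <- table(kmers)
  # Keep only k-mers whose occurrence count does not exceed the threshold.
  keep <- names(counts)[counts <= threshold]
  kmers[kmers %in% keep]
}

# Toy 4mers (real subreads are 16mers); "AAAA" occurs three times.
kmers <- c("AAAA", "ACGT", "AAAA", "TTGA", "AAAA", "ACGT")

# With a threshold of 2, every copy of "AAAA" is excluded.
kept <- filter_repetitive(kmers, threshold = 2)
kept
```

A lower threshold removes more repetitive keys, while a higher one retains them.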
The built index might be split into multiple segments if its size is greater than the memory value.
The number of such segments depends on the memory value, the size of the reference genome and whether gaps are allowed between subreads extracted from the genome.
Only one segment is loaded into memory at any time when read alignment is being carried out.
The larger the memory value, the faster the read mapping will be.
If indexSplit is set to FALSE, the index will not be split, which enables maximum mapping speed to be achieved.
The index needs to be built only once and it can then be re-used in the subsequent alignments.
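As a hedged sketch of the indexSplit setting described above, the call below requests a single-segment index for the artificial reference shipped with the package. It uses only arguments shown in the usage line; the output location in tempdir() and the name idx_base are choices made for this example, and the block does nothing if Rsubread is not installed.

```r
# Sketch: build a single-segment (unsplit) index in a temporary directory.
# idx_base is an arbitrary output prefix chosen for this example.
idx_base <- file.path(tempdir(), "reference_index_single")

if (requireNamespace("Rsubread", quietly = TRUE)) {
  ref <- system.file("extdata", "reference.fa", package = "Rsubread")
  # indexSplit=FALSE requests one segment regardless of the memory value.
  Rsubread::buildindex(basename = idx_base, reference = ref,
                       indexSplit = FALSE)
}
```

The same basename can then be passed to the aligners in later sessions without rebuilding the index.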
# Build an index for the artificial sequence included in file 'reference.fa'
library(Rsubread)
ref <- system.file("extdata","reference.fa",package="Rsubread")
buildindex(basename="./reference_index",reference=ref)