This function computes the Fickett TESTCODE score of DNA sequences, as proposed by James W. Fickett (Fickett JW. 1982). The score can be calculated on the full sequence or on the longest ORF region.
compute_FickettScore(
  Sequences,
  label = NULL,
  on.ORF = FALSE,
  auto.full = FALSE,
  parallel.cores = 2
)
A dataframe.
A FASTA file loaded by the function read.fasta of the seqinr package.
Optional. String. The label of the sequences, such as "NonCoding" or "Coding".
Logical. If TRUE, the Fickett TESTCODE score will be calculated on the longest ORF region.
Logical. Applies when on.ORF = TRUE but no ORF can be found. If auto.full = TRUE, the Fickett TESTCODE score will be calculated on the full sequence automatically; if auto.full = FALSE, sequences that have no ORF will be discarded. Ignored when on.ORF = FALSE. (Default: FALSE)
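The interaction between on.ORF and auto.full can be illustrated with a minimal sketch in base R. The helpers longest_orf and select_region are hypothetical and are not part of the package; the sketch assumes upper-case DNA strings, searches the forward strand only, and treats an ORF as an in-frame ATG followed by a stop codon, which may differ from how the package detects ORFs.

```r
# Hypothetical helper: return the longest ORF (ATG ... stop codon, in frame) on
# the forward strand of an upper-case DNA string, or NA if none is found.
longest_orf <- function(dna) {
  stops <- c("TAA", "TAG", "TGA")
  best <- NA_character_
  for (frame in 0:2) {
    n_codons <- (nchar(dna) - frame) %/% 3
    if (n_codons < 2) next
    starts <- frame + 3 * (seq_len(n_codons) - 1) + 1
    codons <- substring(dna, starts, starts + 2)
    open_at <- NA_integer_                     # index of the current start codon
    for (i in seq_along(codons)) {
      if (is.na(open_at) && codons[i] == "ATG") {
        open_at <- i
      } else if (!is.na(open_at) && codons[i] %in% stops) {
        orf <- paste(codons[open_at:i], collapse = "")
        if (is.na(best) || nchar(orf) > nchar(best)) best <- orf
        open_at <- NA_integer_
      }
    }
  }
  best
}

# Hypothetical helper mirroring the documented on.ORF / auto.full behavior.
select_region <- function(seqs, on.ORF = FALSE, auto.full = FALSE) {
  if (!on.ORF) return(seqs)                    # auto.full is ignored here
  orfs <- vapply(seqs, longest_orf, character(1))
  no_orf <- is.na(orfs)
  if (auto.full) {
    orfs[no_orf] <- seqs[no_orf]               # fall back to the full sequence
    orfs
  } else {
    orfs[!no_orf]                              # discard sequences without an ORF
  }
}

seqs <- c(a = "CCATGAAATTTTAGCC", b = "CCCCCCCCC")
with_fallback <- select_region(seqs, on.ORF = TRUE, auto.full = TRUE)
without_fallback <- select_region(seqs, on.ORF = TRUE, auto.full = FALSE)
```

In this sketch, sequence b has no ORF, so it is kept in full when auto.full = TRUE and dropped when auto.full = FALSE.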
Integer. The number of cores for parallel computation. The default is 2. Set this to -1 to run the function with all available cores.
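One way a value such as -1 could be mapped to a worker count is sketched below. The helper resolve_cores is hypothetical and does not describe the package's internals; capping the request at the number of detected cores is an assumption made for this illustration. Only parallel::detectCores, which ships with base R, is used.

```r
# Hypothetical helper: map the documented parallel.cores values to a worker count.
resolve_cores <- function(parallel.cores = 2) {
  available <- parallel::detectCores()
  if (is.na(available)) available <- 1L        # detectCores() may return NA
  if (parallel.cores == -1) return(as.integer(available))
  if (parallel.cores < 1) stop("parallel.cores must be -1 or a positive integer")
  # Assumption of this sketch: never request more workers than detected cores.
  as.integer(min(parallel.cores, available))
}

n_default <- resolve_cores()      # the documented default of 2 (capped if fewer cores)
n_all <- resolve_cores(-1)        # all detected cores
```

The resulting count could then be passed to a cluster constructor such as parallel::makeCluster.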
HAN Siyu
This function computes the Fickett TESTCODE score proposed by James W. Fickett (Fickett JW. 1982).
The Fickett TESTCODE score is used as a feature by the methods CPAT (Wang et al. 2013) and CPC2 (Kang et al. 2017).
In CPAT, the score is calculated on the longest ORF region, whereas CPC2 calculates it on the full sequence.
The function compute_FickettScore improves on CPAT's code and can compute the score on the longest ORF region as well as on the full sequence.
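For orientation, the TESTCODE statistic of Fickett (1982) is built from eight raw values per sequence: a position asymmetry value and a content value for each of the four bases. The sketch below computes only these raw values in base R. The helper fickett_features is hypothetical and is not the package's implementation; the final score additionally converts each value to a coding probability through lookup tables and combines them with weights, and those tables are not reproduced here.

```r
# Hypothetical helper: the eight raw TESTCODE features of an upper-case DNA string.
fickett_features <- function(dna) {
  bases <- strsplit(dna, "")[[1]]
  phase <- (seq_along(bases) - 1) %% 3         # codon position 0, 1 or 2
  feats <- numeric(0)
  for (b in c("A", "C", "G", "T")) {
    # Count of base b at each of the three codon positions
    per_phase <- vapply(0:2, function(p) sum(bases == b & phase == p), numeric(1))
    # Position asymmetry: max count / (min count + 1)
    feats[paste0(b, "_position")] <- max(per_phase) / (min(per_phase) + 1)
    # Content: fraction of the sequence made up of base b
    feats[paste0(b, "_content")] <- sum(per_phase) / length(bases)
  }
  feats
}

features <- fickett_features("ATGAAATTTTAG")
```

Whether these values are taken from the full sequence or from the longest ORF is exactly the choice that on.ORF controls.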
James W. Fickett. Recognition of protein coding regions in DNA sequences. Nucleic Acids Research, 1982, 10(17):5303-5318.
Siyu Han, Yanchun Liang, Qin Ma, Yangyi Xu, Yu Zhang, Wei Du, Cankun Wang & Ying Li. LncFinder: an integrated platform for long non-coding RNA identification utilizing sequence intrinsic composition, structural information, and physicochemical property. Briefings in Bioinformatics, 2019, 20(6):2009-2027.
Liguo Wang, Hyun Jung Park, Surendra Dasari, Shengqin Wang, Jean-Pierre Kocher & Wei Li. CPAT: coding-potential assessment tool using an alignment-free logistic regression model. Nucleic Acids Research, 2013, 41(6):e74-e74.
Yu-Jian Kang, De-Chang Yang, Lei Kong, Mei Hou, Yu-Qi Meng, Liping Wei & Ge Gao. CPC2: a fast and accurate coding potential calculator based on sequence intrinsic features. Nucleic Acids Research, 2017, 45(W1):W12-W16.
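if (FALSE) {
  # Load the demo DNA sequences
  data(demo_DNA.seq)
  Seqs <- demo_DNA.seq

  # Score the longest ORF; fall back to the full sequence when no ORF is found
  FickettScore <- compute_FickettScore(Seqs, label = NULL, on.ORF = TRUE,
                                       auto.full = TRUE, parallel.cores = 2)
}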