GREP2 (version 1.0.2)

run_salmon: Quantify transcript abundances using Salmon

Description

run_salmon is a wrapper function that maps reads and quantifies transcript abundances using Salmon. You need to install Salmon and build an index before running this function. For index building, see the function 'build_index'.

Usage

run_salmon(srr_id, library_layout = c("SINGLE", "PAIRED"), index_dir,
  destdir, fastq_dir, use_trimmed_fastq = FALSE, other_opts = NULL,
  n_thread)

Arguments

srr_id

SRA run accession ID.

library_layout

layout of the library used. Either 'SINGLE' or 'PAIRED'.

index_dir

directory of the indexing files needed for read mapping using Salmon. See function 'build_index'.

destdir

directory where all the results will be saved.

fastq_dir

directory of the fastq files.

use_trimmed_fastq

logical, whether to use trimmed fastq files.

other_opts

other options to use. See Salmon documentation for the available options.

n_thread

number of cores to use.

Value

The following items will be returned and saved in the salmon directory:

  1. quant_new.sf: plain-text, tab-separated quantification file that contains 5 columns: Name, Length, EffectiveLength, TPM, and NumReads.

  2. cmd_info.json: A JSON format file that records the main command line parameters with which Salmon was invoked for the run that produced the output in this directory.

  3. aux_info: This directory will contain a number of files (and subfolders), depending on how Salmon was invoked.

  4. meta_info.json: A JSON file that contains meta information about the run, including stats such as the number of observed and mapped fragments, details of the bias modeling etc.

  5. ambig_info.tsv: This file contains information about the number of uniquely-mapping reads as well as the total number of ambiguously-mapping reads for each transcript.

  6. lib_format_counts.json: This JSON file reports the number of fragments that had at least one mapping compatible with the designated library format, as well as the number that didn't.

  7. libParams: The auxiliary directory will contain a text file called flenDist.txt. This file contains an approximation of the observed fragment length distribution.
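Since the quantification file is described as plain, tab-separated text with five named columns, it can be loaded with base R. The following is a minimal sketch, not part of GREP2: `read_quant` is a hypothetical helper, and the file written here is a made-up two-row stand-in so that the snippet runs without Salmon. The real location of quant_new.sf inside `destdir` is an assumption you should check against your own output.

```r
# Hypothetical helper (not a GREP2 function): read a Salmon-style
# quantification table into a data frame.
read_quant <- function(path) {
  read.delim(path, stringsAsFactors = FALSE)
}

# Illustrative stand-in file with invented values, written to a temp
# directory so the example is self-contained.
demo_file <- file.path(tempdir(), "quant_new.sf")
writeLines(c(
  "Name\tLength\tEffectiveLength\tTPM\tNumReads",
  "tx1\t1000\t850.5\t12.5\t40",
  "tx2\t2000\t1850.5\t0\t0"
), demo_file)

quant <- read_quant(demo_file)

# Keep only transcripts with non-zero estimated abundance.
expressed <- quant[quant$TPM > 0, ]
```

For real results, replace `demo_file` with the path to the quant_new.sf file produced by your run.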

Details

run_salmon uses the default options of Salmon. This function processes a single sample; to process multiple samples, call it in a loop. To pass other Salmon options, use 'other_opts'.
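The loop over samples could be sketched as follows. `quantify_runs` is a hypothetical helper, not part of GREP2. It takes the quantification function as an argument so the pattern can be tried without Salmon installed, and it assumes that a failure in one sample should not stop the remaining ones.

```r
# Hypothetical helper (not a GREP2 function): call a per-sample
# quantifier once for each SRA run accession.
quantify_runs <- function(srr_ids, quantify, ...) {
  results <- lapply(srr_ids, function(id) {
    tryCatch(
      quantify(srr_id = id, ...),
      error = function(e) {
        # Report the failure and continue with the next sample.
        message("Quantification failed for ", id, ": ", conditionMessage(e))
        NULL
      }
    )
  })
  names(results) <- srr_ids
  results
}

# Usage with GREP2 (not run: requires Salmon, a built index, and fastq
# files; the paths below are placeholders, not real locations):
# library(GREP2)
# srr_ids <- c("SRR5890521")  # add further run accessions here
# quantify_runs(srr_ids, quantify = run_salmon,
#               library_layout = "SINGLE",
#               index_dir = "path/to/index", destdir = "path/to/results",
#               fastq_dir = "path/to/fastq", use_trimmed_fastq = FALSE,
#               other_opts = NULL, n_thread = 2)
```

Passing the shared arguments through `...` keeps every sample on the same index and settings, so only the accession changes between calls.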

References

Rob Patro, Geet Duggal, Michael I. Love, Rafael A. Irizarry, and Carl Kingsford (2017): Salmon provides fast and bias-aware quantification of transcript expression. Nature Methods, 14(4), 417. https://www.nature.com/articles/nmeth.4197

Examples

# NOT RUN {
# You will have to build the index first to run this function
fastq_dir <- system.file("extdata", "", package = "GREP2")

build_index(species = "human", kmer = 31, ens_release = 92,
            destdir = tempdir())
run_salmon(srr_id = "SRR5890521", library_layout = "SINGLE",
           index_dir = tempdir(), destdir = tempdir(),
           fastq_dir = fastq_dir, use_trimmed_fastq = FALSE,
           other_opts = NULL, n_thread = 2)
# }