Usage
readVariantFiles(fileDir, sepSymbol = "_", fileID = "*_variants.txt",
  firstColName = "SEQ_ID", fileSep = "\t", idCols = 5,
  refPosCol = "Reference.Position", colToSort = "Coverage",
  removeDups = TRUE, returnMerged = TRUE, returnSing = FALSE,
  limitGenes = NULL, omitRefMatches = TRUE, refAlleleCol = "Reference$",
  varAlleleCol = "Allele")
Arguments
fileDir
The path to the directory containing all of the variant files.
sepSymbol
The symbol that separates the sample names from other info in
the file name.
Used to pull names for columns in the combined file.
Set to "" if the full file name should be used.
fileID
Character string used to limit which files are imported; regular expressions are allowed.
firstColName
What should the first column be renamed to?
Set to NULL or "" to leave the column as is.
Intended to standardize the name and to match the column names in other
parts of the analysis pipeline.
fileSep
The column delimiter used in the file (e.g. "," or "\t").
idCols
How many columns of position information are there?
Avoids including duplicated information in the combined output.
refPosCol
Which column has the reference position?
Can be numeric or character.
colToSort
Which column should be used to keep one line per position
if removeDups == TRUE?
Can be numeric or character.
removeDups
Logical, should duplicates at a position be removed?
This is necessary to avoid massive over-merging.
returnMerged
Logical, should the merged variants be returned?
returnSing
Logical, should each of the separate variant files be returned?
limitGenes
A character vector listing the genes to include.
This can be useful if your variant files include genes
that you are not interested in analyzing
(e.g. those without a BLAST hit).
omitRefMatches
Logical, should 'variants' which match the reference be excluded?
This is useful if your variant file includes rows for reads
aligning to the reference allele,
which could otherwise be selected as the main 'variant' by this function.
Defaults to TRUE.
refAlleleCol
Which column has the reference allele?
Can be numeric or character.
varAlleleCol
Which column has the variable alleles?
Can be numeric or character.
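Examples
The sketch below illustrates, in base R and on made-up data, what several of the arguments above describe: pulling sample names from file names with sepSymbol, dropping rows that match the reference (omitRefMatches), and keeping one line per position (removeDups with colToSort). It is not the function's actual implementation. The file names, the toy table, and the choice to keep the row with the highest colToSort value are all assumptions made for illustration.

```r
# Hypothetical file names; the text before sepSymbol is taken as the sample name.
fileNames <- c("sampleA_variants.txt", "sampleB_variants.txt")
sepSymbol <- "_"
sampleNames <- vapply(strsplit(fileNames, sepSymbol, fixed = TRUE),
                      `[`, character(1), 1)

# Toy variant table using the default column names from Usage (made-up values).
variants <- data.frame(
  SEQ_ID             = c("gene1", "gene1", "gene1", "gene1", "gene2"),
  Reference.Position = c(10, 10, 10, 25, 7),
  Reference          = c("A", "A", "A", "C", "G"),
  Allele             = c("A", "T", "G", "G", "G"),
  Coverage           = c(40, 12, 3, 30, 55),
  stringsAsFactors   = FALSE
)

# omitRefMatches = TRUE: drop rows whose allele equals the reference allele.
noRef <- variants[variants$Allele != variants$Reference, ]

# removeDups = TRUE with colToSort = "Coverage": sort by coverage and keep the
# first row per gene and position (assumption: the highest value is retained).
ordered <- noRef[order(noRef$Coverage, decreasing = TRUE), ]
isDup   <- duplicated(ordered[, c("SEQ_ID", "Reference.Position")])
deduped <- ordered[!isDup, ]

print(sampleNames)
print(deduped)

# A call using the documented arguments might look like this
# (the directory and gene names are placeholders, so it is left commented out):
# merged <- readVariantFiles(fileDir = "path/to/variants",
#                            limitGenes = c("gene1", "gene2"))
```

In this toy table, the reference-matching rows would otherwise win at gene1 position 10 and gene2 position 7 because they have the highest coverage, which is the situation omitRefMatches is meant to guard against.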