read.SnpSetIllumina(samplesheet, manifestpath=NULL, reportpath=NULL, rawdatapath=NULL, reportfile=NULL, briefOPAinfo=TRUE, readTIF=FALSE, nochecks=FALSE, sepreport="\t", essentialOnly=FALSE, ...)
Arguments
briefOPAinfo: if TRUE then only the SNP name, Illumi code, chromosome and
basepair position are put into the featureData slot of the result, else all
information from the OPA file is put into the featureData slot.
readTIF: use the beadarray package and raw TIF files to read data.
essentialOnly: if TRUE then only the essential columns from a reportfile are
included in the result. See Details.
...: passed on to readIllumina; can be used to perform bead-level
normalization.
Value
A SnpSetIllumina object, or a MultiSet object when nochecks is TRUE.
Details
The sample sheet should contain the columns Sample_Name, Sentrix_Position, and
Pool_ID. The values in columns Sample_Plate, Pool_ID, and Sentrix_ID should be
the same for all samples in the file, as this is the case for processed
experiments. The contents of the sample sheet are put into the phenoData slot.
Zero values in the raw data signals are set to NA.
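The sample sheet expectations described above can be checked before import. The following is a minimal sketch in base R, not code from beadarraySNP: check_samplesheet is a hypothetical helper, the example values (including the format of Sentrix_Position) are made up, and the sheet is assumed to have been read into a data frame already.

```r
# Hypothetical helper (not part of beadarraySNP): report problems in a
# sample sheet that has already been read into a data frame.
check_samplesheet <- function(sheet) {
  required <- c("Sample_Name", "Sentrix_Position", "Pool_ID")
  constant <- c("Sample_Plate", "Pool_ID", "Sentrix_ID")
  problems <- character(0)

  # Columns the documentation lists for the sample sheet
  missing_cols <- setdiff(required, names(sheet))
  if (length(missing_cols) > 0) {
    problems <- c(problems, paste("missing column:", missing_cols))
  }

  # Columns that should hold a single value for all samples
  for (col in intersect(constant, names(sheet))) {
    if (length(unique(sheet[[col]])) > 1) {
      problems <- c(problems, paste("column not constant:", col))
    }
  }
  problems
}

# Made-up example data, for illustration only
sheet <- data.frame(
  Sample_Name      = c("S1", "S2"),
  Sample_Plate     = "P1",
  Pool_ID          = "OPA4",
  Sentrix_ID       = "1234567",
  Sentrix_Position = c("R001_C001", "R001_C002"),
  stringsAsFactors = FALSE
)
check_samplesheet(sheet)
```

An empty result means none of the listed problems were found; it does not guarantee that read.SnpSetIllumina will accept the file.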
Ideally the OPA manifest file containing SNP annotation should be available;
these files are provided by Illumina. The columns IllCode, CHR, and MapInfo
are put into the featureData slot.
GenCall Data
To process experiments that were genotyped using the GenCall software, the
arrays should be scanned with the setting true in the Illumina configuration
file Settings.XML. Three types of files need to be present in the same folder:
the sample sheet, .csv files containing signal intensity data, and the report
file that contains the genotype information. For each sample in the sample
sheet there should be a .csv file with the following file mask:
[sam_id]_R00[yy]_C00[xx].csv, where sam_id is the Illumina ID for the SAM, and
xx and yy are the column and row number respectively. From the report files
the file with mask [Pool_ID]_LocusByDNA[_ExpName].csv is used. Pool_ID is the
OPA panel used, and _ExpName is optional.
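As an illustration of the two file masks, the sketch below assembles the expected file names in base R. Both functions are hypothetical helpers, not part of beadarraySNP, and the IDs are made up. It assumes the mask means the literal text R00 or C00 followed by a single-digit row or column number.

```r
# Hypothetical helpers (not part of beadarraySNP).

# Signal intensity file: [sam_id]_R00[yy]_C00[xx].csv
# Assumption: yy (row) and xx (column) are single-digit numbers.
gencall_signal_file <- function(sam_id, row, column) {
  sprintf("%s_R00%d_C00%d.csv", sam_id, as.integer(row), as.integer(column))
}

# Report file: [Pool_ID]_LocusByDNA[_ExpName].csv, where _ExpName is optional
gencall_report_file <- function(pool_id, exp_name = "") {
  suffix <- if (nzchar(exp_name)) paste0("_", exp_name) else ""
  paste0(pool_id, "_LocusByDNA", suffix, ".csv")
}

# Made-up identifiers, for illustration only
gencall_signal_file("1234567", row = 1, column = 2)
gencall_report_file("OPA4")
gencall_report_file("OPA4", "MyExperiment")
```

Comparing these names against list.files() for the data folder is one way to spot missing files before calling read.SnpSetIllumina.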
BeadStudio Data
To process experiments that were processed with BeadStudio, only two files are
needed: the sample sheet and the Final Report file. The sample sheet must
contain the same columns as for GenCall. The report file should contain the
following columns: SNP Name, Sample ID, GC Score, Allele1 - AB, Allele2 - AB,
GT Score, X Raw, and Y Raw. SNP Name and Sample ID are used to form rows and
columns in the experimental data, GC Score is put in the callProbability
matrix, Allele1 - AB and Allele2 - AB are combined into the call matrix,
GT Score is added to the featureData slot, X Raw is put in the R matrix and
Y Raw in the G matrix. Other columns in the report file are added as matrices
in the assayData slot, or as columns in the featureData slot if their values
are identical for all samples in the reportfile.
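The mapping from report columns to matrices can be pictured as a pivot from a long table (one row per SNP and sample) to SNP by sample matrices. The sketch below shows that idea in base R on made-up values. It is not the package's implementation: long_to_matrix is a hypothetical helper, and the zero-to-NA step only mirrors the statement that zero raw signals are set to NA.

```r
# Toy long-format report; all values are made up for illustration
report <- data.frame(
  "SNP Name"  = c("rs1", "rs2", "rs1", "rs2"),
  "Sample ID" = c("S1", "S1", "S2", "S2"),
  "GC Score"  = c(0.91, 0.85, 0.77, 0.88),
  "X Raw"     = c(1200, 300, 1100, 0),
  "Y Raw"     = c(150, 1400, 900, 1300),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: pivot one report column into a SNP x sample matrix
long_to_matrix <- function(df, value_col) {
  snps    <- unique(df[["SNP Name"]])
  samples <- unique(df[["Sample ID"]])
  m <- matrix(NA_real_, nrow = length(snps), ncol = length(samples),
              dimnames = list(snps, samples))
  # Fill each (SNP, sample) cell using a two-column index matrix
  idx <- cbind(match(df[["SNP Name"]], snps),
               match(df[["Sample ID"]], samples))
  m[idx] <- df[[value_col]]
  m
}

callProbability <- long_to_matrix(report, "GC Score")
R <- long_to_matrix(report, "X Raw")
G <- long_to_matrix(report, "Y Raw")

# Zero raw signals are treated as missing
R[R == 0] <- NA
G[G == 0] <- NA
```

Rows follow the order in which SNP names first appear in the report and columns the order of the sample IDs; the real function may order them differently.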
When nochecks is TRUE, only the SNP Name and Sample ID columns are required.
The resulting object is then of class MultiSet.
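The column requirements for a BeadStudio report file, with and without nochecks, can be summarised as a small lookup. This is a sketch in base R; both functions are hypothetical helpers rather than beadarraySNP functions, and header stands for the column names already read from a report file.

```r
# Hypothetical helpers (not part of beadarraySNP).

# Columns the documentation lists as required for a BeadStudio report file
required_report_columns <- function(nochecks = FALSE) {
  if (nochecks) {
    c("SNP Name", "Sample ID")
  } else {
    c("SNP Name", "Sample ID", "GC Score", "Allele1 - AB", "Allele2 - AB",
      "GT Score", "X Raw", "Y Raw")
  }
}

# Which required columns are absent from a report header?
missing_report_columns <- function(header, nochecks = FALSE) {
  setdiff(required_report_columns(nochecks), header)
}

# Made-up header, for illustration only
header <- c("SNP Name", "Sample ID", "GC Score", "X Raw", "Y Raw")
missing_report_columns(header)
missing_report_columns(header, nochecks = TRUE)
```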
Sample sheets
To help generate a sample sheet for BeadStudio data, a Sample_Map.txt file can
be converted to a sample sheet with the Sample_Map2Samplesheet function. For
BeadStudio reportfiles it is also possible to set samplesheet=NULL. In this
case the phenoData slot will be constructed from the sample names in the
reportfile.
Manifest/OPA/annotation files
For BeadStudio reportfiles it is not necessary to have a Manifest file if the
columns Chr and Position are available in the report file. Currently this is
the only way to import data from Infinium arrays, because Illumina does not
supply Manifest files for these arrays.
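Whether a report file can stand in for a Manifest file comes down to the presence of those two columns. A minimal base R sketch follows; has_position_columns is a hypothetical helper, not a beadarraySNP function, and the headers are made up.

```r
# Hypothetical helper (not part of beadarraySNP): TRUE when the report header
# carries the annotation columns that make a Manifest file unnecessary.
has_position_columns <- function(header) {
  all(c("Chr", "Position") %in% header)
}

# Made-up headers, for illustration only
has_position_columns(c("SNP Name", "Sample ID", "Chr", "Position"))
has_position_columns(c("SNP Name", "Sample ID", "Chr"))
```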
See Also
SnpSetIllumina-class, Sample_Map2Samplesheet, readIllumina
Examples
# read a SnpSetIllumina object using the example text files in the
# testdata directory of the package
library(beadarraySNP)
datadir <- system.file("testdata", package="beadarraySNP")
SNPdata <- read.SnpSetIllumina(file.path(datadir, "4samples_opa4.csv"), datadir)