read.maimages(files=NULL, source="generic", path=NULL, ext=NULL, names=NULL, columns=NULL, other.columns=NULL, annotation=NULL, green.only=FALSE, wt.fun=NULL, verbose=TRUE, sep="\t", quote=NULL, ...)
read.imagene(files, path=NULL, ext=NULL, names=NULL, columns=NULL, other.columns=NULL, wt.fun=NULL, verbose=TRUE, sep="\t", quote="\"", ...)
files: may be a data.frame containing a column FileName.
If omitted, then all files with extension ext in the specified directory will be read in alphabetical order.
source: one of "generic", "agilent", "agilent.median", "agilent.mean", "arrayvision", "arrayvision.ARM", "arrayvision.MTM", "bluefuse", "genepix", "genepix.custom", "genepix.median", "imagene", "imagene9", "quantarray", "scanarrayexpress", "smd.old", "smd", "spot" or "spot.close.open".
names: can be supplied as files$Label if files is a data.frame.
Defaults to removeExt(files).
columns: for two-color data, fields R, G, Rb and Gb giving the column names to be used for red and green foreground and background or, in the case of Imagene data, a list with fields f and b.
For single channel data, the fields are usually E and Eb.
This argument is optional if source is specified, otherwise it is required.
green.only: for use with source, should the green (Cy3) channel only be read, or are both red and green required?
verbose: TRUE to report each time a file is read.
...: passed to read.table.
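For single-channel data, an EListRaw object.
For two-color data, an RGList object containing the components R, G, Rb and Gb (red and green foreground and background intensities) together with:
weights: set if wt.fun is given.
other: matrices corresponding to other.columns, if given.
genes: set only if source is "agilent", "genepix" or "imagene", or if the annotation argument is set.
targets: data frame with column FileName giving the names of the files read. If files was a data.frame on input, then the whole data.frame is stored here on output.
printer: of class PrintLayout, currently set only if source="imagene".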
read.maimages reads either single channel or two-color microarray intensity data from text files.
read.imagene is specifically for two-color ImaGene intensity data created by ImaGene versions 1 through 8, and is called by read.maimages to read such data.
read.maimages is designed to read data from any microarray platform except for Illumina BeadChips, which are read by read.ilmn, and Affymetrix GeneChip data, which is best read and pre-processed by specialist packages designed for that platform.
read.maimages extracts the foreground and background intensities from a series of files, produced by an image analysis program, and assembles them into the components of one list.
The image analysis programs Agilent Feature Extraction, ArrayVision, BlueFuse, GenePix, ImaGene, QuantArray (Version 3 or later), Stanford Microarray Database (SMD) and SPOT are supported explicitly.
Almost all these programs write the intensity data for each microarray to one file.
The exception is ImaGene, early versions of which wrote the red and green channels of each microarray to different files.
Data from some other image analysis programs not mentioned above can be read if the appropriate column names containing the foreground and background intensities are specified using the columns argument.
(Reading custom columns will work provided the column names are unique and there are no rows in the file after the last line of data.
Header lines are ok.)
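As a rough illustration of the custom-column case, the sketch below writes a tiny tab-delimited file and reads it with source="generic". The column headings ("Red Signal" and so on), the ID column and the object names are hypothetical, chosen only for the example; the call to read.maimages is attempted only if limma is installed.

```r
# Hypothetical column headings, for illustration only; substitute the
# headings actually present in your image analysis output files.
cols <- list(R = "Red Signal", G = "Green Signal",
             Rb = "Red Background", Gb = "Green Background")

# Write a tiny tab-delimited file so the sketch is self-contained.
f <- tempfile(fileext = ".txt")
spots <- data.frame(ID = c("p1", "p2", "p3"),
                    `Red Signal` = c(1200, 850, 400),
                    `Green Signal` = c(900, 1100, 380),
                    `Red Background` = c(100, 95, 110),
                    `Green Background` = c(80, 85, 90),
                    check.names = FALSE)
write.table(spots, f, sep = "\t", quote = FALSE, row.names = FALSE)

# Custom columns are only read reliably when the headings are unique.
stopifnot(!anyDuplicated(names(spots)))

# Read the file with limma, if it is available.
if (requireNamespace("limma", quietly = TRUE)) {
  RG <- limma::read.maimages(f, source = "generic", columns = cols,
                             annotation = "ID")
  print(dim(RG$R))
}
```

Because source="generic" carries no built-in column names, the columns argument is required in this case.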
For Agilent files, two possible foreground estimators are supported: source="agilent.median" uses median foreground while source="agilent.mean" uses mean foreground.
Background estimates are always medians.
The use of source="agilent" defaults to "agilent.median".
Note that this behavior is new from 9 March 2012.
Previously, in limma 3.11.16 or earlier, "agilent" had the same meaning as "agilent.mean".
For GenePix files, two possible foreground estimators are supported as well as custom background: source="genepix.median" uses the median foreground estimates while source="genepix.mean" uses mean foreground estimates.
The use of source="genepix" defaults to "genepix.mean".
Background estimates are always medians unless source="genepix.custom" is specified.
GenePix 6.0 and later supply some custom background options, notably morphological background.
If the GPR files have been written using a custom background, then source="genepix.custom" will cause it to be read and used.
For SPOT files, two possible background estimators are supported: source="spot" uses background intensities estimated from the morphological opening algorithm.
If source="spot.close.open" then background intensities are estimated from morphological closing followed by opening.
ArrayVision reports spot intensities in a number of different ways.
read.maimages caters for ArrayVision's Artifact-removed (ARM) density values using source="arrayvision.ARM" or for Median-based Trimmed Mean (MTM) density values with "arrayvision.MTM".
ArrayVision users may find it useful to read the top two lines of their data file to check which version of density values they have.
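One simple way to inspect the top two lines of a file is readLines from base R. In the sketch below the temporary file and its contents are placeholders standing in for a real ArrayVision export, not an actual ArrayVision header.

```r
# Placeholder file standing in for an ArrayVision export; in practice set
# `f` to the path of a real data file instead.
f <- tempfile(fileext = ".txt")
writeLines(c("placeholder header line 1",
             "placeholder header line 2",
             "placeholder data line"), f)

# Read and print only the first two lines of the file.
top <- readLines(f, n = 2)
print(top)
```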
SMD data should consist of raw data files from the database, in tab-delimited text form.
There are two possible sets of column names depending on whether the data was entered into the database before or after September 2003.
source="smd.old" indicates that column headings in use prior to September 2003 should be used.
Intensity data from ImaGene versions 1 to 8 (source="imagene") is different from other image analysis programs in that the red and green channels were written to separate files.
read.maimages handles the special behaviour of the early ImaGene versions by requiring that the argument files be a matrix with two columns instead of a vector.
The first column should contain the names of the files containing green channel (Cy3) data and the second column should contain the names of files containing red channel (Cy5) data.
Alternatively, files can be entered as a vector of even length instead of a matrix.
In that case, each consecutive pair of file names is assumed to contain the green (Cy3) and red (Cy5) intensities respectively from the same array.
The function read.imagene is called by read.maimages when source="imagene", so read.imagene does not need to be called directly by users.
ImaGene version 9 (source="imagene9") reverts to the same behavior as the other image analysis programs.
For ImaGene 9, files is a vector of length equal to the number of microarrays, the same as for other image analysis programs.
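The two ways of supplying ImaGene 1 to 8 file names describe the same pairing, which can be sketched in base R as follows. The file names are hypothetical, and the reshaping is shown only to make the pairing explicit; it is not taken from the limma source.

```r
# Hypothetical file names: green (Cy3) first, then red (Cy5), array by array.
files <- c("array1_cy3.txt", "array1_cy5.txt",
           "array2_cy3.txt", "array2_cy5.txt")

# A vector of file names must have even length to form green/red pairs.
stopifnot(length(files) %% 2 == 0)

# Equivalent two-column matrix: column 1 holds the green channel files,
# column 2 the red channel files, with one row per array.
filemat <- matrix(files, ncol = 2, byrow = TRUE)
print(filemat)
```

Either files or filemat could then be passed as the files argument with source="imagene".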
Spot quality weights may be extracted from the image analysis files using a weight function wt.fun.
wt.fun may be any user-supplied function which accepts a data.frame argument and returns a vector of non-negative weights.
The columns of the data.frame are as in the image analysis output files.
There is one restriction: column names must be referred to in full in the weight function, i.e., do not rely on name expansion for partial matches when referring to the names of the columns.
See QualityWeights for suggested weight functions.
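A minimal sketch of a user-supplied weight function is given below. It assumes the output files contain a column named Flags in which negative values mark poor spots (as in GenePix files); the function name flag_weights and the down-weight of 0.1 are illustrative choices, not part of limma.

```r
# Sketch of a weight function: accepts a data.frame of image analysis
# output and returns one non-negative weight per spot.
# Assumption: a column called "Flags" exists and negative values mark
# poor-quality spots.
flag_weights <- function(x) {
  # [[ ]] matches the full column name exactly, so partial matching
  # is never relied upon.
  flags <- x[["Flags"]]
  if (is.null(flags)) stop("column 'Flags' not found")
  ifelse(flags < 0, 0.1, 1)
}

# Small made-up data.frame to exercise the function.
spots <- data.frame(Flags = c(0, -50, 0, -100))
w <- flag_weights(spots)
stopifnot(all(w >= 0))
```

Such a function would be passed as wt.fun=flag_weights in the call to read.maimages.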
The argument other.columns allows arbitrary columns of the image analysis output files to be preserved in the data object.
These become matrices in the other component.
For ImaGene data, the other column headings should be prefixed with "R " or "G " as appropriate.
Web pages for the image analysis software packages mentioned here are listed at http://www.statsci.org/micrarra/image.html
read.maimages uses read.columns for efficient reading of text files.
As far as possible, it has similar behavior to read.table in the base package.
read.ilmn reads probe or gene summary profile files from Illumina BeadChips.
An overview of LIMMA functions for reading data is given in 03.ReadingData.
# Read all .gpr files from current working directory
# and give weight 0.1 to spots with negative flags
## Not run:
files <- dir(pattern="\\.gpr$")
RG <- read.maimages(files, "genepix", wt.fun=wtflags(0.1))
## End(Not run)
# Read all .spot files from current working directory and down-weight
# spots smaller or larger than 150 pixels
## Not run:
files <- dir(pattern="\\.spot$")
RG <- read.maimages(files, "spot", wt.fun=wtarea(150))
## End(Not run)