aroma.apd (version 0.7.0)

readApd: Reads an Affymetrix probe data (APD) file

Description

Reads an Affymetrix probe data (APD) file.

Usage

# S3 method for default
readApd(filename, indices=NULL, readMap="byMapType", name=NULL, ..., verbose=FALSE,
  .checkArgs=TRUE)

Value

A named list with the two elements header and data. The header is in turn a list structure and the second is a numeric vector holding the queried data.

Arguments

filename

The filename of the APD file.

indices

An optional numeric vector of cell (probe) indices specifying what cells to read. If NULL, all are read.

readMap

A vector remapping cell indices to file indices. If "byMapType", the read map of the type given in the APD header is searched for and read. Specifying the read map explicitly is much faster than searching for it each time. If NULL, no map is used.

name

The name of the data field. If NULL, the APD header name is used; if the header does not specify one, it defaults to "intensities".

...

Not used.

verbose

See Verbose.

.checkArgs

If TRUE, arguments are validated, otherwise not.
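
The arguments above can be combined as in the following minimal sketch. It assumes the aroma.apd package is installed and that an APD file exists at the hypothetical path example.apd. The wrapper read_apd_cells() is an illustrative name, not part of the package; only readApd() and its documented arguments are taken from this page.

```r
# Hypothetical helper (not part of aroma.apd): read selected cells from an
# APD file and return only the numeric data vector.
read_apd_cells <- function(pathname, cells = NULL) {
  if (!requireNamespace("aroma.apd", quietly = TRUE)) {
    stop("Package 'aroma.apd' is required for this sketch.")
  }
  # readMap = NULL: no remapping, so cell indices are used as file indices.
  apd <- aroma.apd::readApd(pathname, indices = cells, readMap = NULL)
  # The result is a list with elements 'header' and 'data'.
  apd$data
}

pathname <- "example.apd"  # assumed path; replace with a real APD file
if (file.exists(pathname) && requireNamespace("aroma.apd", quietly = TRUE)) {
  values <- read_apd_cells(pathname, cells = 1:10)
  str(values)
}
```

Passing cells = NULL reads all cells, matching the default of the indices argument.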

Remapping indices

Argument readMap can be used to remap indices. For instance, the indices of the probes can be reordered such that the probes within a probeset form a contiguous set of probe indices. Then, given that the values are stored in that order, reading complete probesets accesses the data much faster than if the values were scattered all over the file.

Example of the speed improvement: reading all 40,000 values in units 1001 to 2000 of an Affymetrix Mapping 100K Xba chip is 10-30 times faster with a read map than without.
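
The idea behind a read map can be illustrated in base R without any APD file. This is only a toy sketch: the 12 cells, the 3 units, and all variable names are made up for illustration, and it assumes that a read map is a vector giving, for each cell index, the file index where its value is stored.

```r
# Toy layout (illustrative only): 12 cells whose units are interleaved,
# so the cells of one unit are scattered across the cell index space.
n <- 12L
unit <- rep(1:3, times = 4)          # unit of cell i: 1,2,3,1,2,3,...

# File order that stores the cells unit by unit.
cells_in_file_order <- order(unit)   # 1,4,7,10, 2,5,8,11, 3,6,9,12

# Read map: for each cell index, the file index holding its value.
read_map <- integer(n)
read_map[cells_in_file_order] <- seq_len(n)

# Cells belonging to unit 2 and where they live in the file.
cells <- which(unit == 2L)           # scattered cell indices: 2, 5, 8, 11
file_idx_mapped <- read_map[cells]   # contiguous file indices: 5, 6, 7, 8

print(cells)
print(file_idx_mapped)
```

With the map, all values of a unit sit next to each other in the file, so they can be fetched in one contiguous read instead of several scattered ones.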

File format

The file format of an APD file is identical to the file format of a FileVector.

Author

Henrik Bengtsson

Details

Reading one large contiguous block of elements is faster than reading individual elements one by one. For this reason, more elements than requested may be read internally, and therefore more memory than necessary may be allocated. In the worst case, \(N\) elements may be read, allocating \(N*8\) bytes of R memory, although only two elements are queried. However, to date, even the largest arrays from Affymetrix require only tens of megabytes of temporary memory. For instance, Affymetrix Mapping 100K arrays hold 2,560,000 probes, requiring about 20 MB of temporary memory.
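
The block-read strategy and its memory cost can be sketched in base R. This is not the actual readApd() implementation and does not use the APD/FileVector format: it assumes a plain temporary file of 8-byte doubles, and read_block_subset() is a hypothetical helper written only for this illustration.

```r
# Plain file of 8-byte doubles (NOT the APD format), for illustration only.
values <- as.double(1:1000)
tmp <- tempfile(fileext = ".bin")
writeBin(values, tmp, size = 8L)

# Hypothetical helper: read one contiguous block spanning the requested
# indices, then keep only the requested elements.
read_block_subset <- function(pathname, indices, size = 8L) {
  first <- min(indices)
  last <- max(indices)
  con <- file(pathname, open = "rb")
  on.exit(close(con))
  # Jump to the first requested element and read the whole span at once.
  seek(con, where = (first - 1) * size, origin = "start")
  block <- readBin(con, what = "double", n = last - first + 1, size = size)
  # The block is temporary; only the queried elements are returned.
  block[indices - first + 1]
}

# Only two elements are queried, but 998 doubles are read temporarily.
x <- read_block_subset(tmp, indices = c(2, 999))

# Worst-case temporary memory for 2,560,000 probes at 8 bytes each.
bytes <- 2560000 * 8   # 20,480,000 bytes, roughly 20 MB
unlink(tmp)
```

The trade-off is one seek and one read call in exchange for a temporary buffer that can be as large as the whole data vector.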

See Also

createApd() and updateApd(). See also readApdHeader(). To create a cell-index read map from a CDF file, see readCdfUnitsWriteMap.