xxt(snps, strata = NULL, correct.for.missing = FALSE, lower.only = FALSE, uncertain = FALSE)
"SnpMatrix"
factor
(or an object which can be coerced into
a factor
) with length equal to the number of rows of
snps
defining stratum membershipTRUE
, an attempt is made to
correct for the effect of missing data by use of inverse probability
weights. Otherwise, missing observations are scored zero in the
standardized matrixTRUE
, only the lower triangle of the
result is returned and the upper triangle is filled with
zeros. Otherwise, the complete symmetric matrix is returnedTRUE
, uncertain genotypes are replaced by
posterior expectations. Otherwise these are treated as missing
valuesSnpMatrix
whose columns have been standardized to zero mean and
unit variance. For autosomes, the genotypes are given codes 0, 1 or 2
after subtraction of the mean, 2p, are divided by the standard
deviation
sqrt(2p(1-p)) (p is the estimated allele
frequency). For SNPs on the X chromosome in male subjects,
genotypes are coded 0 or 2. Then
the mean is still 2p, but the standard deviation is
2sqrt(p(1-p)). If the strata
is supplied, a
stratum-specific estimate value for p is used for
standardization.
Missing observations present some difficulty. Price et al. (2006)
recommended replacing missing observations by their means, this being
equivalent to replacement by zeros in the standardized matrix. However
this results in a biased estimate of the complete data
result. Optionally this bias can be corrected by inverse probability
weighting. We assume that the probability that any one call is missing
is small, and can be predicted by a multiplicative model with row
(subject) and column (locus) effects. The estimated probability of a
missing value in a given row and column is then given by
$m = RC/T$, where R is the row total number of
no-calls, C is the column total of no-calls, and T is the
overall total number of no-calls. Non-missing contributions to
X.X-transpose are then weighted by $w=1/(1-m)$ for
contributions to the diagonal elements, and products of the relevant
pairs of weights for contributions to off--diagonal elements.
# make a SnpMatrix with a small number of rows
data(testdata)
small <- Autosomes[1:100,]
# Calculate the X.X-transpose matrix
xx <- xxt(small, correct.for.missing=TRUE)
# Calculate the principal components
pc <- eigen(xx, symmetric=TRUE)$vectors
Run the code above in your browser using DataLab