Includes the data generator for the simulation study on cell- and case-wise contamination that appears in Agostinelli et al. (2014).
generate.randcorr(cond, p, tol=1e-5, maxits=100)
generate.cellcontam(n, p, cond, contam.size, contam.prop, A=NULL)
generate.casecontam(n, p, cond, contam.size, contam.prop, A=NULL)
generate.randcorr gives a random correlation matrix of dimension p with condition number cond.
generate.cellcontam and generate.casecontam give a multivariate normal sample that is cell-wise or case-wise contaminated, respectively, as described in Agostinelli et al. (2014). The result is returned as a list with components
x | multivariate normal sample with cell- or case-wise contamination. |
u | n by p matrix of 0's and 1's, where a 1 marks an outlying cell. A row of 1's corresponds to a case-wise outlier. |
A | random correlation matrix with a specified condition number. |
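The calls above can be combined as in the following minimal sketch. It assumes the three functions are provided by the GSE package (the package name is not stated on this page), so the code is guarded and does nothing if that package is not installed. The values chosen for n, p, cond, contam.size and contam.prop are illustrative only.

```r
# Assumption: generate.randcorr, generate.cellcontam and generate.casecontam
# come from the GSE package; skip everything if it is not installed.
if (requireNamespace("GSE", quietly = TRUE)) {
  library(GSE)
  set.seed(1)

  # Random 5 x 5 correlation matrix with target condition number 100
  A <- generate.randcorr(cond = 100, p = 5)

  # Cell-wise contamination: 5% of cells, contamination size 5, reusing A
  cell <- generate.cellcontam(n = 100, p = 5, cond = 100,
                              contam.size = 5, contam.prop = 0.05, A = A)

  # Case-wise contamination: 10% of cases, contamination size 10, reusing A
  case <- generate.casecontam(n = 100, p = 5, cond = 100,
                              contam.size = 10, contam.prop = 0.10, A = A)

  dim(cell$x)                 # dimensions of the contaminated data matrix
  mean(cell$u)                # share of cells flagged as outliers
  sum(rowSums(case$u) == 5)   # number of rows flagged entirely as outliers
}
```

Passing the same A to both generators keeps the underlying correlation structure identical, so the two samples differ only in the type of contamination.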
cond | desired condition number of the random correlation matrix. The correlation matrix will be used to generate multivariate normal samples in generate.cellcontam and generate.casecontam. |
tol | tolerance level for the condition number of the random correlation matrix. Default is 1e-5. |
maxits | integer indicating the maximum number of iterations allowed for the condition number of the random correlation matrix to come within the tolerance level. Default is 100. |
n | integer indicating the number of observations to be generated. |
p | integer indicating the number of variables to be generated. |
contam.size | size of cell- or case-wise contamination. For cell-wise outliers, random cells in a data matrix are replaced by contam.size. For case-wise outliers, random cases in a data matrix are replaced by contam.size times \(v\), where \(v\) is as defined in Agostinelli et al. (2014). |
contam.prop | proportion of cell- or case-wise contamination. |
A | correlation matrix used for generating data. If A is NULL, a random correlation matrix is generated. Default is NULL. |
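The roles of cond, contam.size, contam.prop and A can be illustrated with a small base R sketch. This is not the package's implementation: the correlation matrix is a fixed hypothetical one rather than a randomly generated one, and \(v\) is assumed here to be the unit eigenvector belonging to the smallest eigenvalue of A, whereas the exact definition is the one given in Agostinelli et al. (2014).

```r
set.seed(42)
n <- 200; p <- 4
contam.size <- 5; contam.prop <- 0.1

# Hypothetical fixed correlation matrix standing in for A: entries 0.5^|i - j|
A <- 0.5^abs(outer(1:p, 1:p, "-"))

# Condition number of A: ratio of its largest to its smallest eigenvalue
ev <- eigen(A, symmetric = TRUE)
cond.A <- ev$values[1] / ev$values[p]

# Clean sample: rows are N(0, A), using the Cholesky factor R with t(R) %*% R = A
x <- matrix(rnorm(n * p), n, p) %*% chol(A)

# Cell-wise contamination: replace a random set of cells by contam.size
u.cell <- matrix(0, n, p)
u.cell[sample(n * p, round(contam.prop * n * p))] <- 1
x.cell <- x
x.cell[u.cell == 1] <- contam.size

# Case-wise contamination: replace random rows by contam.size * v
# Assumption: v is the unit eigenvector for the smallest eigenvalue of A
v <- ev$vectors[, p]
rows <- sample(n, round(contam.prop * n))
u.case <- matrix(0, n, p)
u.case[rows, ] <- 1
x.case <- x
x.case[rows, ] <- matrix(contam.size * v, length(rows), p, byrow = TRUE)
```

Here u.cell and u.case play the role of the returned component u: isolated 1's mark cell-wise outliers, while full rows of 1's mark case-wise outliers.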
Andy Leung andy.leung@stat.ubc.ca, Claudio Agostinelli, Ruben H. Zamar, Victor J. Yohai
Details about how the correlation matrix is randomly generated and how the contaminated data is generated can be found in Agostinelli et al. (2014).
Agostinelli, C., Leung, A., Yohai, V.J., and Zamar, R.H. (2014). Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination. arXiv:1406.6031 [math.ST]
TSGS