Simulates multiple linear regression data from the model used in the simulation
experiments reported in Shao's (1993) paper on cross-validation for model selection.
cross-covariance; must be less than 1 in magnitude
sig
residual standard deviation
Value
Data frame with n rows and p+1 columns.
The first p columns are labelled x1, ..., xp and the last column is y.
Details
In general the regression equation used for simulation is:
$$y = X \beta + \epsilon$$
where
$\beta$ is a vector of the regression coefficients of length p,
X is the design matrix with n rows and p columns and
$\epsilon$ is a vector of n independent normal random variables
with mean zero and standard deviation sig.
The rows of X are p-variate normal with mean vector zero and a p-by-p covariance
matrix whose (i,j)-entry is $\rho^{|i-j|}$.
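As an illustration of this covariance structure, take p = 4. The symbol $\Sigma$ for the covariance matrix is introduced here only for convenience and is not part of the original notation.
$$\Sigma = \begin{pmatrix} 1 & \rho & \rho^{2} & \rho^{3} \\ \rho & 1 & \rho & \rho^{2} \\ \rho^{2} & \rho & 1 & \rho \\ \rho^{3} & \rho^{2} & \rho & 1 \end{pmatrix}$$
Each column of X therefore has unit variance, and the covariance between two columns decays geometrically with the distance between their indices.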
Shao (1993) used the default settings in the arguments and n = 20, 60, 100
in simulation experiments with delete-d cross-validation.
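The model described above can be sketched in a few lines. The following is a minimal Python illustration using only the standard library, not the implementation behind this help page. The function name `simulate_regression` and the `seed` argument are hypothetical, and the coefficient vector in the usage lines is an arbitrary example rather than the documented default. The sketch relies on the fact that a stationary first-order autoregressive recursion with unit variance produces exactly the covariance with (i,j)-entry rho^|i-j|.

```python
import math
import random


def simulate_regression(n, beta, rho, sig, seed=None):
    """Simulate n rows from y = X beta + eps (hypothetical helper).

    Each row of X is p-variate normal with mean zero and covariance
    entries rho^|i-j|; eps is normal with mean 0 and std. dev. sig.
    Returns a list of dicts with keys x1, ..., xp, y.
    """
    if abs(rho) >= 1:
        raise ValueError("rho must be less than 1 in magnitude")
    rng = random.Random(seed)
    p = len(beta)
    # Innovation scale that keeps every column at unit variance.
    scale = math.sqrt(1.0 - rho * rho)
    rows = []
    for _ in range(n):
        x = []
        for j in range(p):
            z = rng.gauss(0.0, 1.0)
            # AR(1) recursion: cov(x_i, x_j) = rho^|i-j|.
            x.append(z if j == 0 else rho * x[-1] + scale * z)
        y = sum(b * xj for b, xj in zip(beta, x)) + rng.gauss(0.0, sig)
        row = {f"x{j + 1}": xj for j, xj in enumerate(x)}
        row["y"] = y
        rows.append(row)
    return rows


# Illustrative call; these coefficients are an assumption, not the defaults.
data = simulate_regression(n=20, beta=[2.0, 0.0, 0.0, 4.0, 0.0],
                           rho=0.5, sig=1.0, seed=1)
```

Each returned row mirrors the layout described under Value: p predictor entries labelled x1, ..., xp followed by y.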
References
Shao, J. (1993). Linear Model Selection by Cross-validation. Journal of the
American Statistical Association, 88(422), 486-494.