pfc(X, y, fy = NULL, numdir = NULL, structure = c("iso", "aniso",
"unstr", "unstr2"), eps_aniso = 1e-3, numdir.test = FALSE, ...)
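X: Design matrix with n rows of observations and p columns of predictors. The predictors are assumed to have a continuous distribution.
y: The response, with n observations, continuous or categorical.
fy: Basis function, obtained using bf or defined by the user. It is a function of y alone and has r independent column vectors. See bf for details.
numdir: The number of directions to estimate, bounded by r and p. By default numdir = $\min\{r,p\}$.
structure: Structure of var(X|Y). The following options are available: "iso" for isotropic (predictors, conditionally on the response, are independent and on the same measurement scale); "aniso" for anisotropic (predictors, conditionally on the response, are independent and on different measurement scales); "unstr" for unstructured; "unstr2" for the extended structure.
eps_aniso: Precision used in estimating var(X|Y) for the anisotropic structure.
numdir.test: Boolean. If FALSE, pfc fits with the numdir provided only. If TRUE, PFC models are fit for all dimensions less than or equal to numdir.
...: Additional arguments passed to Grassmannoptim.
Returns an object of class ldr. The output depends on the argument numdir.test.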
If numdir.test=TRUE, a list of matrices is provided corresponding to the numdir values (1 through numdir) for each of the parameters $\mu$, $\beta$, $\Gamma$, $\Gamma_0$, $\Omega$, and $\Omega_0$. Otherwise, a single list of matrices is returned for a single value of numdir. The outputs of loglik, aic, bic, and numpar are vectors of numdir elements if numdir.test=TRUE, and scalars otherwise. The components returned include the numdir largest eigenvalues of $\hat{\Sigma}_{\mathrm{fit}}$.
The isotropic (iso) PFC model takes $\Delta=\delta^2 I_p$, where, conditionally on the response, the predictors are independent and are on the same measurement scale. The sufficient reduction is $\Gamma^TX$.
The anisotropic (aniso) PFC model assumes that $\Delta=\mathrm{diag}(\delta_1^2, \ldots, \delta_p^2)$, where, conditionally on the response, the predictors are independent and on different measurement scales.
The unstructured (unstr) PFC model allows a general structure for $\Delta$. With the anisotropic and unstructured $\Delta$, the sufficient reduction is $\Gamma^T \Delta^{-1}X$. Note that $X \in R^{p}$, while the data matrix to use is in $R^{n \times p}$.
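The distinction between $X \in R^{p}$ and the $n \times p$ data matrix can be made concrete with a small base R sketch. The values of Gamma and Delta below are simulated purely for illustration (in practice they would be estimates from a fitted model), and the object names are assumptions, not part of the package.

```r
# Illustrative sketch with simulated values; Gamma and Delta are made up here.
set.seed(1)
n <- 50; p <- 4; d <- 2

X     <- matrix(rnorm(n * p), nrow = n, ncol = p)   # n x p data matrix
Gamma <- qr.Q(qr(matrix(rnorm(p * d), p, d)))       # p x d, orthonormal columns
Delta <- diag(c(0.5, 1, 2, 4))                      # anisotropic: diagonal Delta

# For a single observation x in R^p the reduction is t(Gamma) %*% solve(Delta) %*% x.
# Applied row-wise to the whole data matrix this becomes:
R_aniso <- X %*% solve(Delta) %*% Gamma             # n x d
R_iso   <- X %*% Gamma                              # n x d, isotropic case

# Check the first row against the single-observation formula.
r1 <- drop(t(Gamma) %*% solve(Delta) %*% X[1, ])
stopifnot(isTRUE(all.equal(r1, R_aniso[1, ])))
```

Each row of R_aniso is the reduced version of the corresponding observation, so the reduced data have d columns instead of p.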
The error covariance of the extended structure has the form
$$\Delta=\Gamma \Omega \Gamma^T + \Gamma_0 \Omega_0 \Gamma_0^T,$$
where $\Gamma_0$ is the orthogonal completion of $\Gamma$ such that $(\Gamma, \Gamma_0)$ is a $p \times p$ orthogonal matrix. The matrices $\Omega \in R^{d \times d}$ and $\Omega_0 \in R^{(p-d) \times (p-d)}$ are assumed to be symmetric and full-rank. The sufficient reduction is $\Gamma^{T}X$.
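One way to see why the reduction here is $\Gamma^{T}X$ rather than $\Gamma^T \Delta^{-1}X$: under this structure $\Delta^{-1}\Gamma = \Gamma\Omega^{-1}$, so $\Gamma^T \Delta^{-1}X = \Omega^{-1}\Gamma^{T}X$ is a one-to-one function of $\Gamma^{T}X$. The base R sketch below checks this identity numerically on randomly generated matrices; all values are made up for illustration.

```r
set.seed(2)
p <- 5; d <- 2

# Random p x p orthogonal matrix, split into Gamma (p x d) and Gamma0 (p x (p - d)).
Q      <- qr.Q(qr(matrix(rnorm(p * p), p, p)))
Gamma  <- Q[, 1:d, drop = FALSE]
Gamma0 <- Q[, (d + 1):p, drop = FALSE]

# Symmetric, full-rank Omega and Omega0 (positive definite by construction).
A      <- matrix(rnorm(d * d), d, d)
A0     <- matrix(rnorm((p - d)^2), p - d, p - d)
Omega  <- crossprod(A)  + diag(d)
Omega0 <- crossprod(A0) + diag(p - d)

# Extended error covariance.
Delta <- Gamma %*% Omega %*% t(Gamma) + Gamma0 %*% Omega0 %*% t(Gamma0)

# solve(Delta) %*% Gamma equals Gamma %*% solve(Omega), so both span the same subspace.
lhs <- solve(Delta) %*% Gamma
rhs <- Gamma %*% solve(Omega)
stopifnot(isTRUE(all.equal(lhs, rhs)))
```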
Let $\mathcal{S}_{\Gamma}$ be the subspace spanned by the columns of $\Gamma$. The parameter space of $\mathcal{S}_{\Gamma}$ is the set of all $d$-dimensional subspaces in $R^p$, called the Grassmann manifold and denoted by $\mathcal{G}_{(d,p)}$.
Let $\hat{\Sigma}$ and $\hat{\Sigma}_{\mathrm{fit}}$ be the sample variance of $X$ and the fitted covariance matrix, respectively, and let $\hat{\Sigma}_{\mathrm{res}}=\hat{\Sigma} - \hat{\Sigma}_{\mathrm{fit}}$. The MLE of $\mathcal{S}_{\Gamma}$ under the unstr2 setup is obtained by maximizing the log-likelihood
$$L(\mathcal{S}_U) = - \log|U^T \hat{\Sigma}_{\mathrm{res}} U| - \log|V^T \hat{\Sigma}V|$$
over $\mathcal{G}_{(d,p)}$, where $V$ is an orthogonal completion of $U$.
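The objective above depends on $U$ only through the subspace it spans, which is why the optimization is carried out over the Grassmann manifold. The following base R sketch evaluates the displayed objective on simulated data and checks that rotating the basis $U$ leaves it unchanged. The function name loglik_unstr2, the simulated data, the cubic basis, and the divisor n in the covariance estimates are all assumptions made for illustration; this is not the package's internal code.

```r
set.seed(3)
n <- 200; p <- 5; d <- 2

# Simulated response, centered cubic polynomial basis, and predictors.
y  <- rnorm(n)
Fy <- scale(cbind(y, y^2, y^3), center = TRUE, scale = FALSE)        # n x r
X  <- matrix(rnorm(n * p), n, p) + Fy[, 1:2] %*% matrix(rnorm(2 * p), 2, p)
Xc <- scale(X, center = TRUE, scale = FALSE)

Sigma_hat <- crossprod(Xc) / n                    # sample covariance (divisor n assumed)
P_F       <- Fy %*% solve(crossprod(Fy), t(Fy))   # projection onto span(Fy)
Sigma_fit <- crossprod(Xc, P_F %*% Xc) / n        # fitted covariance
Sigma_res <- Sigma_hat - Sigma_fit                # residual covariance

# Hypothetical helper: evaluates the displayed objective at a p x d basis U.
loglik_unstr2 <- function(U, Sigma_hat, Sigma_res) {
  p <- nrow(U); d <- ncol(U)
  # Orthogonal completion V of U from the full QR decomposition.
  V  <- qr.Q(qr(U), complete = TRUE)[, (d + 1):p, drop = FALSE]
  ld <- function(M) as.numeric(determinant(M, logarithm = TRUE)$modulus)
  -ld(t(U) %*% Sigma_res %*% U) - ld(t(V) %*% Sigma_hat %*% V)
}

U     <- qr.Q(qr(matrix(rnorm(p * d), p, d)))     # orthonormal basis of a 2-dim subspace
theta <- 0.7
O     <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)

L1 <- loglik_unstr2(U, Sigma_hat, Sigma_res)
L2 <- loglik_unstr2(U %*% O, Sigma_hat, Sigma_res)   # same subspace, rotated basis
stopifnot(isTRUE(all.equal(L1, L2)))
```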
The dimension $d$ of the sufficient reduction must be estimated. A sequential likelihood ratio test is implemented, as well as the Akaike and Bayesian information criteria, following Cook and Forzani (2008).
See also: core, lad
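library(ldr)
data(bigmac)

# Cubic polynomial basis in the response
fy <- bf(y = bigmac[, 1], case = "poly", degree = 3)

# Anisotropic PFC fit with three directions
fit1 <- pfc(X = bigmac[, -1], y = bigmac[, 1], fy = fy, numdir = 3,
            structure = "aniso")
summary(fit1)
plot(fit1)

# Same model, fit for every dimension up to numdir
fit2 <- pfc(X = bigmac[, -1], y = bigmac[, 1], fy = fy, numdir = 3,
            structure = "aniso", numdir.test = TRUE)
summary(fit2)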