bf: Function to generate a basis function.

Description

This function constructs a data-matrix of basis functions from the n response observations. The response can be continuous or categorical. The function returns a matrix with n rows and r columns, where r depends on the choice of basis function. Polynomial, piecewise polynomial (continuous and discontinuous), Fourier, and categorical bases are implemented. For a polynomial basis, r is the degree of the polynomial.

Usage

bf(y, case = c("poly", "categ", "fourier", "pcont", "pdisc"),
degree = 1, nslices = 1, scale = FALSE)

Arguments

y

A response vector of n observations.

case

Takes the values "poly" for polynomial, "categ" for categorical, "fourier" for Fourier, "pcont" for piecewise continuous, and "pdisc" for piecewise discontinuous bases.

degree

For polynomial and piecewise polynomial bases, degree is the degree of the polynomial. With "pdisc", degree=0 corresponds to piecewise constant.

nslices

The number of slices for piecewise bases only. The range of the response is partitioned into nslices parts with roughly equal numbers of observations. See details on piecewise bases for more information.

scale

If TRUE, the columns of the basis function are scaled to have unit variance.

Value

fy: A matrix with n rows and r columns.
scale: Boolean. If TRUE, the columns of the output are standardized to have unit variance.
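
As a rough illustration of what centering with unit-variance scaling means for a basis matrix, the sketch below applies base R's scale() to a small hand-built quadratic basis. It does not call bf; the vector y and the matrix B are made-up values used only for this example.

```r
# Illustration only: a hand-built basis matrix, not output from bf().
y <- c(1, 2, 4, 7, 11)
B <- cbind(y, y^2)

# Center each column and divide it by its standard deviation.
B_scaled <- scale(B, center = TRUE, scale = TRUE)

# Column means are zero and column variances are one after scaling.
col_means <- colMeans(B_scaled)
col_vars <- apply(B_scaled, 2, var)
```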

Details

The basis function $f_y$ is a vector-valued function of the response $y \in \mathrm{R}$. There are infinitely many possible basis functions, including polynomial, piecewise polynomial, and Fourier bases. We implemented the following:

1. Polynomial basis: $f_y=(y, y^2, ..., y^r)^T$. It corresponds to case="poly". The degree $r$ of the polynomial is provided by the user through the argument degree. The resulting $n \times r$ data-matrix is column-wise centered.

2. Piecewise constant basis: It corresponds to "pdisc" with degree=0. It is obtained by first slicing the range of $y$ into $h$ slices $H_1,...,H_h$. The $k^{th}$ component of $f_y \in \mathrm{R}^{h-1}$ is $f_{y_k}=J(y \in H_k)-n_k/n$, $k=1, ..., h-1$, where $n_k$ is the number of observations in $H_k$ and $J$ is the indicator function. We suggest using between two and fifteen slices without exceeding $n/5$.

3. Piecewise discontinuous linear basis: It corresponds to "pdisc" with degree=1. It is more elaborate than the piecewise constant basis. A linear function of $y$ is fit within each slice. Let $\tau_i$ be the knots, or endpoints of the slices. The components of $f_y \in \mathrm{R}^{2h-1}$ are obtained with $f_{y_{(2i-1)}} = J(y \in H_i)$; $f_{y_{2i}} = J(y \in H_i)(y-\tau_{i-1})$ for $i=1,2,...,h-1$ and $f_{y_{(2h-1)}} = J(y \in H_{h})(y-\tau_{h-1})$. The resulting $n \times (2h-1)$ data-matrix is column-wise centered. We suggest using fewer than fifteen slices without exceeding $n/5$.

4. Piecewise continuous linear basis: It corresponds to "pcont" with degree=1. The general form of the components $f_{y_i}$ of $f_y \in \mathrm{R}^{h+1}$ is given by $f_{y_1} = J(y \in H_1)$ and $f_{y_{i+1}} = J(y \in H_{i})(y-\tau_{i-1})$ for $i=1,...,h$. The resulting $n \times (h+1)$ data-matrix is column-wise centered. The number of slices may not exceed $n/5$.

5. Fourier bases: They consist of a series of pairs of sines and cosines of increasing frequency. A Fourier basis is given by $f_y=(\cos(2\pi y), \sin(2\pi y),..., \cos(2\pi ky), \sin(2\pi ky))^T$. The resulting $n \times 2k$ data-matrix is column-wise centered.

6. Categorical basis: It is obtained using the "categ" option when $y$ takes $h$ distinct values $1, 2,..., h$, corresponding to the number of sub-populations or sub-groups. The number of slices is naturally $h$. The expression for the basis is identical to that of the piecewise constant basis.
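
The polynomial and Fourier formulas in items 1 and 5 can be sketched in base R as follows. This is a minimal illustration of the formulas, not the code used by bf; the helper names poly_basis and fourier_basis are hypothetical, and column-wise centering is done with scale(..., scale = FALSE).

```r
# Hypothetical helpers illustrating the formulas; not the package's own code.

# Polynomial basis: columns y, y^2, ..., y^r, each centered to mean zero.
poly_basis <- function(y, r) {
  B <- sapply(seq_len(r), function(j) y^j)
  B <- matrix(B, nrow = length(y), ncol = r)  # keep a matrix shape when r = 1
  scale(B, center = TRUE, scale = FALSE)
}

# Fourier basis: k pairs (cos(2*pi*j*y), sin(2*pi*j*y)) for j = 1, ..., k,
# each column centered to mean zero.
fourier_basis <- function(y, k) {
  B <- matrix(0, nrow = length(y), ncol = 2 * k)
  for (j in seq_len(k)) {
    B[, 2 * j - 1] <- cos(2 * pi * j * y)
    B[, 2 * j] <- sin(2 * pi * j * y)
  }
  scale(B, center = TRUE, scale = FALSE)
}

set.seed(1)
y <- runif(50)
Fp <- poly_basis(y, r = 3)     # n x r matrix with n = 50, r = 3
Ff <- fourier_basis(y, k = 2)  # n x 2k matrix with n = 50, k = 2
```

In both helpers the number of columns follows directly from the formulas: $r$ for the polynomial basis and $2k$ for the Fourier basis.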

In all cases, the basis must be constructed such that $F^TF$ is invertible, where $F$ is the $n \times r$ data-matrix whose $i$th row is $f_y^T$ evaluated at the $i$th observation.
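
A minimal sketch of the piecewise constant basis, together with a check of the invertibility requirement, is given below. It assumes that slices are formed from empirical quantiles so that they hold roughly equal numbers of observations; the slicing rule used internally by bf may differ, and the helper name pconst_basis is hypothetical.

```r
# Hypothetical helper; slices are built from empirical quantiles (an assumption).
pconst_basis <- function(y, h) {
  n <- length(y)
  breaks <- quantile(y, probs = seq(0, 1, length.out = h + 1))
  slice <- cut(y, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  # Column k is J(y in H_k) - n_k / n for k = 1, ..., h - 1.
  B <- sapply(seq_len(h - 1), function(k) (slice == k) - sum(slice == k) / n)
  matrix(B, nrow = n, ncol = h - 1)
}

set.seed(2)
y <- rnorm(60)
Fy <- pconst_basis(y, h = 5)  # n x (h - 1) matrix with n = 60, h = 5

# F^T F is invertible exactly when F has full column rank.
full_rank <- qr(Fy)$rank == ncol(Fy)
```

Leaving out the indicator of the last slice is what keeps the columns linearly independent after centering: with all $h$ centered indicators, the columns would sum to zero and $F^TF$ would be singular.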

References

Adragni, KP (2009) PhD Dissertation, University of Minnesota.

Adragni, KP and Cook, RD (2009): Sufficient dimension reduction and prediction in regression. Phil. Trans. R. Soc. A 367, 4385-4405.

Cook, RD (2007): Fisher Lecture - Dimension Reduction in Regression (with discussion). Statistical Science 22, 1-26.

Examples


# bf, pfc and the bigmac data are assumed to come from the ldr package
library(ldr)
data(bigmac)

# Piecewise constant basis with 5 slices
fy <- bf(y=bigmac[,1], case="pdisc", degree=0, nslices=5)
fit1 <- pfc(X=bigmac[,-1], y=bigmac[,1], fy=fy, numdir=3, structure="aniso")
summary(fit1)

# Cubic polynomial basis
fy <- bf(y=bigmac[,1], case="poly", degree=3)
fit2 <- pfc(X=bigmac[,-1], y=bigmac[,1], fy=fy, numdir=3, structure="aniso")
summary(fit2)

# Piecewise linear continuous basis with 3 slices
fy <- bf(y=bigmac[,1], case="pcont", degree=1, nslices=3)
fit3 <- pfc(X=bigmac[,-1], y=bigmac[,1], fy=fy, numdir=3, structure="unstr")
summary(fit3)
