predict.mfp2: Predict Method for `mfp2` Fits

Description

Obtains predictions from an mfp2 object.

Usage

# S3 method for mfp2
predict(
  object,
  newdata = NULL,
  type = NULL,
  terms = NULL,
  terms_seq = c("equidistant", "data"),
  alpha = 0.05,
  ref = NULL,
  strata = NULL,
  newoffset = NULL,
  ...
)

Value

For any type other than "terms" or "contrasts", the output conforms to the output of predict.glm() or predict.coxph().

If type = "terms" or type = "contrasts", the output is a named list with one entry for each variable requested in terms (excluding those not present in the final model). Each entry is a data.frame with the following columns:

variable: variable values on original scale.
variable_pre: variable with pre-transformation applied, i.e. shifted, scaled and centered as required.
value: partial linear predictor or contrast (depending on type).
se: standard error of partial linear predictor or contrast.
lower: lower limit of confidence interval.
upper: upper limit of confidence interval.

Arguments

object: a fitted object of class mfp2.
newdata: optionally, a matrix with column names in which to look for variables with which to predict. See mfp2() for details.
type: the type of prediction required. The default is on the scale of the linear predictors. See predict.glm() or predict.coxph() for details. If type = "terms", see the section on Terms prediction. If type = "contrasts", see the section on Contrasts.
terms: a character vector of variable names specifying for which variables term or contrast predictions are desired. Only used in case type = "terms" or type = "contrasts". If NULL (the default) then all selected variables in the final model will be used. In any case, only variables used in the final model are used, even if more variable names are passed.
terms_seq: a character string specifying how the range of variable values for term predictions is handled. The default "equidistant" takes the observed range of the data and generates an equidistant sequence of 100 points from the minimum to the maximum value, so that the functional form estimated in the final model is shown properly. The option "data" uses the observed data values directly, but these may not adequately reflect the functional form, especially when extreme values or influential points are present.
alpha: significance level used for computing confidence intervals in terms prediction.
ref: a named list of reference values used when type = "contrasts". For any variable requested in terms that has no entry in this list (or whose entry is NULL), the mean value (or the minimum for binary variables) is used as reference. With the default ref = NULL, this applies to all requested variables. Values are specified on the original scale of the variable, since the program internally scales them using the scaling factors obtained from find_scale_factor().
strata: stratum levels used for predictions.
newoffset: a vector of offsets used for predictions. This parameter is important when newdata is supplied. The offsets are added directly to the linear predictor without any transformation.
...: further arguments passed to predict.glm() or predict.coxph().
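The difference between the two terms_seq options can be illustrated in base R. The sketch below is not the package's internal code: the covariate is simulated, and sorting the observed values for the "data" option is an assumption made for readability.

```r
set.seed(1)
# Hypothetical skewed covariate with one extreme value
x <- c(rexp(50, rate = 1), 25)

# "equidistant": 100 evenly spaced points spanning the observed range
grid_equidistant <- seq(min(x), max(x), length.out = 100)

# "data": the observed values themselves (sorted here; an assumption)
grid_data <- sort(x)

# The equidistant grid covers the gap between the bulk of the data and the
# extreme value evenly, while the observed values leave it almost empty.
summary(diff(grid_equidistant))
summary(diff(grid_data))
```

With the observed values, almost all points fall near the lower end of the range, so a plotted curve is supported by very few points between the bulk of the data and the extreme value.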

Terms prediction

If type = "terms", this function computes the partial linear predictor for each variable selected into the final model. Note that the results differ from those of predict.glm() and predict.coxph(), since those functions do not take into account that a single variable can be represented by multiple terms. This functionality is useful for assessing model fit, since it also allows data points to be drawn based on residuals.
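A minimal sketch of a terms prediction, reusing the prostate data from the Examples section. Requesting cavol is an illustrative choice, and the code assumes nothing about whether cavol is selected: if it is not in the final model, the returned list simply has no entry for it.

```r
library(mfp2)

data("prostate")
x <- as.matrix(prostate[, 2:8])
y <- as.numeric(prostate$lpsa)
fit <- mfp2(x, y, verbose = FALSE)

# Partial linear predictor for cavol on an equidistant grid (the default)
term_pred <- predict(fit, type = "terms", terms = "cavol")

# The entry exists only if cavol was selected into the final model
if ("cavol" %in% names(term_pred)) {
  head(term_pred$cavol)
}
```

Each entry can be plotted directly, for example value against variable with lower and upper as a confidence band.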

Contrasts

If type = "contrasts", this function computes contrasts with reference to a specified variable value. In this case, the fitted partial predictors are centered at the reference value (i.e. they equal 0 there), and the confidence intervals have width 0 at that point.
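Why the contrast and its standard error are both 0 at the reference value can be seen in a small base R sketch. It does not use mfp2 itself: the data are simulated, the model is fitted with lm(), and the two terms log(x) and sqrt(x) stand in for a hypothetical FP2 function of a single variable.

```r
set.seed(42)
# Simulated data (hypothetical): outcome depends on log(x) and sqrt(x)
x <- runif(200, min = 1, max = 20)
y <- 1 + 0.8 * log(x) - 0.3 * sqrt(x) + rnorm(200, sd = 0.3)

fit <- lm(y ~ log(x) + sqrt(x))
beta <- coef(fit)[-1]        # coefficients of the two terms, intercept dropped
V <- vcov(fit)[-1, -1]       # their covariance matrix

ref <- 10
grid <- c(seq(1, 20, length.out = 5), ref)

# Difference of each term between the grid values and the reference value
D <- cbind(log(grid) - log(ref), sqrt(grid) - sqrt(ref))

contrast <- drop(D %*% beta)
se <- sqrt(rowSums((D %*% V) * D))   # sqrt of diag(D V D')

data.frame(variable = grid, value = contrast, se = se)
```

At the reference value the row of differences is exactly zero, so both the contrast and its standard error vanish. With a fitted mfp2 object, the corresponding request would take a form such as predict(fit, type = "contrasts", terms = "cavol", ref = list(cavol = 10)), where the variable and reference value are illustrative.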

Details

To prepare newdata for prediction, this function applies any necessary shifting and scaling based on the factors obtained from the training data. Note that if the shifting factors estimated from the training data are not sufficiently large, variables in newdata may end up with negative values, which can cause prediction errors if non-linear functional forms are used. The function gives a warning in this case. The next step transforms the data using the selected fractional polynomial (FP) powers. If necessary, the variables are then centered. Once the transformation (and centering) is complete, and provided type is neither "terms" nor "contrasts", the transformed data is passed to either predict.glm() or predict.coxph(), depending on the chosen family of models.
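The preparation steps can be sketched in base R for a single variable and a single FP power. The helper fp_transform() is hypothetical and not part of mfp2, the shift and scale values stand in for factors estimated from training data, and centering is omitted for brevity. The only convention taken as given is the usual FP one that power 0 denotes the natural logarithm.

```r
# Hypothetical helper: shift, scale, then apply one FP power
fp_transform <- function(x, shift, scale, power) {
  x_pre <- (x + shift) / scale
  if (any(x_pre <= 0)) {
    warning("non-positive values after shifting; non-linear FP terms may fail")
  }
  # FP convention: power 0 means log(x)
  if (power == 0) log(x_pre) else x_pre^power
}

# Stand-ins for the factors estimated from the training data
shift <- 0
scale <- 10

# New data must be transformed with the SAME shift and scale
x_new <- c(1, 5, 20)
fp_transform(x_new, shift, scale, power = 0)     # log term
fp_transform(x_new, shift, scale, power = -0.5)  # inverse square root term
```

A value in the new data below the range seen in training, such as -5 with shift 0, triggers the warning, which mirrors the situation described above.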

Examples

# Gaussian model
library(mfp2)
data("prostate")
x <- as.matrix(prostate[, 2:8])
y <- as.numeric(prostate$lpsa)
# default interface
fit1 <- mfp2(x, y, verbose = FALSE)
predict(fit1) # make predictions (default: scale of the linear predictor)
