survPen (version 2.0.1)

predSNS: Prediction of grouped indicators : population (net) survival (PNS) and age-standardized (net) survival (SNS)

Description

Predicts population and age-standardized (net) survival, along with the associated confidence intervals.

Usage

predSNS(
  model,
  time.points,
  newdata,
  weight.table,
  var.name,
  var.model,
  conf.int = 0.95,
  method = "exact",
  n.legendre = 50
)

Value

List of ten elements

class.table

Number of individuals in each age class

SNS

Vector of predicted age-standardized (net) survival

SNS.inf

Lower bound of confidence intervals associated with predicted age-standardized (net) survival

SNS.sup

Upper bound of confidence intervals associated with predicted age-standardized (net) survival

PNS

Vector of predicted population (net) survival

PNS.inf

Lower bound of confidence intervals associated with predicted population (net) survival

PNS.sup

Upper bound of confidence intervals associated with predicted population (net) survival

PNS_per_class

Matrix of predicted population (net) survival in each age class

PNS_per_class.inf

Lower bound of confidence intervals associated with predicted population (net) survival in each age class

PNS_per_class.sup

Upper bound of confidence intervals associated with predicted population (net) survival in each age class

Arguments

model

a fitted survPen model

time.points

vector of follow-up values

newdata

dataset containing the original age values used for fitting

weight.table

dataset containing the age classes used for standardization; it must be in the same format as the elements of the list list.wicss

var.name

list containing one element: the name of the column in newdata that holds the age values. This element must be named after the age variable present in the model formula. For example, if newdata contains an 'age' column while the model uses a centered age 'agec', the list should be list(agec="age")

var.model

list containing one element: the function that computes the age variable used in the model formula from the original age. For example, for age centered on 50: list(agec=function(age) age - 50)

conf.int

numeric value giving the confidence level of the confidence intervals; default is 0.95

method

should be either 'exact' or 'approx'. The 'exact' method uses all age values in newdata for predictions. The 'approx' method uses either newdata$age (if age values are whole numbers) or floor(newdata$age) + 0.5 (if age values are not whole numbers) and then removes duplicates to reduce computational cost.

n.legendre

number of nodes to approximate the cumulative hazard by Gauss-Legendre quadrature; default is 50
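As an illustration of what n.legendre controls, the base R sketch below approximates a cumulative hazard by Gauss-Legendre quadrature. It is not survPen's internal code: the helper gauss_legendre() (nodes computed with the Golub-Welsch eigenvalue method) and the Gompertz-type hazard haz_toy() are assumptions made for this example, chosen because the cumulative hazard then has a closed form to compare against.

```r
# Gauss-Legendre nodes and weights on [-1, 1] (Golub-Welsch method).
# Hypothetical helper written for this sketch; not part of survPen.
gauss_legendre <- function(n) {
  k <- seq_len(n - 1)
  b <- k / sqrt(4 * k^2 - 1)          # off-diagonal of the Jacobi matrix
  J <- matrix(0, n, n)
  J[cbind(k, k + 1)] <- b
  J[cbind(k + 1, k)] <- b
  e <- eigen(J, symmetric = TRUE)
  list(nodes = e$values, weights = 2 * e$vectors[1, ]^2)
}

# Toy hazard (assumption): h(u) = 0.1 * exp(0.2 * u)
haz_toy <- function(u) 0.1 * exp(0.2 * u)

# Cumulative hazard H(t) = integral of h over [0, t], after mapping
# the nodes from [-1, 1] to [0, t]
cum_hazard <- function(t, n.legendre = 50) {
  gl <- gauss_legendre(n.legendre)
  u <- t / 2 * (gl$nodes + 1)
  t / 2 * sum(gl$weights * haz_toy(u))
}

H_quad  <- cum_hazard(5)              # quadrature approximation
H_exact <- 0.5 * (exp(0.2 * 5) - 1)   # closed form for the toy hazard
c(quadrature = H_quad, exact = H_exact)
```

For a smooth hazard such as this one, far fewer than 50 nodes are enough; more nodes mainly matter when the hazard varies sharply over the follow-up interval.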

Population Net Survival (PNS)

For a given group of individuals, PNS at time t is defined as $$PNS(t)=\frac{1}{n}\sum_i S_i(t,a_i)$$ where \(n\) is the number of individuals in the group and \(a_i\) is the age of individual \(i\).
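The definition amounts to a plain average of individual survival probabilities. A minimal base R sketch, assuming a made-up survival function surv_toy() and made-up ages (neither comes from survPen or datCancer):

```r
# Toy net survival function (assumption): exponential survival whose
# hazard increases with age. It stands in for a fitted model's predictions.
surv_toy <- function(t, age) exp(-0.02 * exp(0.03 * (age - 50)) * t)

ages <- c(42, 57, 63, 71, 80)   # made-up ages a_i
t    <- 5                       # follow-up time

# PNS(t) = (1/n) * sum_i S_i(t, a_i)
pns <- mean(surv_toy(t, ages))
pns
```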

Standardized Net Survival (SNS)

SNS at time t is defined as $$SNS(t)=\sum_i w_i*S_i(t,a_i)$$ where \(a_i\) is the age of individual \(i\) and \(w_i=w_{ref j(i)}/n_{j(i)}\). Here \(j(i)\) is the age class of individual \(i\), \(w_{ref j(i)}\) is the weight of age class \(j(i)\) in the reference population (it corresponds to weight.table$AgeWeights), and \(n_{j(i)}\) is the total number of individuals in age class \(j(i)\).
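The individual weights can be built directly from this definition. The sketch below uses base R only; the survival function surv_toy(), the three age classes in breaks and the reference weights in w_ref are illustrative assumptions, not the contents of list.wicss.

```r
# Toy net survival function (assumption), standing in for model predictions
surv_toy <- function(t, age) exp(-0.02 * exp(0.03 * (age - 50)) * t)

ages <- c(42, 48, 57, 63, 66, 71, 80)   # made-up ages a_i

# Hypothetical age classes [15,55), [55,70), [70,Inf) and reference weights
breaks <- c(15, 55, 70, Inf)
w_ref  <- c(0.3, 0.4, 0.3)

j   <- cut(ages, breaks = breaks, right = FALSE, labels = FALSE)  # j(i)
n_j <- tabulate(j, nbins = length(w_ref))                         # class sizes

# w_i = w_ref[j(i)] / n_j(i); the weights sum to 1 when no class is empty
w_i <- w_ref[j] / n_j[j]

sns <- sum(w_i * surv_toy(5, ages))
sns
```

Equivalently, SNS is the reference-weighted sum of the mean survival within each age class.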

Standardized Net Survival (SNS) with method="approx"

For large datasets, computing SNS is expensive because survival must be predicted for every individual. To reduce the computational cost, individuals with similar age values are grouped: using floor(age) + 0.5 instead of age reduces the number of predictions substantially while introducing only a small prediction error (method="approx" gives slightly different predictions from method="exact"). If the provided age values are whole numbers, they are used directly for grouping and there is no prediction error (method="approx" and method="exact" give exactly the same predictions). $$SNS(t)=\sum_a \tilde{w}_a*S(t,a)$$ The sum is taken over all distinct values of age instead of over all individuals, with \(\tilde{w}_a=n_a*w_{ref j(a)}/n_{j(a)}\), where \(j(a)\) is the age class of age \(a\) and \(n_a\) is the number of individuals with age \(a\).
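The grouping idea can be sketched in base R as follows. This illustrates the formulas in this section rather than survPen's implementation; surv_toy(), breaks and w_ref are assumptions made for the example.

```r
# Toy net survival function, age classes and reference weights (assumptions)
surv_toy <- function(t, age) exp(-0.02 * exp(0.03 * (age - 50)) * t)
breaks   <- c(15, 55, 70, Inf)
w_ref    <- c(0.3, 0.4, 0.3)

# Individual weights w_i = w_ref[j(i)] / n_j(i)
class_weights <- function(ages) {
  j   <- cut(ages, breaks = breaks, right = FALSE, labels = FALSE)
  n_j <- tabulate(j, nbins = length(w_ref))
  w_ref[j] / n_j[j]
}

# Exact: one survival prediction per individual
sns_exact <- function(t, ages) sum(class_weights(ages) * surv_toy(t, ages))

# Approx: one survival prediction per distinct grouped age
sns_approx <- function(t, ages) {
  grp <- if (all(ages == floor(ages))) ages else floor(ages) + 0.5
  # summing w_i within a grouped age gives n_a * w_ref[j(a)] / n_j(a)
  w_tilde <- tapply(class_weights(grp), grp, sum)
  a <- as.numeric(names(w_tilde))
  sum(w_tilde * surv_toy(t, a))
}

ages <- c(42.3, 42.8, 48.1, 57.2, 57.6, 63.9, 66.0, 71.4, 80.7)
c(exact = sns_exact(5, ages), approx = sns_approx(5, ages))
```

With these nine ages the approximate version evaluates survival at seven distinct ages instead of nine; the saving grows with the number of individuals who share the same grouped age.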

Variance and Confidence Intervals

Confidence intervals for SNS are derived assuming normality of log(-log(SNS)). Lower and upper bounds are given by $$IC_{95\%}(SNS)=[SNS^{\exp(1.96*\sqrt{Var(Log(Delta_{SNS}))})};SNS^{\exp(-1.96*\sqrt{Var(Log(Delta_{SNS}))})}]$$ with $$Delta_{SNS}=-log(SNS)$$ \(Var(Log(Delta_{SNS}))\) is derived by the Delta method.

Confidence intervals for PNS are derived in the exact same way.
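To make the back-transformation concrete: a symmetric interval log(Delta) ± z*se on the log(-log) scale maps to Delta*exp(± z*se) for Delta, and then to survival raised to the power exp(± z*se). The base R sketch below applies this to made-up inputs; the helper ci_loglog() and the values 0.62 and 0.004 are assumptions for illustration, and the variance would in practice come from the Delta method.

```r
# Confidence bounds for a survival probability s, assuming log(-log(s))
# is normally distributed with variance var_log_delta.
# Hypothetical helper written for this sketch; not part of survPen.
ci_loglog <- function(s, var_log_delta, conf.int = 0.95) {
  z  <- qnorm(1 - (1 - conf.int) / 2)   # about 1.96 for conf.int = 0.95
  se <- sqrt(var_log_delta)
  list(lower = s^exp(z * se), upper = s^exp(-z * se))
}

ci <- ci_loglog(s = 0.62, var_log_delta = 0.004)
ci
```

Because the interval is built on the log(-log) scale, both bounds always stay between 0 and 1.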

Details

The weight table should always be in the same format as the elements of list.wicss. Only age-standardization is possible for now. All other variables necessary for model predictions should be fixed to a single value. For simplicity, the definitions on this page assume that survival depends only on time and age.

References

Corazziari, I., Quinn, M., & Capocaccia, R. (2004). Standard cancer patient population for age standardising survival ratios. European journal of cancer (Oxford, England : 1990), 40(15), 2307–2316. https://doi.org/10.1016/j.ejca.2004.07.002.

Examples

library(survPen)

data(datCancer)
data(list.wicss)

don <- datCancer
don$agec <- don$age - 50 # using centered age for modelling

#-------------------- model with time and age

knots.t <- quantile(don$fu[don$dead==1], probs=seq(0,1,length=6))      # knots for time
knots.agec <- quantile(don$agec[don$dead==1], probs=seq(0,1,length=5)) # knots for age

formula <- as.formula(~tensor(fu, agec, df=c(length(knots.t), length(knots.agec)),
                              knots=list(fu=knots.t, agec=knots.agec)))

mod <- survPen(formula, data=don, t1=fu, event=dead, n.legendre=20, expected=rate)


#-------------------- age classes and associated weights for age-standardized
# net survival prediction

# weights of type 1
wicss <- list.wicss[["1"]]

# predicting population net survival requires a prediction data frame
# that contains the original (uncentered) age values

pred.pop <- data.frame(age=don$age)

#-------------------- prediction : age-standardized net survival and population net survival

pred <- predSNS(mod, time.points=seq(0,5,by=0.1), newdata=pred.pop,
                weight.table=wicss, var.name=list(agec="age"),
                var.model=list(agec=function(age) age - 50), method="approx")


