haplo.surv.discrete: Discrete time-to-event haplotype analysis

Description

Can also be used for ordinary logistic regression when the time variable is 1 for all ids.

Usage

haplo.surv.discrete(
  X = NULL,
  y = "y",
  time.name = "time",
  Haplos = NULL,
  id = "id",
  desnames = NULL,
  designfunc = NULL,
  beta = NULL,
  no.opt = FALSE,
  method = "NR",
  stderr = TRUE,
  designMatrix = NULL,
  response = NULL,
  idhap = NULL,
  design.only = FALSE,
  covnames = NULL,
  fam = binomial,
  weights = NULL,
  offsets = NULL,
  idhapweights = NULL,
  ...
)

Arguments

X: design matrix as a data.frame (sorted by id and time) containing the id, time, response and desnames columns
y: name of the response in X (binary response with logistic link)
time.name: name of the time variable in X, used for sorting
Haplos: data.frame with id, haplo1, haplo2 (the haplotypes h) and p = P(h|G); haplotypes are given as factors
id: name of the id variable in X
desnames: names of the design matrix columns
designfunc: function that computes the design x(h) given the haplotypes h=(h1,h2)
beta: starting values
no.opt: if TRUE, no optimization is performed
method: optimizer to use, "NR" or "nlm"
stderr: if FALSE, return only the estimate (no standard errors)
designMatrix: supplies the response and design matrix directly (not implemented); must contain p, id and idhap
response: supplies the response directly, together with designMatrix (not implemented)
idhap: name of id-hap variable to specify different haplotypes for different id
design.only: if TRUE, return only the design matrices for the haplotype analysis
covnames: names of covariates to extract from object for regression
fam: model family; binomial is currently the default and the only option
weights: weights following id for GLM
offsets: following id for GLM
idhapweights: weights following id-hap for GLM (WIP)
...: additional arguments passed to the lower-level optimizers, lava::NR or nlm

Author

Thomas Scheike

Details

Cycle-specific logistic regression of haplotype effects with known haplotype probabilities. Given the observed genotype G and the unobserved haplotypes H, we mix out over the possible haplotypes, using that P(H|G) is provided.

$$ S(t|x,G) = E( S(t|x,H) | G) = \sum_{h \in G} P(h|G) S(t|x,h) $$ so survival can be computed by mixing out over the possible h given G.
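
As a minimal numerical sketch of this mixing step, the base R snippet below averages haplotype-specific survival curves with weights P(h|G). The probabilities and survival values are hypothetical toy numbers, and the names p_h, S_h and S_G are illustrative only; none of them come from the package.

```r
## Toy illustration (hypothetical numbers, not package data):
## two haplotype pairs h are compatible with one observed genotype G.
p_h <- c(0.7, 0.3)                  # P(h|G), sums to 1
S_h <- rbind(c(0.90, 0.80, 0.65),   # S(t|x,h) for pair 1 at t = 1, 2, 3
             c(0.85, 0.60, 0.40))   # S(t|x,h) for pair 2 at t = 1, 2, 3

## S(t|x,G) = sum over h of P(h|G) * S(t|x,h);
## p_h is recycled down the rows, so row i is weighted by p_h[i]
S_G <- colSums(p_h * S_h)
S_G
```

Each entry of S_G is a weighted average of the corresponding column of S_h, so it always lies between the haplotype-specific survival values.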

Survival is based on a logistic regression for the discrete hazard function of the form $$ logit(P(T=t| T \geq t, x,h)) = \alpha_t + x(h) \beta $$ where x(h) is a regression design of x and the haplotypes $h=(h_1,h_2)$.
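
To make the link between the discrete hazard and survival concrete, the sketch below turns a linear predictor into cycle-specific hazards and then into a survival curve. It assumes the convention S(t) = P(T > t), so survival is the cumulative product of one minus the hazard. All parameter values and the helper expit are hypothetical and not taken from the package.

```r
## Inverse logit
expit <- function(eta) 1 / (1 + exp(-eta))

alpha <- c(-2.0, -1.5, -1.0)  # hypothetical cycle-specific intercepts alpha_t
xh    <- c(1, 0)              # hypothetical design x(h) for one haplotype pair
beta  <- c(0.5, -0.3)         # hypothetical regression coefficients

## discrete hazard P(T = t | T >= t, x, h) for t = 1, 2, 3
hazard <- expit(alpha + sum(xh * beta))

## S(t|x,h) = prod over s <= t of (1 - hazard_s), assuming S(t) = P(T > t)
surv <- cumprod(1 - hazard)
surv
```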

The likelihood is maximized, and the standard errors assume that P(H|G) is known.
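
One way to write the likelihood implied by this model is sketched here. It is an assumption based on the mixture formulation in this section, not a transcription of the package source. The symbols $y_{it}$ (the binary response of subject $i$ in cycle $t$), $G_i$, $x_i$ and $\lambda_t$ are notation introduced for illustration. With $logit(\lambda_t(x,h)) = \alpha_t + x(h) \beta$, the contribution of subject $i$ would be

$$ L_i(\alpha,\beta) = \sum_{h \in G_i} P(h|G_i) \prod_{t} \lambda_t(x_i,h)^{y_{it}} (1-\lambda_t(x_i,h))^{1-y_{it}} $$

where the product runs over the cycles in which subject $i$ is at risk. Because $P(h|G_i)$ enters as a fixed weight, uncertainty from estimating the haplotype probabilities is not reflected in the standard errors.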

The design over the possible haplotypes is constructed by merging X with Haplos and can be inspected by setting design.only=TRUE.
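
The merging idea can be illustrated with base R's merge() on toy data. This is only a sketch of the concept, and the function's internal construction may differ. The data below are hypothetical; only the column names id, time, y, haplo1, haplo2 and p follow the argument descriptions above.

```r
## Hypothetical toy data: subject 1 has two cycles and two possible
## haplotype pairs, subject 2 has one cycle and one possible pair.
X <- data.frame(id = c(1, 1, 2), time = c(1, 2, 1), y = c(0, 1, 0))
Haplos <- data.frame(id = c(1, 1, 2),
                     haplo1 = factor(c("A", "A", "B")),
                     haplo2 = factor(c("B", "C", "B")),
                     p = c(0.6, 0.4, 1))

## one row per (id, time, possible haplotype pair)
des <- merge(X, Haplos, by = "id")
des <- des[order(des$id, des$time), ]
des
```

Within each id and time, the weights p sum to 1, so every cycle of a subject is expanded into a weighted set of rows, one for each haplotype pair compatible with the genotype.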

Examples

library(mets)  # provides haplo.surv.discrete, dkeep and the example data

## some haplotypes of interest
types <- c("DCGCGCTCACG","DTCCGCTGACG","ITCAGTTGACG","ITCCGCTGAGG")

## some haplotypes frequencies for simulations 
data(hapfreqs)

www <-which(hapfreqs$haplotype %in% types)
hapfreqs$freq[www]

baseline=hapfreqs$haplotype[9]
baseline

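## design for the haplotype effects: with sm=0 an indicator for carrying each
## type on at least one haplotype, with sm=1 the number of copies (0, 1 or 2)
designftypes <- function(x,sm=0) {
  hap1=x[1]
  hap2=x[2]
  if (sm==0) y <- 1*( (hap1==types) | (hap2==types))
  if (sm==1) y <- 1*(hap1==types) + 1*(hap2==types)
  return(y)
}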

tcoef=c(-1.93110204,-0.47531630,-0.04118204,-1.57872602,-0.22176426,-0.13836416,
0.88830288,0.60756224,0.39802821,0.32706859)

data(hHaplos)
data(haploX)

haploX$time <- haploX$times
Xdes <- model.matrix(~factor(time),haploX)
colnames(Xdes) <- paste("X",1:ncol(Xdes),sep="")
X <- dkeep(haploX,~id+y+time)
X <- cbind(X,Xdes)
Haplos <- dkeep(ghaplos,~id+"haplo*"+p)
desnames=paste("X",1:6,sep="")   # six X's related to 6 cycles 
out <- haplo.surv.discrete(X=X,y="y",time.name="time",
         Haplos=Haplos,desnames=desnames,designfunc=designftypes) 
names(out$coef) <- c(desnames,types)
out$coef
summary(out)
