Internal function that sets up the subsampling distribution used to run the stochastic version of a stagewise approach. The subsampling is conducted at the cluster level, not the individual observation level. Sampling probabilities are first calculated or provided for each observation individually, and the sampling probability for each cluster is then taken to be the average probability across all observations in that cluster.
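The cluster-level averaging described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's internal code: the vectors obsProb and clusterID hold made-up values, and the final rescaling so that the cluster probabilities sum to one is an assumption that the description does not confirm.

```r
# Hypothetical per-observation probabilities and cluster labels
obsProb   <- c(0.2, 0.4, 0.1, 0.3, 0.5, 0.5)
clusterID <- c(1, 1, 2, 2, 2, 3)

# Average the observation-level probabilities within each cluster.
# tapply() returns one value per cluster, ordered by sorted cluster label.
clusterProb <- tapply(obsProb, clusterID, mean)

# Assumption: rescale so the cluster-level probabilities sum to one
clusterDist <- as.numeric(clusterProb / sum(clusterProb))
```

The result has one entry per cluster, which matches the length of the returned vector described in the value section.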
samplingDistCalculation(sampleProb, y, x, clusterID, waves, beta, beta0, phi,
alpha, offset, meanLinkInv, varianceLink, corstr, mu.eta)
sampleProb: A user-provided value for the sampling probability associated with each observation. sampleProb can be provided as 1) a vector of fixed values of length equal to the response vector y, 2) a function that takes in a list of values (full list of values given in details) and returns a vector of length equal to the response vector y, or 3) the default value of NULL, which results in a uniform distribution.
y: The vector of response values provided to the original stagewise function.
x: The covariate matrix provided to the original stagewise function.
clusterID: The vector of cluster ID numbers provided to the original stagewise function.
waves: The waves parameter provided to the original stagewise function, identifying the order of observations within the clusters.
beta: The vector of current estimates of the coefficients.
beta0: The current estimate of the intercept.
phi: The current estimate of the scale parameter.
alpha: The current estimate of the parameter affecting the within-cluster correlation.
offset: The offset in the linear predictor provided to the original stagewise function.
meanLinkInv: The inverse link function from the family object provided to the original stagewise function, which indicates the assumed family of mean and variance structure.
varianceLink: The variance link function from the family object provided to the original stagewise function, which indicates the assumed family of mean and variance structure.
corstr: The structure of the working correlation matrix provided to the original stagewise function.
mu.eta: The derivative function of mu, the conditional mean of the response, with respect to eta, the linear predictor, from the family object provided to the original stagewise function, which indicates the assumed family of mean and variance structure.
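The three accepted forms of sampleProb (fixed vector, function, or NULL) can be illustrated with a small helper. The helper resolveSampleProb and the single list element y passed to the user function are hypothetical; the actual list of values handed to the function is the one given in the details, which is not reproduced here. Treat this as a sketch of the dispatch logic only.

```r
# Hypothetical helper, not part of the package: turns any of the three
# accepted forms of sampleProb into one weight per observation.
resolveSampleProb <- function(sampleProb, y, values = list()) {
  if (is.null(sampleProb)) {
    # Default: equal weight for every observation (uniform)
    prob <- rep(1, length(y))
  } else if (is.function(sampleProb)) {
    # User function receives a list of values (contents assumed here)
    prob <- sampleProb(values)
  } else {
    # Fixed vector supplied directly by the user
    prob <- sampleProb
  }
  if (!is.numeric(prob) || length(prob) != length(y)) {
    stop("sampleProb must yield a numeric vector as long as y")
  }
  prob
}

y <- c(1.2, -0.7, 3.1, 2.4)

pNull  <- resolveSampleProb(NULL, y)
pFixed <- resolveSampleProb(c(0.1, 0.2, 0.3, 0.4), y)
pFun   <- resolveSampleProb(function(v) abs(v$y), y, values = list(y = y))
```

The length check mirrors the requirement that both the vector form and the function form produce one value per element of the response vector y.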
The sampling distribution probabilities to be used for the subsampling. The distribution is provided as a vector with length equal to the number of clusters.
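As a rough illustration of how a cluster-level distribution of this shape might be consumed, the sketch below draws a subsample of clusters with base R's sample() and keeps every observation belonging to a drawn cluster. The cluster labels, the probabilities, the subsample size, and the choice to sample without replacement are all assumptions made for the example, not details taken from the package.

```r
set.seed(1)

# Hypothetical cluster labels and a cluster-level sampling distribution
clusters    <- c("a", "b", "c", "d")
clusterDist <- c(0.1, 0.2, 0.3, 0.4)

# Hypothetical cluster membership of each observation
obsCluster <- c("a", "a", "b", "c", "c", "c", "d", "d")

# Draw two clusters, weighting each by its sampling probability
# (assumption: sampling without replacement)
sampled <- sample(clusters, size = 2, replace = FALSE, prob = clusterDist)

# Keep all observations that belong to a sampled cluster
keep <- obsCluster %in% sampled
```

Because sampling happens at the cluster level, observations from the same cluster are always kept or dropped together.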