This function generates the full list of parameters required for the Generalized Mode Jumping Markov Chain Monte Carlo (GMJMCMC) algorithm, building upon the parameters from gen.params.mjmcmc. The generated parameter list includes feature generation settings, population control parameters, and optimization controls for the search process.
gen.params.gmjmcmc(data)
data
A data frame containing the dataset with covariates and response variable.
A list of parameters for controlling GMJMCMC behavior:
feat$D
Maximum feature depth, default 5. Limits the number of recursive feature transformations. For fractional polynomials, it is recommended to set D = 1.
feat$L
Maximum number of features per model, default 15. Increase for complex models.
feat$alpha
Strategy for generating $alpha$ parameters in non-linear projections:
"unit"
(Default) Sets all components to 1.
"deep"
Optimizes $alpha$ across all feature layers.
"random"
Samples $alpha$ from the prior for a fully Bayesian approach.
feat$pop.max
Maximum feature population size per iteration. Defaults to min(100, as.integer(1.5 * p)), where p is the number of covariates.
feat$keep.org
Logical flag; if TRUE, original covariates remain in every population (default FALSE).
feat$prel.filter
Threshold for pre-filtering covariates before the first population generation. The default, 0, disables filtering.
feat$prel.select
Indices of covariates to include initially. The default, NULL, includes all covariates.
feat$keep.min
Minimum proportion of features to retain during population updates. Default 0.8.
feat$eps
Threshold for feature inclusion probability during generation. Default 0.05.
feat$check.col
Logical; if TRUE (default), checks for collinearity during feature generation.
feat$max.proj.size
Maximum number of existing features used to construct a new one. Default 15.
rescale.large
Logical flag for rescaling large data values for numerical stability. Default FALSE.
burn_in
The burn-in period for the MJMCMC algorithm, which is set to 100 iterations by default.
mh
A list containing parameters for the regular Metropolis-Hastings (MH) kernel:
neigh.size
The size of the neighborhood for MH proposals with fixed proposal size, default set to 1.
neigh.min
The minimum neighborhood size for random proposal size, default set to 1.
neigh.max
The maximum neighborhood size for random proposal size, default set to 2.
large
A list containing parameters for the large jump kernel:
neigh.size
The size of the neighborhood for large jump proposals with fixed neighborhood size, default set to the smaller of 0.35 * p and 35, where p is the number of covariates.
neigh.min
The minimum neighborhood size for large jumps with random size of the neighborhood, default set to the smaller of 0.25 * p and 25.
neigh.max
The maximum neighborhood size for large jumps with random size of the neighborhood, default set to the smaller of 0.45 * p and 45.
random
A list containing a parameter for the randomization kernel:
prob
The small probability of changing the component around the mode, default set to 0.01.
sa
A list containing parameters for the simulated annealing kernel:
probs
A numeric vector of length 6 specifying the probabilities for different types of proposals in the simulated annealing algorithm.
neigh.size
The size of the neighborhood for the simulated annealing proposals, default set to 1.
neigh.min
The minimum neighborhood size, default set to 1.
neigh.max
The maximum neighborhood size, default set to 2.
t.init
The initial temperature for simulated annealing, default set to 10.
t.min
The minimum temperature for simulated annealing, default set to 0.0001.
dt
The temperature decrement factor, default set to 3.
M
The number of iterations in the simulated annealing process, default set to 12.
greedy
A list containing parameters for the greedy algorithm:
probs
A numeric vector of length 6 specifying the probabilities for different types of proposals in the greedy algorithm.
neigh.size
The size of the neighborhood for greedy algorithm proposals, set to 1.
neigh.min
The minimum neighborhood size for greedy proposals, set to 1.
neigh.max
The maximum neighborhood size for greedy proposals, set to 2.
steps
The number of steps for the greedy algorithm, set to 20.
tries
The number of tries for the greedy algorithm, set to 3.
loglik
A list to store log-likelihood values, which is by default empty.
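Several of the defaults listed above depend on the number of covariates p. The following base R sketch only evaluates the formulas as stated in this list, so it runs without the package. The helper name documented_defaults is hypothetical and not part of the package, and the real function may additionally round the large-jump sizes to integers, which the text does not specify.

```r
# Hypothetical helper (not part of the package): evaluates the default
# formulas stated in the parameter list for a given number of covariates p.
documented_defaults <- function(p) {
  list(
    # feat$pop.max: min(100, as.integer(1.5 * p))
    pop.max = min(100, as.integer(1.5 * p)),
    # large jump kernel: "the smaller of" a fraction of p and a fixed cap
    # (any integer rounding done by the package is not reproduced here)
    large = list(
      neigh.size = min(0.35 * p, 35),
      neigh.min  = min(0.25 * p, 25),
      neigh.max  = min(0.45 * p, 45)
    )
  )
}

# Few covariates: the values scale with p
str(documented_defaults(10))

# Many covariates: the caps (100, 35, 25, 45) take over
str(documented_defaults(200))
```

For small p the proportional terms apply, while for large p the fixed caps bound both the population size and the large-jump neighborhoods.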
gen.params.mjmcmc, gmjmcmc
data <- data.frame(y = rnorm(100), x1 = rnorm(100), x2 = rnorm(100))
params <- gen.params.gmjmcmc(data)
str(params)
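Since the return value is a plain nested list, individual settings can be overridden before the list is passed on. The sketch below is one possible way to do this with modifyList from base R. To stay runnable without the package, it uses a stand-in list that copies only a few of the documented fields and defaults. With the package loaded, the stand-in would presumably be replaced by the result of gen.params.gmjmcmc(data).

```r
# Stand-in for the list returned by gen.params.gmjmcmc(data). Only a few of
# the documented fields are reproduced, with their documented defaults.
params <- list(
  feat = list(D = 5, L = 15, alpha = "unit", keep.org = FALSE),
  burn_in = 100,
  sa = list(t.init = 10, t.min = 0.0001, dt = 3, M = 12)
)

# Overrides: D = 1 as recommended for fractional polynomials, keep the
# original covariates in every population, and use a longer burn-in.
overrides <- list(
  feat = list(D = 1, keep.org = TRUE),
  burn_in = 200
)

# modifyList() merges nested lists recursively, so fields that are not
# overridden (e.g. feat$L and the whole sa list) keep their values.
params <- modifyList(params, overrides)
str(params)
```

Direct assignment such as params$feat$D <- 1 has the same effect for a single field; modifyList is convenient when several settings change at once.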