This function bundles parameters controlling mainly the starting, convergence, boundary, and stopping behaviour of the local principal curve. It is intended to be used only as the control argument of the lpc() function.
lpc.control(iter=100, cross=TRUE,
            boundary=0.005, convergence.at=0.00001,
            mult=NULL, ms.h=NULL, ms.sub=30,
            pruning.thresh=0.0, rho0=0.4)
A list of the nine specified input parameters, which can be passed to the control argument of the lpc function.
iter: Maximum number of iterations on either side of the starting point within each branch.
cross: Logical parameter. If FALSE, a curve is stopped when it comes too close to another part of itself. Note: even when cross=FALSE, different branches of the curve (for higher depth or multiple starting points) are still allowed to cross. This option only prevents each particular branch from crossing itself. Used in the self-coverage functions to avoid overfitting.
boundary: This boundary correction [2] reduces the bandwidth adaptively once the relative difference of parameter values between two centers of mass falls below the given threshold. This measure delays convergence and enables the curve to proceed further into the end points. If set to 0, the boundary correction is switched off.
convergence.at: This forces the curve to stop if the relative difference of parameter values between two centers of mass falls below the given threshold. If set to 0, then the curve will always stop after exactly iter iterations.
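The interplay of the boundary and convergence.at thresholds can be illustrated with a minimal sketch in base R. The helpers classify_step() and rel_diff() are hypothetical and not part of the package, and the exact form of the relative difference used internally is an assumption made here for illustration only.

```r
# Hypothetical helper, not part of the package: decides what happens after one
# iteration, given the relative difference between consecutive parameter values.
classify_step <- function(rel_diff, boundary = 0.005, convergence.at = 0.00001) {
  if (rel_diff < convergence.at) {
    "stop"               # convergence reached; never true when convergence.at = 0
  } else if (rel_diff < boundary) {
    "reduce bandwidth"   # boundary correction; never true when boundary = 0
  } else {
    "continue"
  }
}

# Assumed form of the relative difference between two parameter values.
rel_diff <- function(lambda_prev, lambda_curr) {
  abs(lambda_curr - lambda_prev) / abs(lambda_curr + lambda_prev)
}

classify_step(rel_diff(10, 10.5))                  # large step
classify_step(rel_diff(10, 10.01))                 # small step, default thresholds
classify_step(rel_diff(10, 10.01), boundary = 0)   # boundary correction switched off
```

With the default values, the boundary threshold (0.005) is reached well before the convergence threshold (0.00001), which is why the correction can delay convergence. A boundary value below convergence.at would never take effect in this sketch.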
mult: Numerical value which enforces a fixed number of starting points. If the number given here is larger than the number of starting points provided at x0, then the missing points will be set at random. For example, if \(d=2\), mult=3, and x0=c(58.5, 17.8, 80, 20), then one gets the starting points (58.5, 17.8), (80, 20), and a randomly chosen third one. Another example of such a situation is x0=NULL with mult=1, in which case one random starting point is chosen. If the number given here is smaller than the number of starting points provided at x0, then only the first mult starting points will be used.
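The rule relating mult and x0 can be sketched in base R as follows. The function resolve_starts() is a hypothetical illustration, not the package's own code. In particular, drawing the missing points at random from the observations is an assumption, and the data used below are made up.

```r
# Hypothetical helper, not the package's own code.
resolve_starts <- function(x0, mult, data) {
  data <- unname(as.matrix(data))
  d <- ncol(data)
  # Interpret x0 as consecutive d-dimensional points, one per row.
  starts <- if (is.null(x0)) {
    matrix(numeric(0), ncol = d)
  } else {
    matrix(x0, ncol = d, byrow = TRUE)
  }
  if (is.null(mult)) return(starts)
  n_given <- nrow(starts)
  if (mult > n_given) {
    # Assumption: the missing points are drawn at random from the observations.
    idx <- sample(nrow(data), mult - n_given)
    starts <- rbind(starts, data[idx, , drop = FALSE])
  } else {
    # Only the first mult starting points are kept.
    starts <- starts[seq_len(mult), , drop = FALSE]
  }
  starts
}

set.seed(1)
toy <- cbind(runif(100, 0, 100), runif(100, 0, 100))   # made-up two-dimensional data
resolve_starts(c(58.5, 17.8, 80, 20), mult = 3, data = toy)   # two given points plus one random
resolve_starts(c(58.5, 17.8, 80, 20), mult = 1, data = toy)   # only (58.5, 17.8)
resolve_starts(NULL, mult = 1, data = toy)                    # one random point
```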
ms.h: Sets the bandwidth (vector) for the initial mean shift procedure which finds the local density modes and, hence, the starting points for the LPC. If unspecified, the bandwidth h used in function lpc is used here too.
ms.sub: Percentage of data points (default=30) which are used to initialize mean shift trajectories for the mode finding. More precisely, min(max(ms.sub, floor(ms.sub*N/100)), 10*ms.sub) trajectories are used, where N is the number of data points.
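The formula for the number of trajectories can be evaluated directly in base R. The wrapper n_trajectories() is a hypothetical name used only for this illustration, and N is taken to be the number of data points.

```r
# Hypothetical wrapper around the formula stated above.
n_trajectories <- function(N, ms.sub = 30) {
  min(max(ms.sub, floor(ms.sub * N / 100)), 10 * ms.sub)
}

# Small, medium and large samples with the default ms.sub = 30.
sapply(c(50, 444, 2000), n_trajectories)   # 30 133 300
```

With the default, the count is therefore bounded below by 30 and above by 300, and equals 30% of N (rounded down) in between.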
pruning.thresh: Prunes branches corresponding to higher-depth starting points if their density estimate falls below this threshold. Typically a value between 0.0 and 1.0. The setting 0.0 means no pruning.
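The pruning rule amounts to a simple threshold comparison, sketched below in base R. The function keep_branch() and the density values are hypothetical and serve only to show why a threshold of 0.0 disables pruning: density estimates are nonnegative, so none can fall below it.

```r
# Hypothetical density estimates at three higher-depth starting points.
dens <- c(0.02, 0.15, 0.40)

# Hypothetical helper: a branch is kept unless its density estimate
# falls below the threshold.
keep_branch <- function(density, pruning.thresh = 0.0) {
  density >= pruning.thresh
}

keep_branch(dens)                        # TRUE TRUE TRUE (no pruning)
keep_branch(dens, pruning.thresh = 0.1)  # FALSE TRUE TRUE
```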
rho0: A numerical value which steers the birth process of higher-depth starting points. Usually between 0.3 and 0.4 (see reference [1]).
JE
[1] Einbeck, J., Tutz, G. and Evers, L. (2005). Exploring Multivariate Data Structures with Local Principal Curves. In: Weihs, C. and Gaul, W. (Eds.), Classification - The Ubiquitous Challenge. Springer, Heidelberg, pages 256-263.
[2] Einbeck, J. and Zayed, M. (2014). Some asymptotics for localized principal components and curves. Communications in Statistics - Theory and Methods 43, 1736-1749.
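library(LPCM)  # package providing lpc(), lpc.control() and the calspeedflow data
data(calspeedflow)
# At most 20 iterations on either side of the starting point; boundary correction off
fit1 <- lpc(calspeedflow[,c(3,4)], x0=c(50,60), scaled=1,
            control=lpc.control(iter=20, boundary=0))
plot(fit1, type=c("curve","start","mass"))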