Fits regression models with stepmented (i.e. piecewise-constant) relationships between the response and one or more explanatory variables. Break-point estimates are provided.
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
keep.class=FALSE, var.psi=FALSE, ...)
# S3 method for lm
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
keep.class=FALSE, var.psi=FALSE, ...)# S3 method for glm
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
keep.class=FALSE, var.psi=FALSE, ...)
# S3 method for numeric
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
keep.class=FALSE, var.psi=FALSE, ...,
pertV=0, centerX=FALSE, adjX=NULL, weights=NULL)
# S3 method for ts
stepmented(obj, seg.Z, psi, npsi, fixed.psi=NULL, control = seg.control(),
keep.class=FALSE, var.psi=FALSE, ...,
pertV=0, centerX=FALSE, adjX=NULL)
The returned object is of class "stepmented" which inherits
from the class "lm" or "glm" depending on the class of obj
. When only.mean=FALSE
, it is a list having two 'stepmented' fits (for the mean and for the dispersion submodels).
An object of class "stepmented" is a list containing the components of the
original object obj
with additionally the followings:
estimated break-points and relevant (approximate) standard errors (on the continuum)
the rounded estimated break-points (see Note, below)
number of iterations employed
difference in the objective function when the algorithm stops
the model frame
a list or a vector including the breakpoint estimates at each step
the integer vector containing the seed just before the bootstrap resampling. Returned only if bootstrap restart is employed
Other components are not of direct interest of the user
A standard `linear' regression model of class "lm" or "glm". Alternatively, a simple "ts" object or a simple data vector may be supplied.
the stepmented variables(s), i.e. the numeric covariate(s) understood to have a piecewise-constant relationship with response. It is a formula with no response variable, such as seg.Z=~x
or seg.Z=~x1+x2
. Currently, formulas involving functions,
such as seg.Z=~log(x1)
, or selection operators, such as seg.Z=~d[,"x1"]
or seg.Z=~d$x1
, are not allowed. Also, variable names formed by U
or V
only (with or without numbers ) are not permitted. If missing, the index variable id=1,2,..,n
is used. For stepmented.ts
, seg.Z
is usually unspecified as the (time) covariate is obtained by the ts
object itself.
starting values for the breakpoints to be estimated. If there is a single stepmented variable specified in seg.Z
, psi
can be a numeric vector, and it can be missing when 1 breakpoint has to be estimated (and the median of the stepmented variable is used as a starting value). If seg.Z
includes several covariates, psi
has to be specified as a named list of vectors whose names have to match the variables in the seg.Z
argument. Each vector of such list includes starting values for the break-point(s) for the corresponding variable in seg.Z
. A NA
value means that `K
' quantiles (or equally spaced values) are used as starting values; K
is fixed via the seg.control
auxiliary function.
A named vector or list meaning the number (and not locations) of breakpoints to be estimated. The starting values will be internally computed via the quantiles or equally spaced values, as specified in argument quant
in seg.control
. npsi
can be missing and npsi=1
is assumed for all variables specified in seg.Z
. If psi
is provided, npsi
is ignored.
An optional named list including the breakpoint values to be kept fixed during the estimation procedure. The names should be a subset of (or even the same) variables specified in seg.Z
. If there is a single variable in seg.Z
, a simple numeric vector can be specified. Note that, in addition to the values specified here, stepmented
will estimate additional breakpoints. To keep fixed all breakpoints (to be specified in psi
) use it.max=0
in seg.control
a list of parameters for controlling the fitting process.
See the documentation for seg.control
for details.
logical value indicating if the final fit returned by stepmented.default
should keep the class 'stepmented
' (along with the class of the original fit obj
). Ignored by the stepmented methods.
optional arguments (to be ignored safely). Notice specific arguments relevant to the original call (via lm
or glm
for instance), such as weights
or offet
, have to be included in the starting model obj
.
Only for stepmented.ts
and stepmented.numeric
.
Only for stepmented.ts
and stepmented.numeric
. If TRUE
, the covariate is centered before fitting.
Only for stepmented.ts
and stepmented.numeric
. If the response vector leads to covariate with large values (such as years for ts objects), adjX=TRUE
will shift the covariate to have a zero origin. Default is NULL
which means TRUE
if the minimum of covariate is 1000 or larger.
logical. If TRUE
, the estimate covariance matrix is also computed via vcov.stepmented
, thus the breakpoint standard errors are also included in the psi
component of the returned object. Default is FALSE
, as computing the estimate covariance matrix is somewhat time-consuming when the sample size is large.
possible weights to include in the estimation process (only for stepmented.numeric
).
Vito M. R. Muggeo, vito.muggeo@unipa.it (based on original code by Salvatore Fasola)
Given a linear regression model (usually of class "lm" or "glm"), stepmented tries to estimate
a new regression model having piecewise-constant (i.e. step-function like) relationships with the variables specified in seg.Z
.
A stepmented relationship is defined by the mean level
parameters and the break-points where the mean level changes. The number of breakpoints
of each stepmented relationship depends on the psi
argument, where initial
values for the break-points must be specified. The model
is estimated simultaneously yielding point estimates and relevant approximate
standard errors of all the model parameters, including the break-points.
stepmented
implements the algorithm described in Fasola et al. (2018) along with bootstrap restarting (Wood, 2001) to escape local optima. The procedure turns out to be particularly appealing and probably efficient when there are two or more covariates exhibiting different change points to be estimated.
Fasola S, Muggeo VMR, Kuchenhoff H (2018) A heuristic, iterative algorithm for change-point detection in abrupt change models, Computational Statistics 33, 997--1015
n=20
x<-1:n/n
mu<- 2+ 1*(x>.6)
y<- mu + rnorm(n)*.8
#fitting via regression model
os <-stepmented(lm(y~1),~x)
y<-ts(y)
os1<- stepmented(y) #the 'ts' method
os2<- stepmented(y, npsi=2)
#plot(y)
#plot(os1, add=TRUE)
#plot(os2, add=TRUE, col=3:5)
### Example with (poisson) GLM
y<- rpois(n,exp(mu))
o<-stepmented(glm(y~1,family=poisson))
plot(o, res=TRUE)
if (FALSE) {
## Example using the (well-known) Nile dataset
data(Nile)
plot(Nile)
os<- stepmented(Nile)
plot(os, add=TRUE)
### Example with (binary) GLM (example from the package stepR)
set.seed(1234)
y <- rbinom(200, 1, rep(c(0.1, 0.7, 0.3, 0.9), each=50))
o<-stepmented(glm(y~1,family=binomial), npsi=3)
plot(o, res=TRUE)
### Two stepmented covariates (with 1 and 2 breakpoints); z has also an additional linear effect
n=100
x<-1:n/n
z<-runif(n,2,5)
mu<- 2+ 1*(x>.6)-2*(z>3)+3*(z>4)+z
y<- mu + rnorm(n)*.8
os <-stepmented(lm(y~z),~x+z, npsi=c(x=1,z=2))
os
summary(os)
## see ?plot.stepmented
}
Run the code above in your browser using DataLab