visualFit: Fit a mixture of regressions model by “visual” means.

Description

Displays a plot of the data and invites the user to “click” on points judged to lie on the various components.

Usage

visualFit(fmla, data=NULL, ncomp, eqVar=FALSE, chsnPts=NULL,
      keepPlotVisible = FALSE)

Arguments

fmla

A formula specifying the regression model to be fitted.

data

A list or data frame in which the variables specified by fmla are looked up. Variables not found in data are searched for in the global environment.

ncomp

Positive integer scalar. The number of components in the mixture which is to be fitted.

eqVar

Logical scalar; should the error variance be the same for all components? (The alternative is that each component should be allowed to have a different error variance.)

chsnPts

A list with ncomp components each of which is a list of length two with components x and y. Each of x and y is a vector of length two, constituting the \(x\) and \(y\) coordinates respectively of two points on a line that presumably underlies the corresponding component.
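As an illustration of the structure just described, a chsnPts object for ncomp = 2 might be built by hand roughly as follows. The coordinates are invented for this sketch and do not come from any data set.

```r
# Hypothetical coordinates: two points per component line.
chsnPts <- list(
    list(x = c(0, 10), y = c(1, 6)),   # component 1: points (0,1) and (10,6)
    list(x = c(0, 10), y = c(4, 24))   # component 2: points (0,4) and (10,24)
)

# Check the shape: one entry per component, each a list with
# components x and y, both of length two.
ncomp <- length(chsnPts)
ok <- all(sapply(chsnPts, function(p) {
    is.list(p) && setequal(names(p), c("x", "y")) &&
        length(p$x) == 2 && length(p$y) == 2
}))
```

In practice such an object would more usually be recovered from an earlier interactive fit via attr(fit, "chsnPts") than typed in by hand.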

keepPlotVisible

Logical scalar. Should the plot of the data, produced by this function, be kept visible after the model has been “fitted”? (Rather than being dismissed by dev.off().)

Value

An object of class "mixreg". (See mixreg().) Components nsteps and converged are set to NA. Component data has an extra column groups appended to it. This column is a factor specifying the components assigned to the points on the basis of distances from the lines determined by the chosen points.

The value also has an attribute "chsnPts". This is the list of points judged by the user to lie on the component lines (or the value of the chsnPts argument if this was supplied).

Details

If the model involves more than one predictor, or if the specified predictor is a matrix with more than one column, then an error is thrown. This function is intended for use only with one-variable regression.

If there is an intercept in the model, then for each component (numbered 1 to ncomp) the user is invited to click on two points judged to lie on a line underlying that component. If there is no intercept, then the user is invited to click on a single point for each component, with the origin (0,0) taken (silently) to be the second point needed to determine the line.

The fit that this function returns is calculated by assigning a component to each point in the data set, based on which of the visually determined lines that point is closest to.
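The assignment step can be sketched as below. This is only a rough illustration: the helper names lineFromPts and assignComps are hypothetical, and the use of vertical distance is an assumption, since the distance measure actually used is not stated here (see the code of visualFit for the real procedure).

```r
# Slope and intercept of the line through the two points in p.
lineFromPts <- function(p) {
    b <- diff(p$y) / diff(p$x)
    c(intercept = p$y[1] - b * p$x[1], slope = b)
}

# Assign each (x, y) to the line lying closest to it.
# ASSUMPTION: "closest" is measured by vertical distance.
assignComps <- function(x, y, chsnPts) {
    cf <- sapply(chsnPts, lineFromPts)   # 2 x ncomp matrix of coefficients
    fitted <- outer(x, cf["slope", ]) +
        matrix(cf["intercept", ], length(x), ncol(cf), byrow = TRUE)
    d <- abs(y - fitted)                 # n x ncomp matrix of distances
    factor(apply(d, 1, which.min), levels = seq_len(ncol(cf)))
}

# Made-up lines and points, for illustration only.
pts <- list(list(x = c(0, 10), y = c(1, 6)),
            list(x = c(0, 10), y = c(4, 24)))
groups <- assignComps(x = c(2, 2, 8), y = c(2.1, 7.9, 5.2), chsnPts = pts)
```

For a model without an intercept, the same sketch applies with the origin as one of the two points, e.g. list(x = c(0, x1), y = c(0, y1)) for a clicked point (x1, y1).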

If eqVar is TRUE then the model is constructed using a factor, whose entries are these assigned components, as a predictor (along with the “x” variable in fmla) in a call to lm(). If eqVar is FALSE then a model is fitted separately to each component. (See the code for details.) “Obviously” the linear coefficient estimates will be the same in either case. Only the error variance estimates will differ.
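The claim that the coefficient estimates agree can be checked with a small simulation. The sketch below uses made-up data with known groups standing in for the assigned components. The formula y ~ 0 + g + g:x is an assumption chosen so that each group gets its own intercept and slope; the parametrisation used inside visualFit may differ, as may the divisor used for the variance estimates.

```r
set.seed(42)
# Simulated data with two known groups (stand-ins for assigned components).
n <- 40
g <- factor(rep(1:2, each = n / 2))
x <- runif(n, 0, 10)
y <- ifelse(g == 1, 1 + 0.5 * x, 4 + 2 * x) + rnorm(n, sd = 0.3)
dat <- data.frame(x, y, g)

# Common variance: a single lm() call with the factor as a predictor.
# ASSUMPTION: this formula is illustrative, not taken from the package.
fitEq <- lm(y ~ 0 + g + g:x, data = dat)

# Separate variances: one lm() call per group.
fitSep <- lapply(split(dat, dat$g), function(d) lm(y ~ x, data = d))

# Rows are groups; columns are intercept and slope.
coefEq  <- matrix(coef(fitEq), ncol = 2)
coefSep <- t(sapply(fitSep, coef))

sigsqEq  <- summary(fitEq)$sigma^2                           # one pooled value
sigsqSep <- sapply(fitSep, function(f) summary(f)$sigma^2)   # one per group

coefsAgree <- isTRUE(all.equal(coefEq, coefSep, check.attributes = FALSE))
```

Here coefsAgree should be TRUE, while sigsqEq is a single number and sigsqSep has one entry per group.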

In what follows K denotes the number of components, ncomp. If eqVar is TRUE then the number of parameters in the model, as used in the calculation of aic, is M = 2*K + (K-1) + 1 = 3*K when there is an intercept term and M = K + (K-1) + 1 = 2*K when there is no intercept term.

If eqVar is FALSE then the number of parameters is M = 2*K + (K-1) + K = 4*K - 1 if there is an intercept, and M = K + (K-1) + K = 3*K - 1 if there is no intercept.
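The four cases can be gathered into one small function, with K equal to ncomp. The name nParams is hypothetical and not part of the package. Reading the (K-1) term as the number of free mixing probabilities is an assumption, although it is the usual convention for mixture models.

```r
# Hypothetical helper: number of parameters M for K components.
nParams <- function(K, intercept = TRUE, eqVar = FALSE) {
    nCoef <- if (intercept) 2 * K else K   # regression coefficients
    nMix  <- K - 1                         # ASSUMPTION: free mixing probabilities
    nVar  <- if (eqVar) 1 else K           # error variances
    nCoef + nMix + nVar
}

nParams(3, intercept = TRUE,  eqVar = TRUE)    # 3*K     = 9
nParams(3, intercept = FALSE, eqVar = TRUE)    # 2*K     = 6
nParams(3, intercept = TRUE,  eqVar = FALSE)   # 4*K - 1 = 11
nParams(3, intercept = FALSE, eqVar = FALSE)   # 3*K - 1 = 8
```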

The argument chsnPts allows one to use this function in a non-interactive session by creating and saving, a priori, an object to be supplied as the value of chsnPts. If chsnPts is supplied then the method employed isn't really “visual”, but presumably the object supplied would have been created in a visual manner. Be that as it may, this function is mainly intended to be used visually, that is without supplying chsnPts.

If chsnPts is not supplied then (“obviously”!!!) this function can be used only in an interactive session.

Examples

# NOT RUN {
    vfita <- visualFit(plntsInf ~ aphRel,data=aphids,ncomp=2)
    plot(vfita)
    vfitk1 <- visualFit(y ~ x, data=kilnAoneOut, ncomp=3)
    cp     <- attr(vfitk1,"chsnPts")
    vfitk2 <- visualFit(y ~ x, data=kilnAoneOut, eqVar=TRUE, chsnPts=cp)
    vfitk1$parmat
    vfitk2$parmat # Same as above except for the "sigsq" column.
# }