The linearity measures quantify whether the labels can be separated by a hyperplane or linear function. The underlying assumption is that a linearly separable problem is simpler than one requiring a non-linear decision boundary.
linearity(...)

# S3 method for default
linearity(x, y, measures = "all", summary = c("mean", "sd"), ...)

# S3 method for formula
linearity(formula, data, measures = "all", summary = c("mean", "sd"), ...)
...: Not used.
x: A data.frame containing only the input attributes.
y: A response vector with one value for each row/component of x.
measures: A list of measure names, or "all" to include all of them.
summary: A list of summarization functions, or empty for all values. See the summarization method for more information. (Default: c("mean", "sd"))
formula: A formula to define the output column.
data: A data.frame dataset containing the input attributes and the class.
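The default method takes the attributes and the response separately. The call below is a minimal sketch of that form. It assumes that the ECoL package, which provides linearity(), is installed, and that the measure names accepted by `measures` are the labels "L1", "L2" and "L3" used in this page; the guard skips the call when the package is missing.

```r
# Assumption: the ECoL package (which provides linearity()) is installed.
# Assumption: measure names follow the labels used in this page ("L1", "L2", "L3").
data(iris)
x <- iris[, 1:4]      # input attributes only
y <- iris$Species     # one response value per row of x

if (requireNamespace("ECoL", quietly = TRUE)) {
  # Default S3 method: attributes and response passed separately,
  # restricted to two measures and summarized by mean and sd.
  res <- ECoL::linearity(x, y, measures = c("L1", "L2"),
                         summary = c("mean", "sd"))
  print(res)
}
```

The formula method shown in the examples should be equivalent to passing the same columns through `x` and `y`.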
A list with one element per requested linearity measure, named by the measure.
The following classification measures are allowed for this method:
Sum of the error distance by linear programming (L1) computes the sum of the distances of incorrectly classified examples to a linear boundary used in their classification.
Error rate of linear classifier (L2) computes the error rate of a linear SVM classifier induced from the dataset.
Non-linearity of a linear classifier (L3) creates a new dataset by randomly interpolating pairs of training examples of the same class, then induces a linear SVM on the original data and measures its error rate on the new data points.
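The idea behind L2 and L3 can be sketched in base R. This is an illustration only, not the package's implementation: logistic regression (glm) is swapped in for the linear SVM so that no extra package is needed, the problem is reduced to two iris classes, and the names `predict_class`, `interpolate_class`, `l2_like` and `l3_like` are hypothetical. The values will not necessarily match what linearity() returns.

```r
# Illustrative sketch only; NOT the package's internal code.
# Assumption: logistic regression stands in for the linear SVM,
# and only two classes are kept so that glm() applies directly.
set.seed(1)

data(iris)
d <- droplevels(subset(iris, Species != "setosa"))
features <- setdiff(names(d), "Species")

# Linear decision boundary fitted on the original data.
fit <- glm(Species ~ ., data = d, family = binomial)

predict_class <- function(model, newdata) {
  # glm models the probability of the second factor level.
  p <- predict(model, newdata = newdata, type = "response")
  factor(ifelse(p > 0.5, levels(d$Species)[2], levels(d$Species)[1]),
         levels = levels(d$Species))
}

# L2-style score: error rate of the linear classifier on the data.
l2_like <- mean(predict_class(fit, d) != d$Species)

# L3-style score: error rate on points interpolated between random
# pairs of examples drawn from the same class.
interpolate_class <- function(rows) {
  n <- nrow(rows)
  a <- as.matrix(rows[sample(n, n, replace = TRUE), features])
  b <- as.matrix(rows[sample(n, n, replace = TRUE), features])
  alpha <- runif(n)                      # one mixing weight per new point
  out <- as.data.frame(alpha * a + (1 - alpha) * b)
  out$Species <- rows$Species            # all rows share the same class
  out
}
synthetic <- do.call(rbind, lapply(split(d, d$Species), interpolate_class))
l3_like <- mean(predict_class(fit, synthetic) != synthetic$Species)
```

Both scores lie in [0, 1]; values near 0 suggest that a linear boundary separates the two classes well.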
The following regression measures are allowed for this method:
Mean absolute error (L1) averages the absolute values of the residuals of a multiple linear regressor.
Residuals variance (L2) averages the square of the residuals from a multiple linear regression.
Non-linearity of a linear regressor (L3) measures how sensitive the regressor is to new, randomly interpolated points.
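The three regression measures can be sketched with lm() in base R. This is an illustration under stated assumptions, not the package's implementation: the data are used without rescaling, the L3-style score interpolates between examples with neighbouring outputs (one plausible way to pick pairs), and the names `l1_like`, `l2_like` and `l3_like` are hypothetical. The values will not necessarily match what linearity() returns.

```r
# Illustrative sketch only; NOT the package's internal code.
set.seed(1)
data(cars)

# Multiple linear regressor fitted on the original data.
fit <- lm(speed ~ ., data = cars)
res <- residuals(fit)

l1_like <- mean(abs(res))   # L1-style: mean absolute residual
l2_like <- mean(res^2)      # L2-style: mean squared residual

# L3-style: interpolate between examples with neighbouring outputs
# (an assumption about how pairs are chosen), then measure the mean
# squared error of the original fit on the interpolated points.
ord <- cars[order(cars$speed), ]
n <- nrow(ord)
alpha <- runif(n - 1)               # one mixing weight per new point
lower <- as.matrix(ord[-n, ])       # rows 1 .. n-1
upper <- as.matrix(ord[-1, ])       # rows 2 .. n
synthetic <- as.data.frame(alpha * lower + (1 - alpha) * upper)
l3_like <- mean((predict(fit, newdata = synthetic) - synthetic$speed)^2)
```

Smaller values of all three scores indicate that a linear function describes the relation between attributes and output more closely.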
Albert Orriols-Puig, Nuria Macia and Tin K Ho. (2010). Documentation for the data complexity library in C++. Technical Report. La Salle - Universitat Ramon Llull.
Other complexity-measures: balance, correlation, dimensionality, neighborhood, network, overlapping, smoothness
# NOT RUN {
## Extract all linearity measures for classification task
data(iris)
linearity(Species ~ ., iris)
## Extract all linearity measures for regression task
data(cars)
linearity(speed ~ ., cars)
# }