dprep (version 3.0.2)

Data Pre-Processing and Visualization Functions for Classification

Description

Data preprocessing techniques for classification. Functions for normalization, handling of missing values, discretization, outlier detection, feature selection, and data visualization are included.

Install

install.packages('dprep')

Monthly Downloads

45

Version

3.0.2

License

GPL

Maintainer

Edgar Acuna

Last Published

November 24th, 2015

Functions in dprep (3.0.2)

mo3

The third moment of a multivariate distribution
Shuttle

The Shuttle dataset
mo4

The fourth moment of a multivariate distribution
census

census
ce.mimp

Mean or median imputation
disc.mentr

Discretization using the minimum entropy criterion
hepatitis

The hepatitis dataset
breastw

The Breast Wisconsin dataset
unor

Auxiliary function for performing Holte's 1R discretization
maxlof

Detection of multivariate outliers using the LOF algorithm
knngow

K-nn classification using Gower distance
top

Auxiliary function for Bay's Outlier Detection Algorithm
inconsist

Computing the inconsistency measure
rangenorm

Range normalization
nnmiss

Auxiliary function for knn imputation
imagmiss

Visualization of Missing Data
finco

FINCO Feature Selection Algorithm
crx

crx
dist.to.knn

Auxiliary function for the LOF algorithm.
arboleje

Predicting a bank's decision to give a loan for buying a car.
colon

Alon et al.'s colon dataset
disc2

Auxiliary function for performing discretization using equal frequency
autompg

The Auto MPG dataset
cv10knn2

Auxiliary function for sequential feature selection
reliefcont

Feature selection by the Relief Algorithm for datasets with only continuous features
row.matches

Finding rows in a matrix equal to a given vector
crossval

Cross validation estimation of the misclassification error
heartc

The Heart Cleveland dataset
cv10rpart2

Auxiliary function for sequential feature selection
mmnorm

Min-max normalization
tchisq

Auxiliary function for the Chi-Merge discretization
redundancy

Finding the unique observations in a dataset along with their frequencies
moda

Calculating the Mode
clean

Dataset cleaning
disc.1r

Discretization using Holte's 1R method
near3

Auxiliary function for the reliefcat function
vehicle

The Vehicle dataset
sbs1

One-step sequential backward selection
acugow

Gower distance from a vector to a matrix
decscale

Decimal Scaling
bupa

The Bupa dataset
discretevar

Performs Minimum Entropy discretization for a given attribute
radviz2d

Radial Coordinate Visualization
distancia

Vector-Vector Euclidean Distance Function
reliefcat

Feature selection by the Relief Algorithm for datasets containing nominal features
znorm

Z-score normalization
signorm

Sigmoidal Normalization
combinations

Constructing distinct permutations
midpoints1

Auxiliary function for computing minimum entropy discretization
robout

Outlier Detection with Robust Mahalanobis distance
lofactor

Local Outlier Factor
baysout

Outlier detection using Bay and Schwabacher's algorithm.
outbox

Detecting outliers through boxplots of the features.
cvnaiveBayesd

Cross-validation error estimation for the naive Bayes classifier.
sfs

Sequential Forward Selection
relief

RELIEF Feature Selection
vvalen1

Auxiliary function for computing Van Valen's homoscedasticity test
circledraw

circledraw
sffs

Sequential Floating Forward Method
distancia1

Vector-Vector Manhattan Distance Function
cv10mlp

10-fold cross validation error estimation for the multilayer perceptron classifier
cv10lda2

Auxiliary function for sequential forward selection
disc.ef

Discretization using the method of equal frequencies
near1

Auxiliary function for the reliefcont function
ec.knnimp

Imputation using k-nearest neighbors.
mardia

Mardia's test of normality
sfs1

One-step sequential forward selection
softmaxnorm

Softmax Normalization
diabetes

The Pima Indian Diabetes dataset
score

Score function used in Bay's algorithm for outlier detection
star3d

Data Visualization using star coordinates in three dimensions
reachability

Function for computing the reachability measure in the LOF algorithm
eje1dis

Basic example for discriminant analysis
landsat

The Landsat Satellite dataset
disc.ew

Discretization using the equal width method
vvalen

The Van Valen test for equal covariance matrices
starcoord

The star coordinates plot
ionosphere

The Ionosphere dataset
sonar

The Sonar dataset
cv10log

10-fold cross validation estimation error for the classifier based on logistic regression
arboleje1

Predicting a bank's decision to give a loan for buying a car.
knneigh.vect

Auxiliary function for computing the LOF measure.
mahaout

Multivariate outlier detection through the boxplot of the Mahalanobis distance
dprep-package

Data Preprocessing for supervised classification
surveyplot

Surveyplot
chiMerge

Discretization using the Chi-Merge method
srbct

Khan et al.'s small round blue cell tumors (SRBCT) dataset
ce.impute

Imputation in supervised classification
lvf

Las Vegas Filter
parallelplot

Parallel Coordinate Plot
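
Several entries in the list above (mmnorm, znorm) name standard rescalings. As a rough illustration of what min-max and z-score normalization usually mean, here is a minimal base-R sketch. It does not call dprep, and the helper names `minmax_sketch` and `zscore_sketch` are hypothetical, not the package's functions or signatures. It assumes numeric columns that are not constant.

```r
# Hypothetical helpers for illustration only; these are NOT dprep's mmnorm/znorm.
minmax_sketch <- function(x) {
  # Rescale a numeric vector to [0, 1]; assumes max(x) > min(x).
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

zscore_sketch <- function(x) {
  # Center to mean 0 and scale to sample standard deviation 1.
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Apply column-wise to the numeric columns of a built-in dataset,
# leaving the class label (column 5) out of the rescaling.
features <- iris[, 1:4]
iris_mm <- as.data.frame(lapply(features, minmax_sketch))
iris_z  <- as.data.frame(lapply(features, zscore_sketch))

summary(iris_mm)  # each column should now span 0 to 1
```

For classification work the class column is typically kept aside and reattached after rescaling, which is why only the first four columns of `iris` are transformed here.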