ImputeKnn

Function that fills in all NA values using the k nearest
neighbours of each case that has NA values.
By default, it takes the values of the neighbours and
computes a weighted average (weighted by the distance to the case)
to fill in the unknowns.
If meth='median', it uses the median (or most frequent value)
instead.
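
The idea described above can be sketched in a few lines of base R. This is only an illustration, not the package's implementation: the helper name `impute_row_sketch`, the restriction to numeric columns, and the inverse-distance weights are all assumptions, since the exact weighting scheme is not stated here.

```r
# Hypothetical helper (not part of DescTools): fill the NAs of one numeric
# row from its k nearest rows in a matrix of complete cases.
impute_row_sketch <- function(row, complete, k = 2, meth = "weighAvg") {
  obs <- !is.na(row)                         # columns observed in this row
  # Euclidean distance to every complete row, using observed columns only
  diffs <- sweep(complete[, obs, drop = FALSE], 2, row[obs])
  d <- sqrt(rowSums(diffs^2))
  nn <- order(d)[seq_len(k)]                 # indices of the k nearest rows
  w <- 1 / (d[nn] + 1e-8)                    # assumed weights: inverse distance
  for (j in which(!obs)) {
    row[j] <- if (meth == "median") {
      median(complete[nn, j])                # median of the neighbours' values
    } else {
      weighted.mean(complete[nn, j], w)      # distance-weighted average
    }
  }
  row
}

complete <- rbind(c(1, 10), c(2, 20), c(9, 90))
filled <- impute_row_sketch(c(1.5, NA), complete, k = 2)
# The second element is 15: both neighbours are equally distant,
# so the weighted average reduces to the plain mean of 10 and 20.
```

Closer neighbours receive larger weights, so with unequal distances the imputed value is pulled towards the nearest case; with meth = "median" the distances only decide which neighbours are used.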

models

A collection of miscellaneous basic statistical functions and convenience wrappers for efficiently describing data. The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis: calculating descriptive statistics, drawing graphical summaries and reporting the results. The package also contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel. Many of the included functions can be found scattered in other packages and other sources, written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE package instead of dozens (which themselves might depend on other packages that are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in the absence of convincing alternatives). The 'BigCamelCase' style was consistently applied to functions borrowed from contributed R packages as well.

Andri Signorell

DescTools

Tools for Descriptive Statistics

Ken Aho

Andreas Alfons

Nanina Anderegg

Tomas Aragon

Chandima Arachchige

Antti Arppe

Adrian Baddeley

Kamil Barton

Ben Bolker

Hans W. Borchers

Frederico Caeiro

Stephane Champely

Daniel Chessel

Leanne Chhay

Nicholas Cooper

Clint Cummins

Michael Dewey

Harold C. Doran

Stephane Dray

Charles Dupont

Dirk Eddelbuettel

Claus Ekstrom

Martin Elff

Jeff Enos

Richard W. Farebrother

John Fox

Romain Francois

Michael Friendly

Tal Galili

Matthias Gamer

Joseph L. Gastwirth

Vilmantas Gegzna

Yulia R. Gel

Sereina Graber

Juergen Gross

Gabor Grothendieck

Frank E. Harrell Jr

Richard Heiberger

Michael Hoehle

Christian W. Hoffmann

Soeren Hojsgaard

Torsten Hothorn

Markus Huerzeler

Wallace W. Hui

Pete Hurd

Rob J. Hyndman

Christopher Jackson

Matthias Kohl

Mikko Korpela

Max Kuhn

Detlew Labes

Friederich Leisch

Jim Lemon

Dong Li

Martin Maechler

Arni Magnusson

Ben Mainwaring

Daniel Malter

George Marsaglia

John Marsaglia

Alina Matei

David Meyer

Weiwen Miao

Giovanni Millo

Yongyi Min

David Mitchell

Cyril Flurin Moser

Franziska Mueller

Markus Naepflin

Danielle Navarro

Henric Nilsson

Klaus Nordhausen

Derek Ogle

Hong Ooi

Nick Parsons

Sandrine Pavoine

Tony Plate

Luke Prendergast

Roland Rapold

William Revelle

Tyler Rinker

Brian D. Ripley

Caroline Rodriguez

Nathan Russell

Nick Sabbe

Ralph Scherer

Venkatraman E. Seshan

Michael Smithson

Greg Snow

Karline Soetaert

Werner A. Stahel

Alec Stephenson

Mark Stevenson

Ralf Stubner

Matthias Templ

Duncan Temple Lang

Terry Therneau

Yves Tille

Luis Torgo

ImputeKnn function

Arguments

<dl> <dt>data</dt>
<dd>A data frame containing the data set.</dd> <dt>k</dt>
<dd>The number of nearest neighbours to use (defaults to 10).</dd> <dt>scale</dt>
<dd>Logical indicating whether the data should be scaled before finding the nearest neighbours (defaults
to TRUE).</dd> <dt>meth</dt>
<dd>String indicating the method used to calculate the value to fill in each
NA. Available values are 'median' or 'weighAvg' (the default).</dd> <dt>distData</dt>
<dd>Optionally, a data frame containing the data set
 that should be used to find the neighbours. This is useful when
 filling in NA values on a test set, where you should use only
 information from the training set. This defaults to NULL, which means
 that the neighbours are searched for in <code>data</code>.</dd></dl>

Author

Luis Torgo <a href='mailto:ltorgo@dcc.fc.up.pt'>ltorgo@dcc.fc.up.pt</a>
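
A short usage sketch for the arguments documented above, assuming the installed DescTools version exposes `ImputeKnn` with exactly these arguments. The data frames `train` and `test` are made up for illustration; the point is that `distData` lets the NAs in a test set be filled using neighbours drawn only from the training set.

```r
library(DescTools)

set.seed(1)
# Illustrative data only: a complete training set and a small test set
train <- data.frame(x = rnorm(50), y = rnorm(50), z = rnorm(50))
test <- train[1:10, ] + 0.1
test$y[c(2, 5)] <- NA                        # introduce two missing values

# Neighbours are searched in 'train' only; 'k' and 'meth' as documented
filled_test <- ImputeKnn(test, k = 5, meth = "weighAvg", distData = train)

anyNA(filled_test)                           # check that no NA remains
```

Leaving `distData` at its default of NULL would instead search for neighbours among the complete cases of `test` itself, which requires that `test` contains at least k such cases.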

Fill in NA values with the values of the nearest neighbours — ImputeKnn
