
DescTools (version 0.99.37)

DescTools-package: Tools for Descriptive Statistics and Exploratory Data Analysis

Description

DescTools is an extensive collection of miscellaneous basic statistics functions and convenience wrappers that are not available in base R, aimed at the efficient description of data. The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis: calculating descriptive statistics, drawing graphical summaries and reporting the results. Special attention was paid to the integration of various approaches to the calculation of confidence intervals. For most basic statistics functions, variants are included that allow the use of weights. The package furthermore contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel.

A considerable part of the included functions can be found scattered in other packages and other sources, written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE package instead of dozens (which themselves might depend on other packages that are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in the absence of convincing alternatives). The 'CamelStyle' was consistently applied to functions borrowed from contributed R packages as well.

Feedback, feature requests, bug reports and other suggestions are welcome! Please report problems on Stack Overflow using the tag [desctools] or directly to the maintainer.
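The Description mentions that several approaches to confidence intervals are offered side by side. As an illustration of what "various approaches" can mean for a binomial proportion, the following base R sketch computes a Wilson and a Wald interval. The helper binom_ci_sketch is hypothetical and written for this illustration only; it is not the package's implementation. In the package itself, BinomCI takes a method argument for choosing among such variants.

```r
# Two textbook confidence intervals for a binomial proportion (base R only).
# binom_ci_sketch is a hypothetical helper, not a DescTools function.
binom_ci_sketch <- function(x, n, conf.level = 0.95,
                            method = c("wilson", "wald")) {
  method <- match.arg(method)
  z <- qnorm(1 - (1 - conf.level) / 2)  # two-sided normal quantile
  p <- x / n                            # observed proportion
  if (method == "wald") {
    # Wald: symmetric around the observed proportion
    half <- z * sqrt(p * (1 - p) / n)
    ci <- c(p - half, p + half)
  } else {
    # Wilson: score interval, centred on a shrunken estimate
    denom  <- 1 + z^2 / n
    centre <- (p + z^2 / (2 * n)) / denom
    half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
    ci <- c(centre - half, centre + half)
  }
  c(est = p, lwr.ci = ci[1], upr.ci = ci[2])
}

wilson <- binom_ci_sketch(37, 100, method = "wilson")
wald   <- binom_ci_sketch(37, 100, method = "wald")
print(rbind(wilson, wald))
```

The two intervals differ noticeably for small n or proportions near 0 or 1, which is one reason to have the variants available behind a common interface.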

Warning

This package is still under development. Although the code now seems quite stable, until the release of version 1.0 (which is expected in, hmm, the near future?) you should be aware that everything in the package might be subject to change. Backward compatibility is not yet guaranteed. Functions may be deleted or renamed, and new syntax may be inconsistent with earlier versions. With the release of version 1.0 the "deprecated-defunct process" will be put in place.

MS-Office

To make use of the MS-Office features you must have Office installed in one of its variants. All Wrd*, XL* and Pp* functions also require the package RDCOMClient to be installed. Hence the use of these functions is restricted to Windows systems. RDCOMClient can be installed with:

install.packages("RDCOMClient", repos="http://www.omegahat.net/R")

The omegahat repository does not benefit from the same update service as CRAN, so you may be forced to install a package compiled for an earlier R version, which is usually not a problem. For example, for R 3.6.x or R 4.0, set url to the line that matches your R version:

# R 3.6.x: use the binary built for R 3.5.1 (uncomment to use)
# url <- "http://www.omegahat.net/R/bin/windows/contrib/3.5.1/RDCOMClient_0.93-0.zip"
# R 4.0:
url <- "http://www.omegahat.net/R/bin/windows/contrib/4.0/RDCOMClient_0.94-0.zip"
install.packages(url, repos=NULL, type="binary")

RDCOMClient does not exist for Mac or Linux, sorry.
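Because the Office functions depend on both the operating system and an optional package, scripts that should also run elsewhere may want to check these prerequisites first. The following is a minimal sketch using base R only; office_available is a hypothetical helper name, not part of DescTools, and it does not verify that Office itself is installed.

```r
# Check the two prerequisites that can be tested from R:
# a Windows system and an installed RDCOMClient package.
# office_available is a hypothetical helper, not a DescTools function.
office_available <- function() {
  on_windows <- identical(.Platform$OS.type, "windows")
  has_com    <- requireNamespace("RDCOMClient", quietly = TRUE)
  on_windows && has_com
}

if (office_available()) {
  message("RDCOMClient found: Wrd*, XL* and Pp* functions should be usable.")
} else {
  message("Office automation unavailable here; skip Wrd*, XL* and Pp* calls.")
}
```

Wrapping Office-related calls in such a guard keeps the rest of an analysis script portable to Mac and Linux.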

Details

A grouped list of the functions:

Operators, calculus, transformations:
%()% Between operators determine if a value lies within a range [a,b]
%)(% Outside operators: %)(%, %](%, %)[%, %][%
%nin% "not in" operator
%overlaps% Do two collections have common elements?
%like%, %like any% Simple operator to search for a specified pattern
%^% Powers of matrices
Interval The number of days of the overlapping part of two date periods
AUC Area under the curve
Primes Find all primes less than n
Factorize Prime factorization of integers
Divisors All divisors of an integer
GCD Greatest common divisor
LCM Least common multiple
Permn Determine all possible permutations of a set
Fibonacci Generates single Fibonacci numbers or a Fibonacci sequence
DigitSum Digit sum of a number
Frac Return the fractional part of a numeric value
Ndec Count decimal places of a number
MaxDigits Maximum used digits for a vector of numbers
Prec Precision of a number
BoxCox, BoxCoxInv Box Cox transformation and its inverse transformation
BoxCoxLambda Return the optimal lambda for a BoxCox transformation
LogSt, LogStInv Calculate started logarithmic transformation and its inverse
Logit, LogitInv Generalized logit and inverse logit function
LinScale Simple linear scaling of a vector x
Winsorize Data cleaning by winsorization
Trim Trim data by omitting outlying observations
CutQ Cut a numeric variable into quartiles or other quantiles
Recode Recode a factor with altered levels
Rename Change name(s) of a named object
Sort Sort extension for matrices and data.frames
SortMixed, OrderMixed Mixed sort order
DenseRank Calculate ranks in consecutive order (no ties)
PercentRank Calculate the percent rank
RoundTo Round to a multiple
Large, Small Returns the kth largest, resp. smallest values
HighLow Combines Large and Small.
Rev Reverses the order of rows and/or columns of a matrix or a data.frame
Untable Recreates original list based on an n-dimensional frequency table
CollapseTable Collapse some rows/columns in a table.
Dummy Generate dummy codes for a factor
FisherZ, FisherZInv Fisher's z-transformation and its inverse
Midx Calculate sequentially the midpoints of the elements of a vector
Unwhich Inverse function to which, create a logical vector/matrix from indices
Vigenere Implements a Vigenere cypher, both encryption and decryption
BinTree, PlotBinTree Create and plot a binary tree structure with a given length
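To make the semantics of the range and "not in" operators listed above concrete, here is a sketch of the equivalent base R expressions. The variable names are illustrative only. The optional last step assumes DescTools is installed and uses its closed-interval operator %[]% with a two-element range on the right-hand side.

```r
x <- c(1, 5, 7, 10, 12)

# Closed interval [5, 10]: both boundaries count as inside.
in_closed <- x >= 5 & x <= 10

# Open interval (5, 10): boundaries are excluded.
in_open <- x > 5 & x < 10

# "Not in": the negation of %in%.
not_in <- !(x %in% c(5, 12))

print(rbind(x, in_closed, in_open, not_in))

# Optional cross-check against the package operator, if it is installed.
if (requireNamespace("DescTools", quietly = TRUE)) {
  `%[]%` <- DescTools::`%[]%`
  print(x %[]% c(5, 10))
}
```

The operators mainly save typing and make the boundary handling visible in the operator name itself.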

Information and manipulation functions:
AllDuplicated Find all values involved in ties
Closest Return the value in a vector being closest to a given one
Coalesce Return the first value in a vector not being NA
ZeroIfNA, NAIfZero Replace NAs by 0, resp. vice versa
Impute Replace NAs by the median or another value
LOCF Imputation of datapoints following the "last observation carried forward" rule
CombN Returns the number of subsets out of a list of elements
CombSet Generates all possible subsets out of a list of elements
CombPairs Generates all pairs out of one or two sets of elements
SampleTwins Create sample using stratifying groups
RndPairs Create pairs of correlated random numbers
RndWord Produce random combinations of characters
IsNumeric Check a vector for being numeric, zero or a whole number
IsWhole Is x a whole number?
IsDichotomous Check if x contains exactly 2 values
IsOdd Is x even or odd?
IsPrime Is x a prime number?
IsZero Is numeric(x) == 0, say x < machine.eps?
IsEuclid Check if a distance matrix is euclidean
Label, Unit Get or set the label, resp. unit, attribute of an object
Abind Bind matrices to n-dimensional arrays
Append Append elements to several classes of objects
VecRot, VecShift Shift the elements of a vector in a circular mode to the right or to the left by n characters.
Clockwise Transform angles from counter clock into clockwise mode
split.formula A formula interface for the base function split
reorder.factor Reorder the levels of a factor
ToLong, ToWide Simple reshaping of a vector
SetNames Set the names, rownames or columnnames in an object and return it
Some Return some randomly chosen elements of an object
SplitAt Split a vector into several pieces at given positions
SplitToCol Splits the columns of a data frame using a split character
SplitPath Split a path string in drive, path, filename
Str Compactly display the structure of any R object
TextToTable Converts a string to a table

String functions:
StrCountW Count the words in a string
StrTrim Delete white spaces from a string
StrTrunc Truncate string to a given length and add an ellipsis if it really was truncated
StrLeft, StrRight Returns the left/right part of a string.
StrAlign Align strings to the left/right/center or to a given character
StrAbbr Abbreviates a string
StrCap Capitalize the first letter of a string
StrPad Fill a string with defined characters to fit a given length
StrRev Reverse a string
StrChop Split a string by a fixed number of characters.
StrExtract Extract a part of a string, defined as regular expression.
StrVal Extract numeric values from a string
StrIsNumeric Check whether a string contains only numeric data
StrPos Find position of first occurrence of a string in another one
StrDist Compute Levenshtein or Hamming distance between strings
FixToTable Create table out of a running text, by using columns of spaces as delimiter

Conversion functions:
AscToChar, CharToAsc Converts ASCII codes to characters and vice versa
DecToBin, BinToDec Converts numbers from binmode to decimal and vice versa
DecToHex, HexToDec Converts numbers from hexmode to decimal and vice versa
DecToOct, OctToDec Converts numbers from octmode to decimal and vice versa
DegToRad, RadToDeg Convert degrees to radians and vice versa
CartToPol, PolToCart Transform cartesian to polar coordinates and vice versa
CartToSph, SphToCart Transform cartesian to spherical coordinates and vice versa
RomanToInt Convert roman numerals to integers
RgbToLong, LongToRgb Convert a rgb color to a long number and vice versa
ColToGray, ColToGrey Convert colors to grey/grayscale
ColToHex, HexToCol Convert a color into hex string
HexToRgb Convert a hexnumber to an RGB-color
ColToHsv R color to HSV conversion
ColToRgb, RgbToCol Color to RGB conversion and back
ConvUnit Return the most common unit conversions
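As a reminder of what the angle and coordinate conversions in this group compute, the following base R sketch spells out the underlying formulas. The lower-case helper names are hypothetical stand-ins written for this illustration; they are not the package functions, whose argument names and return structure may differ.

```r
# Degrees <-> radians (hypothetical helpers, base R only).
deg_to_rad <- function(deg) deg * pi / 180
rad_to_deg <- function(rad) rad * 180 / pi

# Cartesian <-> polar coordinates.
# atan2() picks the correct quadrant for the angle.
cart_to_pol <- function(x, y) list(r = sqrt(x^2 + y^2), theta = atan2(y, x))
pol_to_cart <- function(r, theta) list(x = r * cos(theta), y = r * sin(theta))

p <- cart_to_pol(0, 2)          # point on the positive y axis
print(p$r)                      # radius
print(rad_to_deg(p$theta))      # angle in degrees
back <- pol_to_cart(p$r, p$theta)
print(back)                     # round trip, up to floating point error
```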

Colors:
SetAlpha Add transparency (alpha channel) to a color.
ColorLegend Add a color legend to a plot
FindColor Get color on a defined color range
MixColor Get the mix of two colors
TextContrastColor Choose textcolor depending on background color
Pal Some custom color palettes

Plots (low level):
Canvas Canvas for geometric plotting
Mar Set margins more comfortably.
Asp Return aspect ratio of the current plot
LineToUser Convert line coordinates to user coordinates
lines.loess Add a loess smoother and its CIs to an existing plot
lines.lm Add the prediction of linear model and its CIs to a plot
lines.smooth.spline Add the prediction of a smooth.spline and its CIs to a plot
BubbleLegend Add a legend for bubbles to a bubble plot
TitleRect Add a main title to a plot surrounded by a rectangular box
BarText Add the value labels to a barplot
ErrBars Add horizontal or vertical error bars to an existing plot
DrawArc, DrawRegPolygon Draw elliptic, circular arc(s) or regular polygon(s)
DrawCircle, DrawEllipse Draw a circle, a circle annulus or a sector or an annulus
DrawBezier Draw a Bezier curve
DrawBand Draw confidence band
BoxedText Add text surrounded by a box to a plot
Rotate Rotate a geometric structure
SpreadOut Spread out a vector of numbers so that there is a minimum interval between any two elements. This can be used to place text labels in a plot so that they do not overlap.
IdentifyA Helps identifying all the points in a specific area.
identify.formula Formula interface for identify.
PtInPoly Identify all the points within a polygon.
ConnLines Calculate and insert connecting lines in a barplot
AxisBreak Place a break mark on an axis
Shade Produce a shaded curve
Stamp Stamp the current plot with Date/Time/Directory or any other expression

Plots (high level):
PlotACF, PlotGACF Create a combined plot of a time series including its autocorrelation and partial autocorrelation
PlotMonth Plot seasonal effects of a univariate time series
PlotArea Create an area plot
PlotBag Create a two-dimensional boxplot
PlotBagPairs Produce pairwise 2-dimensional boxplots (bagplot)
PlotBubble Draw a bubble plot
PlotCandlestick Plot candlestick chart
PlotCirc Create a circular plot
PlotCorr Plot a correlation matrix
PlotDot Plot a dotchart with confidence intervals
PlotFaces Produce a plot of Chernoff faces
PlotFdist Frequency distribution plot, combination of histogram, boxplot and ecdf.plot
PlotMarDens Scatterplot with marginal densities
PlotMultiDens Plot multiple density curves
PlotPolar Plot values on a circular grid
PlotFun Plot mathematical expression or a function
PolarGrid Plot a grid in polar coordinates
PlotPyramid Pyramid plot (back-back histogram)
PlotTreemap Plot of a treemap.
PlotVenn Plot a Venn diagram
PlotViolin Plot violins instead of boxplots
PlotQQ QQ-plot for an optional distribution
PlotWeb Create a web plot
PlotTernary Create a triangle or ternary plot
PlotMiss Plot missing values
PlotECDF Plot empirical cumulative distribution function
PlotLinesA Plot the columns of one matrix against the columns of another
PlotLog Create a plot with logarithmic axis and log grid
PlotMosaic Plots a mosaic describing a contingency table in array form

Distributions:
_Benf Benford distribution, including qBenf, dBenf, rBenf
_ExtrVal Extreme value distribution (dExtrVal)
_Frechet Frechet distribution (dFrechet)
_GenExtrVal Generalized Extreme Value Distribution (dGenExtrVal)
_GenPareto Generalized Pareto Distribution (dGenPareto)
_Gompertz Gompertz distribution (dGompertz)
_Gumbel Gumbel distribution (dGumbel)
_NegWeibull Negative Weibull distribution (dNegWeibull)
_Order Distributions of Order Statistics (dOrder)
_RevGumbel Reverse Gumbel distribution (dRevGumbel)
_RevGumbelExp Exponential reverse Gumbel distribution (quantile only)
_RevWeibull Reverse Weibull distribution (dRevWeibull)

Statistics:
Freq Univariate frequency table
PercTable Bivariate percentage table
Margins (Extended) margin tables of a table
ExpFreq Expected frequencies of an n-dimensional table
Mode Mode, the most frequent value (including frequency)
Gmean, Gsd Geometric mean and geometric standard deviation
Hmean Harmonic Mean
Median Extended median function supporting weights and ordered factors
HuberM, TukeyBiweight Huber M-estimator of location and Tukey's biweight robust mean
HodgesLehmann the Hodges-Lehmann estimator
HoeffD Hoeffding's D statistic
MeanSE Standard error of mean
MeanCI, MedianCI Confidence interval for the mean and median
MeanDiffCI Confidence interval for the difference of two means
MoveAvg Moving average
MeanAD Mean absolute deviation
VarCI Confidence interval for the variance
CoefVar Coefficient of variation and its confidence interval
RobScale Robust data standardization
Range (Robust) range
BinomCI, MultinomCI Confidence intervals for binomial and multinomial proportions
BinomDiffCI Calculate confidence interval for a risk difference
BinomRatioCI Calculate confidence interval for the ratio of binomial proportions.
PoissonCI Confidence interval for a Poisson lambda
Skew, Kurt Skewness and kurtosis
YuleQ, YuleY Yule's Q and Yule's Y
TschuprowT Tschuprow's T
Phi, ContCoef, CramerV Phi, Pearson's Contingency Coefficient and Cramer's V
GoodmanKruskalGamma Goodman Kruskal's gamma
KendallTauA Kendall's tau-a
KendallTauB Kendall's tau-b
StuartTauC Stuart's tau-c
SomersDelta Somers' delta
Lambda Goodman Kruskal's lambda
GoodmanKruskalTau Goodman Kruskal's tau
UncertCoef Uncertainty coefficient
Entropy, MutInf Shannon's entropy, mutual information
DivCoef, DivCoefMax Rao's diversity coefficient ("quadratic entropy")
TheilU Theil's U1 and U2 coefficient
Assocs Combines the association measures above.
OddsRatio, RelRisk Odds ratio and relative risk
ORToRelRisk Transform odds ratio to relative risk
CohenKappa, KappaM Cohen's Kappa, weighted Kappa and Kappa for more than 2 raters
CronbachAlpha Cronbach's alpha
ICC Intraclass correlations
KrippAlpha Return Krippendorff's alpha coefficient
KendallW Compute the Kendall coefficient of concordance
Lc Calculate and plot Lorenz curve
Gini, Atkinson Gini- and Atkinson coefficient
Herfindahl, Rosenbluth Herfindahl- and Rosenbluth coefficient
GiniSimpson Compute Gini-Simpson Coefficient
CorCI Confidence interval for Pearson's correlation coefficient
CorPart Find the correlations for a set x of variables with set y removed
CorPolychor Polychoric correlation coefficient
SpearmanRho Spearman rank correlation and its confidence intervals
ConDisPairs Return concordant and discordant pairs of two vectors
FindCorr Determine highly correlated variables
CohenD Cohen's Effect Size
EtaSq Effect size calculations for ANOVAs
Contrasts Generate pairwise contrasts for using in a post-hoc test
Strata Stratified sampling with equal/unequal probabilities
Outlier Outliers following Tukey's boxplot definition
LOF Local outlier factor
BrierScore Brier score, assessing the quality of predictions of binary events
Cstat C statistic (equivalent to the area under the ROC curve)
CCC Lin's concordance correlation coefficient for agreement on a continuous measure
MAE Mean absolute error
MAPE, SMAPE Mean absolute and symmetric mean absolute percentage error
MSE, RMSE Mean squared error and root mean squared error
NMAE, NMSE Normalized mean absolute and mean squared error
Conf Confusion matrix, a cross-tabulation of observed and predicted classes with associated statistics
Sens, Spec Sensitivity and specificity
PseudoR2 Variants of pseudo R squared statistics: McFadden, Aldrich-Nelson, Nagelkerke, CoxSnell, Efron, McKelvey-Zavoina, Tjur
Mean, SD, Var, IQRw, Quantile, MAD, Cor Variants of base statistics that allow weights to be defined: mean, standard deviation, variance, quantile, mad, correlation
VIF, StdCoef Variance inflation factors and standardised coefficients for linear models

Tests:
SignTest Sign test to test whether two groups are equally sized
ZTest Z-test for known population variance
TTestA Student's t-test based on sample statistics
JonckheereTerpstraTest Jonckheere-Terpstra trend test for medians
PageTest Page test for ordered alternatives
CochranQTest Cochran's Q-test to find differences in matched sets of three or more frequencies or proportions.
VarTest ChiSquare test for one variance and F test for two variances
SiegelTukeyTest Siegel-Tukey test for equality in variability
SiegelTukeyRank Calculate Siegel-Tukey's ranks (auxiliary function)
LeveneTest Levene's test for homogeneity of variance
MosesTest Moses Test of extreme reactions
RunsTest Runs test for detecting non-randomness
DurbinWatsonTest Durbin-Watson test for autocorrelation
BartelsRankTest Bartels rank test for randomness
JarqueBeraTest Jarque-Bera Test for normality
AndersonDarlingTest Anderson-Darling test for normality
CramerVonMisesTest Cramer-von Mises test for normality
LillieTest Lilliefors (Kolmogorov-Smirnov) test for normality
PearsonTest Pearson chi-square test for normality
ShapiroFranciaTest Shapiro-Francia test for normality
MHChisqTest Mantel-Haenszel Chisquare test
StuartMaxwellTest Stuart-Maxwell marginal homogeneity test
LehmacherTest Lehmacher marginal homogeneity test
CochranArmitageTest Cochran-Armitage test for trend in binomial proportions
BreslowDayTest, WoolfTest Test for homogeneity on 2x2xk tables over strata
PostHocTest Post hoc tests by Scheffe, LSD, Tukey for an aov object
ScheffeTest Multiple comparisons Scheffe test
DunnTest Dunn's test of multiple comparisons
DunnettTest Dunnett's test of multiple comparisons
ConoverTest Conover's test of multiple comparisons (following a kruskal test)
NemenyiTest Nemenyi's test of multiple comparisons
HotellingsT2Test Hotelling's T2 test for the one and two sample case
YuenTTest Yuen's robust t-Test with trimmed means and winsorized variances
BarnardTest Barnard's test for 2x2 tables
BreuschGodfreyTest Breusch-Godfrey test for higher-order serial correlation.
GTest Chi-squared contingency table test and goodness-of-fit test
HosmerLemeshowTest Hosmer-Lemeshow goodness of fit tests
VonNeumannTest Von Neumann's successive difference test

Date functions:
day.name, day.abb Defined names of the days
AddMonths, AddMonthsYM Add a number of months to a given date
IsDate Check whether x is a date object
IsWeekend Check whether x falls on a weekend
IsLeapYear Check whether x is a leap year
LastDayOfMonth Return the last day of the month of the date x
DiffDays360 Calculate the difference of two dates using the 360-days system
Date Create a date from numeric representation of year, month, day
Day, Month, Year Extract part of a date
Hour, Minute, Second Extract part of time
Week, Weekday Returns ISO week and weekday of a date
Quarter Quarter of a date
Timezone Timezone of a POSIXct/POSIXlt date
YearDay, YearMonth The day in the year of a date
Now, Today Get current date or date-time
HmsToSec, SecToHms Convert h:m:s times to seconds and vice versa
Overlap Determine if and how extensively two date ranges overlap
Zodiac The zodiac sign of a date :-)

Finance functions:
OPR One period returns (simple and log returns)
NPV Net present value
NPVFixBond Net present value for fix bonds
IRR Internal rate of return
YTM Return yield to maturity for a bond
SLN, DB, SYD Several methods of depreciation of an asset

GUI-Helpers:
PasswordDlg Display a dialog containing an edit field, showing only ***.

Reporting, InOut:
CatTable Print a table with the option to have controlled linebreaks
Format, Fmt Easy format for numbers and dates
Desc Produce a rich description of an object
Abstract Display compact overview of the structure of a data frame
TMod Create comparison table for (general) linear models
TOne Create "Table One" describing baseline characteristics
GetNewWrd, GetNewXL, GetNewPP Create a new Word, Excel or PowerPoint Instance
GetCurrWrd, GetCurrXL, GetCurrPP Get a handle to a running Word, Excel or PowerPoint instance
WrdKill, XLKill Ends a (possibly hidden) Word/Excel process
IsValidHwnd Check if the handle to a MS Office application is valid or outdated
WrdCaption Insert a title in Word
WrdFont Get and set the font for the current selection in Word
WrdParagraphFormat Get and set the paragraph format
WrdTable Create a table in Word
WrdCellRange Select a cell range of a table in Word
WrdMergeCells Merge cells of a table in Word
WrdFormatCells Format selected cells of a table in Word
WrdTableBorders Set or edit table border style of a table in Word
ToWrd, ToXL More flexible wrapper to send diverse objects to Word, resp. Excel
WrdPlot Insert the active plot to Word
WrdInsertBookmark Insert a new bookmark in a Word document
WrdDeleteBookmark Delete an existing bookmark in a Word document
WrdGoto Place cursor to a specific bookmark, or another text position.
WrdUpdateBookmark Update the text of a bookmark's range
WrdSaveAs Saves documents in Word
WrdStyle Get and set the style of a paragraph in Word
XLDateToPOSIXct Convert XL-Date format to POSIXct format
XLGetRange Get the values of one or several cell range(s) in Excel
XLGetWorkbook Get the values of all sheets of an Excel workbook
XLView Use Excel as viewer for a data.frame
PpPlot Insert active plot to PowerPoint
PpAddSlide Adds a slide to a PowerPoint presentation
PpText Adds a textbox with text to a PP-presentation
ParseSASDatalines Parse a SAS "datalines" statement to read data

Tools:
PairApply Helper for calculating functions pairwise
LsFct, LsObj List the functions (or the data, all objects) of a package
FctArgs Retrieve the arguments of a function
InDots Check if an argument is contained in ... argument and return its value
ParseFormula Parse a formula and return the split parts of it
Recycle Recycle a list of elements to the maximal found dimension
Keywords Get the keywords of a man page
SysInfo Get some more information about system and environment
DescToolsOptions Get the DescTools specific options
PDFManual Get the pdf-manual of any package on CRAN and open it

Data:
d.pizza Synthetic dataset created for testing the description
d.whisky Dataset of Scotch Single Malts

Reference Data:
d.units, d.prefix Unit conversion factors and metric prefixes
d.periodic Periodic table of elements
d.countries ISO 3166-1 country codes
roulette, cards, tarot Datasets for probabilistic simulation

Examples

# NOT RUN {
# ******************************************************
# There are no examples defined here. But see the demos:
#
# demo(describe)
# demo(plots)
#
# ******************************************************
# }
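Since the help page itself defines no examples, the following is a small, hedged sketch of a first descriptive pass. It computes a geometric mean and a t-based confidence interval for the mean in base R, and only calls the corresponding package functions Gmean and MeanCI if DescTools is installed. The data vector and the variable names are made up for illustration.

```r
# Made-up sample data for illustration.
x <- c(2, 4, 4, 5, 7, 9)

# Geometric mean in base R: exponentiated mean of the logs.
gm_base <- exp(mean(log(x)))

# Classic t-based 95% confidence interval for the mean in base R.
n <- length(x)
ci_base <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * sd(x) / sqrt(n)

print(gm_base)
print(ci_base)

# Optional: the package equivalents, only if DescTools is available.
if (requireNamespace("DescTools", quietly = TRUE)) {
  print(DescTools::Gmean(x))
  print(DescTools::MeanCI(x, conf.level = 0.95))
}
```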
