DescTools is an extensive collection of miscellaneous basic statistics functions and convenience wrappers that are not available in base R, aimed at efficient description of data.
The author's intention was to create a toolbox that facilitates the (notoriously time-consuming) first descriptive tasks in data analysis: calculating descriptive statistics, drawing graphical summaries and reporting the results. The package also contains functions to produce documents using MS Word (or PowerPoint) and functions to import data from Excel.
A considerable part of the included functions can be found scattered in other packages and other sources, written partly by Titans of R. The reason for collecting them here was primarily to have them consolidated in ONE package instead of dozens (which themselves might depend on other packages that are not needed at all), and to provide a common and consistent interface as far as function and argument naming, NA handling, recycling rules etc. are concerned. Google style guides were used as naming rules (in the absence of convincing alternatives). The 'CamelStyle' was consequently also applied to functions borrowed from contributed R packages.
Feedback, feature requests, bug reports and other suggestions are welcome! Please report problems on Stack Overflow using the tag [desctools] or directly to the maintainer.
This package is still under development. Although the code now seems quite stable, until the release of version 1.0 (which is expected in hmm: near future?) you should be aware that everything in the package might be subject to change. Backward compatibility is not yet guaranteed. Functions may be deleted or renamed, and new syntax may be inconsistent with earlier versions. With the release of version 1.0 the "deprecated-defunct process" will be put in place.
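Because backward compatibility is not guaranteed, a script may want to note which version it was written against. The following is only a sketch using base R: the helper name VersionMatches and the version string in the usage note are hypothetical placeholders, not part of DescTools.

```r
# Hypothetical helper (not part of DescTools): compares the installed
# version of a package with the version a script was tested against.
# Returns NA if the package is not installed.
VersionMatches <- function(pkg, tested) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(NA)
  utils::packageVersion(pkg) == tested
}

# Self-check with a package that ships with every R installation
stats_version <- as.character(utils::packageVersion("stats"))
VersionMatches("stats", stats_version)
```

A call such as VersionMatches("DescTools", "0.99.1") would follow the same pattern; the version shown there is a placeholder to be replaced by the one actually tested.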
To make use of the MS Office features, you must have Office installed in one of its variants.
All Wrd*, XL* and Pp* functions also require the package RDCOMClient to be installed. Hence the use of these functions is restricted to Windows systems.
RDCOMClient can be installed with:
install.packages("RDCOMClient", repos="")
The omegahat repository does not benefit from the same update service as CRAN, so you may be forced to install a package compiled with an earlier version of R, which usually is not a problem. For R 3.6.x, use e.g.:
url <- ""
install.packages(url, repos=NULL, type="binary")
RDCOMClient does not exist for Mac or Linux, sorry.
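Scripts that should also run on other platforms can check these requirements before calling any Wrd*, XL* or Pp* function. The following is a minimal sketch in base R; the helper name CanUseOffice is hypothetical and not part of DescTools, and it does not verify that Office itself is installed.

```r
# Hypothetical helper (not part of DescTools): TRUE only on Windows
# with the RDCOMClient package available. It cannot detect whether
# MS Office itself is installed.
CanUseOffice <- function() {
  is_windows <- identical(.Platform$OS.type, "windows")
  has_rdcom  <- requireNamespace("RDCOMClient", quietly = TRUE)
  is_windows && has_rdcom
}

if (CanUseOffice()) {
  message("RDCOMClient found; the Wrd*, XL* and Pp* functions may be usable.")
} else {
  message("Skipping MS Office output: Windows with RDCOMClient is required.")
}
```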
A grouped list of the functions:
Operators, calculus, transformations: | |
%()% | Between operators determine if a value lies within a range [a,b] |
%)(% | Outside operators: %)(%, %](%, %)[%, %][% |
%nin% | "not in" operator |
%overlaps% | Do two collections have common elements? |
%like%, %like any% | Simple operator to search for a specified pattern |
%^% | Powers of matrices |
Interval | The number of days of the overlapping part of two date periods |
AUC | Area under the curve |
Primes | Find all primes less than n |
Factorize | Prime factorization of integers |
Divisors | All divisors of an integer |
GCD | Greatest common divisor |
LCM | Least common multiple |
Permn | Determine all possible permutations of a set |
Fibonacci | Generates single Fibonacci numbers or a Fibonacci sequence |
DigitSum | Digit sum of a number |
Frac | Return the fractional part of a numeric value |
Ndec | Count decimal places of a number |
MaxDigits | Maximum used digits for a vector of numbers |
Prec | Precision of a number |
BoxCox, BoxCoxInv | Box Cox transformation and its inverse transformation |
BoxCoxLambda | Return the optimal lambda for a BoxCox transformation |
LogSt, LogStInv | Calculate started logarithmic transformation and its inverse |
Logit, LogitInv | Generalized logit and inverse logit function |
LinScale | Simple linear scaling of a vector x |
Winsorize | Data cleaning by winsorization |
Trim | Trim data by omitting outlying observations |
CutQ | Cut a numeric variable into quartiles or other quantiles |
Recode | Recode a factor with altered levels |
Rename | Change name(s) of a named object |
Sort | Sort extension for matrices and data.frames |
SortMixed, OrderMixed | Mixed sort order |
DenseRank | Calculate ranks in consecutive order (no ties) |
PercentRank | Calculate the percent rank |
RoundTo | Round to a multiple |
Large, Small | Returns the kth largest, resp. smallest values |
HighLow | Combines Large and Small. |
Rev | Reverses the order of rows and/or columns of a matrix or a data.frame |
Untable | Recreates original list based on an n-dimensional frequency table |
CollapseTable | Collapse some rows/columns in a table. |
Dummy | Generate dummy codes for a factor |
FisherZ, FisherZInv | Fisher's z-transformation and its inverse |
Midx | Calculate sequentially the midpoints of the elements of a vector |
Unwhich | Inverse function to which, creates a logical vector/matrix from indices |
Vigenere | Implements a Vigenere cypher, both encryption and decryption |
BinTree, PlotBinTree | Create and plot a binary tree structure with a given length |
Information and manipulation functions: | |
AllDuplicated | Find all values involved in ties |
Closest | Return the value in a vector being closest to a given one |
Coalesce | Return the first value in a vector not being NA |
ZeroIfNA, NAIfZero | Replace NAs by 0, resp. vice versa |
Impute | Replace NAs by the median or another value |
LOCF | Imputation of datapoints following the "last observation carried forward" rule |
CombN | Returns the number of subsets out of a list of elements |
CombSet | Generates all possible subsets out of a list of elements |
CombPairs | Generates all pairs out of one or two sets of elements |
SampleTwins | Create sample using stratifying groups |
RndPairs | Create pairs of correlated random numbers |
RndWord | Produce random combinations of characters |
IsNumeric | Check a vector for being numeric, zero or a whole number |
IsWhole | Is x a whole number? |
IsDichotomous | Check if x contains exactly 2 values |
IsOdd | Is x even or odd? |
IsPrime | Is x a prime number? |
IsZero | Is numeric(x) == 0, say x < machine.eps? |
IsEuclid | Check if a distance matrix is euclidean |
Label, Unit | Get or set the label, resp. unit, attribute of an object |
Abind | Bind matrices to n-dimensional arrays |
Append | Append elements to several classes of objects |
VecRot, VecShift | Shift the elements of a vector in a circular mode to the right or to the left by n characters. |
Clockwise | Transform angles from counter clock into clockwise mode |
split.formula | A formula interface for the base function split |
reorder.factor | Reorder the levels of a factor |
ToLong, ToWide | Simple reshaping of a vector |
SetNames | Set the names, rownames or columnnames in an object and return it |
Some | Return some randomly chosen elements of an object |
SplitAt | Split a vector into several pieces at given positions |
SplitPath | Split a path string in drive, path, filename |
Str | Compactly display the structure of any R object |
TextToTable | Converts a string to a table |
String functions: | |
StrCountW | Count the words in a string |
StrTrim | Delete white spaces from a string |
StrTrunc | Truncate string on a given length and add ellipses if it really was truncated |
StrLeft, StrRight | Returns the left/right part of a string. |
StrAlign | Align strings to the left/right/center or to a given character |
StrAbbr | Abbreviates a string |
StrCap | Capitalize the first letter of a string |
StrPad | Fill a string with defined characters to fit a given length |
StrRev | Reverse a string |
StrChop | Split a string by a fixed number of characters. |
StrExtract | Extract a part of a string, defined as regular expression. |
StrVal | Extract numeric values from a string |
StrIsNumeric | Check whether a string contains only numeric data |
StrPos | Find position of first occurrence of a string in another one |
StrDist | Compute Levenshtein or Hamming distance between strings |
FixToTable | Create table out of a running text, by using columns of spaces as delimiter |
Conversion functions: | |
AscToChar, CharToAsc | Converts ASCII codes to characters and vice versa |
DecToBin, BinToDec | Converts numbers from binmode to decimal and vice versa |
DecToHex, HexToDec | Converts numbers from hexmode to decimal and vice versa |
DecToOct, OctToDec | Converts numbers from octmode to decimal and vice versa |
DegToRad, RadToDeg | Convert degrees to radians and vice versa |
CartToPol, PolToCart | Transform cartesian to polar coordinates and vice versa |
CartToSph, SphToCart | Transform cartesian to spherical coordinates and vice versa |
RomanToInt | Convert roman numerals to integers |
RgbToLong, LongToRgb | Convert a rgb color to a long number and vice versa |
ColToGray, ColToGrey | Convert colors to grey/grayscale |
ColToHex, HexToCol | Convert a color into hex string |
HexToRgb | Convert a hexnumber to an RGB-color |
ColToHsv | R color to HSV conversion |
ColToRgb, RgbToCol | Color to RGB conversion and back |
ConvUnit | Return the most common unit conversions |
Colors: | |
SetAlpha | Add transparency (alpha channel) to a color. |
ColorLegend | Add a color legend to a plot |
FindColor | Get color on a defined color range |
MixColor | Get the mix of two colors |
TextContrastColor | Choose textcolor depending on background color |
Pal | Some custom color palettes |
Plots (low level): | |
Canvas | Canvas for geometric plotting |
Mar | Set margins more comfortably. |
Asp | Return aspect ratio of the current plot |
LineToUser | Convert line coordinates to user coordinates |
lines.loess | Add a loess smoother and its CIs to an existing plot |
lines.lm | Add the prediction of linear model and its CIs to a plot |
lines.smooth.spline | Add the prediction of a smooth.spline and its CIs to a plot |
BubbleLegend | Add a legend for bubbles to a bubble plot |
TitleRect | Add a main title to a plot surrounded by a rectangular box |
BarText | Add the value labels to a barplot |
ErrBars | Add horizontal or vertical error bars to an existing plot |
DrawArc, DrawRegPolygon | Draw elliptic, circular arc(s) or regular polygon(s) |
DrawCircle, DrawEllipse | Draw a circle, a circle annulus or a sector or an annulus |
DrawBezier | Draw a Bezier curve |
DrawBand | Draw confidence band |
BoxedText | Add text surrounded by a box to a plot |
Rotate | Rotate a geometric structure |
SpreadOut | Spread out a vector of numbers so that there is a minimum interval between any two elements. This can be used to place text labels in a plot so that they do not overlap. |
IdentifyA | Helps identifying all the points in a specific area. |
identify.formula | Formula interface for identify. |
PtInPoly | Identify all the points within a polygon. |
ConnLines | Calculate and insert connecting lines in a barplot |
AxisBreak | Place a break mark on an axis |
Shade | Produce a shaded curve |
Stamp | Stamp the current plot with Date/Time/Directory or any other expression |
Plots (high level): | |
PlotACF, PlotGACF | Create a combined plot of a time series including its autocorrelation and partial autocorrelation |
PlotMonth | Plot seasonal effects of a univariate time series |
PlotArea | Create an area plot |
PlotBag | Create a two-dimensional boxplot |
PlotBagPairs | Produce pairwise 2-dimensional boxplots (bagplot) |
PlotBubble | Draw a bubble plot |
PlotCandlestick | Plot candlestick chart |
PlotCirc | Create a circular plot |
PlotCorr | Plot a correlation matrix |
PlotDot | Plot a dotchart with confidence intervals |
PlotFaces | Produce a plot of Chernoff faces |
PlotFdist | Frequency distribution plot, combination of histogram, boxplot and ecdf.plot |
PlotMarDens | Scatterplot with marginal densities |
PlotMultiDens | Plot multiple density curves |
PlotPolar | Plot values on a circular grid |
PlotFun | Plot mathematical expression or a function |
PolarGrid | Plot a grid in polar coordinates |
PlotPyramid | Pyramid plot (back-back histogram) |
PlotTreemap | Plot of a treemap. |
PlotVenn | Plot a Venn diagram |
PlotViolin | Plot violins instead of boxplots |
PlotQQ | QQ-plot for an optional distribution |
PlotWeb | Create a web plot |
PlotTernary | Create a triangle or ternary plot |
PlotMiss | Plot missing values |
PlotECDF | Plot empirical cumulative distribution function |
PlotLinesA | Plot the columns of one matrix against the columns of another |
PlotLog | Create a plot with logarithmic axis and log grid |
PlotMosaic | Plots a mosaic describing a contingency table in array form |
Distributions: | |
_Benf | Benford distribution, including qBenf, dBenf, rBenf |
_ExtrVal | Extreme value distribution (dExtrVal) |
_Frechet | Frechet distribution (dFrechet) |
_GenExtrVal | Generalized Extreme Value Distribution (dGenExtrVal) |
_GenPareto | Generalized Pareto Distribution (dGenPareto) |
_Gompertz | Gompertz distribution (dGompertz) |
_Gumbel | Gumbel distribution (dGumbel) |
_NegWeibull | Negative Weibull distribution (dNegWeibull) |
_Order | Distributions of Order Statistics (dOrder) |
_RevGumbel | Reverse Gumbel distribution (dRevGumbel) |
_RevGumbelExp | Exponential reverse Gumbel distribution (quantile only) |
_RevWeibull | Reverse Weibull distribution (dRevWeibull) |
Statistics: | |
Freq | Univariate frequency table |
PercTable | Bivariate percentage table |
Margins | (Extended) margin tables of a table |
ExpFreq | Expected frequencies of an n-dimensional table |
Mode | Mode, the most frequent value (including frequency) |
Gmean, Gsd | Geometric mean and geometric standard deviation |
Hmean | Harmonic Mean |
Median | Extended median function supporting weights and ordered factors |
HuberM, TukeyBiweight | Huber M-estimator of location and Tukey's biweight robust mean |
HodgesLehmann | the Hodges-Lehmann estimator |
HoeffD | Hoeffding's D statistic |
MeanSE | Standard error of mean |
MeanCI, MedianCI | Confidence interval for the mean and median |
MeanDiffCI | Confidence interval for the difference of two means |
MoveAvg | Moving average |
MeanAD | Mean absolute deviation |
VarCI | Confidence interval for the variance |
CoefVar | Coefficient of variation and its confidence interval |
RobScale | Robust data standardization |
Range | (Robust) range |
BinomCI, MultinomCI | Confidence intervals for binomial and multinomial proportions |
BinomDiffCI | Calculate confidence interval for a risk difference |
BinomRatioCI | Calculate confidence interval for the ratio of binomial proportions. |
PoissonCI | Confidence interval for a Poisson lambda |
Skew, Kurt | Skewness and kurtosis |
YuleQ, YuleY | Yule's Q and Yule's Y |
TschuprowT | Tschuprow's T |
Phi, ContCoef, CramerV | Phi, Pearson's Contingency Coefficient and Cramer's V |
GoodmanKruskalGamma | Goodman Kruskal's gamma |
KendallTauA | Kendall's tau-a |
KendallTauB | Kendall's tau-b |
StuartTauC | Stuart's tau-c |
SomersDelta | Somers' delta |
Lambda | Goodman Kruskal's lambda |
GoodmanKruskalTau | Goodman Kruskal's tau |
UncertCoef | Uncertainty coefficient |
Entropy, MutInf | Shannon's entropy, mutual information |
DivCoef, DivCoefMax | Rao's diversity coefficient ("quadratic entropy") |
TheilU | Theil's U1 and U2 coefficient |
Assocs | Combines the association measures above. |
OddsRatio, RelRisk | Odds ratio and relative risk |
ORToRelRisk | Transform odds ratio to relative risk |
CohenKappa, KappaM | Cohen's Kappa, weighted Kappa and Kappa for more than 2 raters |
CronbachAlpha | Cronbach's alpha |
ICC | Intraclass correlations |
KrippAlpha | Return Krippendorff's alpha coefficient |
KendallW | Compute the Kendall coefficient of concordance |
Lc | Calculate and plot Lorenz curve |
Gini, Atkinson | Gini and Atkinson coefficients |
Herfindahl, Rosenbluth | Herfindahl and Rosenbluth coefficients |
GiniSimpson | Compute Gini-Simpson Coefficient |
CorCI | Confidence interval for Pearson's correlation coefficient |
CorPart | Find the correlations for a set x of variables with set y removed |
CorPolychor | Polychoric correlation coefficient |
SpearmanRho | Spearman rank correlation and its confidence intervals |
ConDisPairs | Return concordant and discordant pairs of two vectors |
FindCorr | Determine highly correlated variables |
CohenD | Cohen's Effect Size |
EtaSq | Effect size calculations for ANOVAs |
Contrasts | Generate pairwise contrasts for using in a post-hoc test |
Strata | Stratified sampling with equal/unequal probabilities |
Outlier | Outliers following Tukey's boxplot definition |
LOF | Local outlier factor |
BrierScore | Brier score, assessing the quality of predictions of binary events |
Cstat | C statistic (equivalent to the area under the ROC curve) |
CCC | Lin's concordance correlation coef for agreement on a continuous measure |
MAE | Mean absolute error |
MAPE, SMAPE | Mean absolute and symmetric mean absolute percentage error |
MSE, RMSE | Mean squared error and root mean squared error |
NMAE, NMSE | Normalized mean absolute and mean squared error |
Conf | Confusion matrix, a cross-tabulation of observed and predicted classes with associated statistics |
Sens, Spec | Sensitivity and specificity |
PseudoR2 | Variants of pseudo R squared statistics: McFadden, Aldrich-Nelson, Nagelkerke, CoxSnell, Efron, McKelvey-Zavoina, Tjur |
Mean, SD, Var, IQRw, Quantile, MAD, Cor | Variants of base statistics that allow weights to be defined: mean, standard deviation, variance, quantile, mad, correlation |
VIF, StdCoef | Variance inflation factors and standardised coefficients for linear models |
Tests: | |
SignTest | Sign test to test whether two groups are equally sized |
ZTest | Z-test for known population variance |
TTestA | Student's t-test based on sample statistics |
JonckheereTerpstraTest | Jonckheere-Terpstra trend test for medians |
PageTest | Page test for ordered alternatives |
CochranQTest | Cochran's Q-test to find differences in matched sets of three or more frequencies or proportions. |
VarTest | ChiSquare test for one variance and F test for two variances |
SiegelTukeyTest | Siegel-Tukey test for equality in variability |
SiegelTukeyRank | Calculate Siegel-Tukey's ranks (auxiliary function) |
LeveneTest | Levene's test for homogeneity of variance |
MosesTest | Moses Test of extreme reactions |
RunsTest | Runs test for detecting non-randomness |
DurbinWatsonTest | Durbin-Watson test for autocorrelation |
BartelsRankTest | Bartels rank test for randomness |
JarqueBeraTest | Jarque-Bera Test for normality |
AndersonDarlingTest | Anderson-Darling test for normality |
CramerVonMisesTest | Cramer-von Mises test for normality |
LillieTest | Lilliefors (Kolmogorov-Smirnov) test for normality |
PearsonTest | Pearson chi-square test for normality |
ShapiroFranciaTest | Shapiro-Francia test for normality |
MHChisqTest | Mantel-Haenszel Chisquare test |
StuartMaxwellTest | Stuart-Maxwell marginal homogeneity test |
LehmacherTest | Lehmacher marginal homogeneity test |
CochranArmitageTest | Cochran-Armitage test for trend in binomial proportions |
BreslowDayTest, WoolfTest | Test for homogeneity on 2x2xk tables over strata |
PostHocTest | Post hoc tests by Scheffe, LSD, Tukey for an aov object |
ScheffeTest | Multiple comparisons Scheffe test |
DunnTest | Dunn's test of multiple comparisons |
DunnettTest | Dunnett's test of multiple comparisons |
ConoverTest | Conover's test of multiple comparisons (following a kruskal test) |
NemenyiTest | Nemenyi's test of multiple comparisons |
HotellingsT2Test | Hotelling's T2 test for the one and two sample case |
YuenTTest | Yuen's robust t-Test with trimmed means and winsorized variances |
BarnardTest | Barnard's test for 2x2 tables |
BreuschGodfreyTest | Breusch-Godfrey test for higher-order serial correlation. |
GTest | Chi-squared contingency table test and goodness-of-fit test |
HosmerLemeshowTest | Hosmer-Lemeshow goodness of fit tests |
VonNeumannTest | Von Neumann's successive difference test |
Date functions: | |
, | Defined names of the days |
AddMonths, AddMonthsYM | Add a number of months to a given date |
IsDate | Check whether x is a date object |
IsWeekend | Check whether x falls on a weekend |
IsLeapYear | Check whether x is a leap year |
LastDayOfMonth | Return the last day of the month of the date x |
DiffDays360 | Calculate the difference of two dates using the 360-days system |
Date | Create a date from numeric representation of year, month, day |
Day, Month, Year | Extract part of a date |
Hour, Minute, Second | Extract part of time |
Week, Weekday | Returns ISO week and weekday of a date |
Quarter | Quarter of a date |
Timezone | Timezone of a POSIXct/POSIXlt date |
YearDay, YearMonth | The day in the year of a date |
Now, Today | Get current date or date-time |
HmsToSec, SecToHms | Convert h:m:s times to seconds and vice versa |
Overlap | Determine if and how extensively two date ranges overlap |
Zodiac | The zodiac sign of a date :-) |
Finance functions: | |
OPR | One period returns (simple and log returns) |
NPV | Net present value |
NPVFixBond | Net present value for fix bonds |
IRR | Internal rate of return |
YTM | Return yield to maturity for a bond |
SLN, DB, SYD | Several methods of depreciation of an asset |
GUI-Helpers: | |
PasswordDlg | Display a dialog containing an edit field, showing only ***. |
Reporting, InOut: | |
CatTable | Print a table with the option to have controlled linebreaks |
Format, Fmt | Easy format for numbers and dates |
Desc | Produce a rich description of an object |
Abstract | Display compact overview of the structure of a data frame |
TMod | Create comparison table for (general) linear models |
TOne | Create "Table One" describing baseline characteristics |
GetNewWrd, GetNewXL, GetNewPP | Create a new Word, Excel or PowerPoint Instance |
GetCurrWrd, GetCurrXL, GetCurrPP | Get a handle to a running Word, Excel or PowerPoint instance |
WrdKill, XLKill | Ends a (possibly hidden) Word/Excel process |
IsValidHwnd | Check if the handle to a MS Office application is valid or outdated |
WrdCaption | Insert a title in Word |
WrdFont | Get and set the font for the current selection in Word |
WrdParagraphFormat | Get and set the paragraph format |
WrdTable | Create a table in Word |
WrdCellRange | Select a cell range of a table in Word |
WrdMergeCells | Merge cells of a table in Word |
WrdFormatCells | Format selected cells of a table in Word |
WrdTableBorders | Set or edit table border style of a table in Word |
ToWrd, ToXL | More flexible wrapper to send diverse objects to Word, resp. Excel |
WrdPlot | Insert the active plot to Word |
WrdInsertBookmark | Insert a new bookmark in a Word document |
WrdGoto | Place cursor to a specific bookmark, or another text position. |
WrdUpdateBookmark | Update the text of a bookmark's range |
WrdSaveAs | Saves documents in Word |
WrdStyle | Get and set the style of a paragraph in Word |
XLDateToPOSIXct | Convert XL-Date format to POSIXct format |
XLGetRange | Get the values of one or several cell range(s) in Excel |
XLGetWorkbook | Get the values of all sheets of an Excel workbook |
XLView | Use Excel as viewer for a data.frame |
PpPlot | Insert active plot to PowerPoint |
PpAddSlide | Adds a slide to a PowerPoint presentation |
PpText | Adds a textbox with text to a PP-presentation |
ParseSASDatalines | Parse a SAS "datalines" statement to read data |
Tools: | |
PairApply | Helper for calculating functions pairwise |
LsFct, LsObj | List the functions (or the data, all objects) of a package |
FctArgs | Retrieve the arguments of a function |
InDots | Check if an argument is contained in ... argument and return its value |
ParseFormula | Parse a formula and return the split parts of it |
Recycle | Recycle a list of elements to the maximal found dimension |
Keywords | Get the keywords of a man page |
SysInfo | Get some more information about system and environment |
DescToolsOptions | Get the DescTools specific options |
PDFManual | Get the pdf-manual of any package on CRAN and open it |
Data: | |
 | Synthetic dataset created for testing the description |
d.whisky | of Scotch Single Malts |
Reference Data: | |
d.units, d.prefix | Unit conversion factors and metric prefixes |
d.periodic | Periodic table of elements |
d.countries | ISO 3166-1 country codes |
roulette, cards, tarot | Datasets for probabilistic simulation |
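A few entries from the list above can be tried out as follows. This is a minimal sketch that assumes DescTools is installed and that the functions accept the arguments shown (in particular na.rm); the input vector is made up for illustration. A base R value for the geometric mean is computed alongside for comparison.

```r
# Illustrative data (made up for this sketch)
x <- c(2, 4, 4, 8, 16, NA)

# Base R reference value for the geometric mean, ignoring the NA
gm_base <- exp(mean(log(x[!is.na(x)])))

# The DescTools calls run only if the package is installed
if (requireNamespace("DescTools", quietly = TRUE)) {
  library(DescTools)

  not_in  <- 3 %nin% x                 # "not in" operator
  gm      <- Gmean(x, na.rm = TRUE)    # geometric mean
  ci      <- MeanCI(x, na.rm = TRUE)   # mean with confidence interval
  gcd_val <- GCD(12, 18)               # greatest common divisor
  lcm_val <- LCM(4, 6)                 # least common multiple
  prime7  <- IsPrime(7)                # primality check

  print(Freq(c("a", "b", "a")))        # univariate frequency table
}
```

Wrapping the calls in requireNamespace() keeps the script from failing on machines where the package is missing.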
# ******************************************************
# There are no examples defined here. But see the demos:
# demo(describe)
# demo(plots)
# ******************************************************
