descr: descr (or descriptives)

Description

This function computes a range of descriptive statistics for your data, similar to the output of SPSS's DESCRIPTIVES command (often abbreviated to DESCR).

Usage

descr(
  x,
  items = names(x),
  varLabels = NULL,
  mean = TRUE,
  meanCI = TRUE,
  median = TRUE,
  mode = TRUE,
  var = TRUE,
  sd = TRUE,
  se = FALSE,
  min = TRUE,
  max = TRUE,
  q1 = FALSE,
  q3 = FALSE,
  IQR = FALSE,
  skewness = TRUE,
  kurtosis = TRUE,
  dip = TRUE,
  totalN = TRUE,
  missingN = TRUE,
  validN = TRUE,
  histogram = FALSE,
  boxplot = FALSE,
  digits = 2,
  errorOnFactor = FALSE,
  convertFactor = FALSE,
  maxModes = 1,
  maxPlotCols = 4,
  t = FALSE,
  headingLevel = 3,
  conf.level = 0.95,
  quantileType = 2
)
rosettaDescr_partial(
  x,
  digits = attr(x, "digits"),
  show = attr(x, "show"),
  headingLevel = attr(x, "headingLevel"),
  maxPlotCols = attr(x, "maxPlotCols"),
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)
# S3 method for rosettaDescr
knit_print(
  x,
  digits = attr(x, "digits"),
  show = attr(x, "show"),
  headingLevel = attr(x, "headingLevel"),
  maxPlotCols = attr(x, "maxPlotCols"),
  echoPartial = FALSE,
  partialFile = NULL,
  quiet = TRUE,
  ...
)
# S3 method for rosettaDescr
print(
  x,
  digits = attr(x, "digits"),
  show = attr(x, "show"),
  maxPlotCols = attr(x, "maxPlotCols"),
  headingLevel = attr(x, "headingLevel"),
  forceKnitrOutput = FALSE,
  ...
)

Value

A list of dataframes with the requested values.

Arguments

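x: For descr, the vector or data frame for which to produce the descriptives; for the print methods and rosettaDescr_partial, the object to print (i.e. as produced by descr).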
items: Optionally, if x is a data frame, the variable names for which to produce the descriptives.
varLabels: Optionally, a named vector with 'pretty labels' to show for the variables. This has to be a vector of the same length as items, and if it is not a named vector with the names corresponding to the items, it has to be in the same order.
mean, meanCI, median, mode: Whether to compute the mean, its confidence interval, the median, and/or the mode (all logical, so TRUE or FALSE).
var, sd, se: Whether to compute the variance, standard deviation, and standard error (all logical, so TRUE or FALSE).
min, max, q1, q3, IQR: Whether to compute the minimum, maximum, first and third quartile, and inter-quartile range (all logical, so TRUE or FALSE).
skewness, kurtosis, dip: Whether to compute the skewness, kurtosis and dip test (all logical, so TRUE or FALSE).
totalN, missingN, validN: Whether to show the total sample size, the number of missing values, and the number of valid (i.e. non-missing) values (all logical, so TRUE or FALSE).
histogram, boxplot: Whether to show a histogram and/or boxplot (both logical, so TRUE or FALSE).
digits: The number of digits to round the results to when showing them.
errorOnFactor, convertFactor: If errorOnFactor is TRUE, factors throw an error. Otherwise, if convertFactor is TRUE, factors are converted to numeric values using as.numeric(as.character(x)), and the same output is generated as for numeric variables. If convertFactor is FALSE, a frequency table is produced instead.
maxModes: Maximum number of modes to display: displays "multi" if more than this number of modes is found.
maxPlotCols: The maximum number of columns when plotting multiple histograms and/or boxplots.
t: Whether to transpose the dataframes when printing them to the screen (this is easier for users relying on screen readers). Note: this functionality has not yet been implemented!
headingLevel: The number of hashes to print in front of the headings when printing while knitting.
conf.level: The confidence level of the confidence interval around the mean in the central tendency measures.
quantileType: The type of quantiles to be used to compute the interquartile range (IQR). See quantile for more information.
show: A vector of elements to show in the results, based on the arguments that activate/deactivate the descriptives (from mean to validN).
echoPartial: Whether to show the executed code in the R Markdown partial (TRUE) or not (FALSE).
partialFile: This can be used to specify a custom partial file. The file will have object x available.
quiet: Passed on to knitr::knit(); whether it should be chatty (FALSE) or quiet (TRUE).
...: Any additional arguments are passed to the default print method by the print method, and to rmdpartials::partial() when knitting an RMarkdown partial.
forceKnitrOutput: Force knitr output.
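
Two of the arguments above rely on base R behaviour that is easy to misread: the factor conversion named under convertFactor, and the quantile definition selected by quantileType. The following is a minimal sketch in base R only. It does not call descr, and the variable names and toy data are illustrative assumptions rather than anything taken from the package.

```r
# 1. Factor conversion: as.numeric(as.character(x)) recovers the values
#    stored in the labels, whereas as.numeric(x) alone would return the
#    internal level codes.
f <- factor(c("10", "20", "20", "40"))
viaCharacter <- as.numeric(as.character(f))  # values taken from the labels
levelCodes   <- as.numeric(f)                # positions in levels(f)
print(viaCharacter)
print(levelCodes)

# 2. Quantile types: type 2 (the default for quantileType in the usage
#    above) averages at discontinuities; type 7 is the default of
#    stats::quantile() and interpolates linearly.
x <- c(1, 2, 3, 10)
q2 <- quantile(x, probs = c(.25, .75), type = 2)
q7 <- quantile(x, probs = c(.25, .75), type = 7)
iqr2 <- unname(diff(q2))
iqr7 <- unname(diff(q7))
print(rbind(type2 = q2, type7 = q7))
print(c(IQR_type2 = iqr2, IQR_type7 = iqr7))
```

On small samples the two quantile types can give clearly different quartiles, which is worth keeping in mind when comparing results with those of other software.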

Author

Gjalt-Jorn Peters

Maintainer: Gjalt-Jorn Peters gjalt-jorn@userfriendlyscience.com

Details

Note that R (of course) has many similar functions, such as summary() and psych::describe() in the excellent psych package.

The Hartigans' Dip Test may be unfamiliar to users; it is a measure of uni- vs. multimodality, computed by the dip.test() function from the {diptest} package. Depending on the sample size, values over .025 can be seen as mildly indicative of multimodality, while values over .05 probably warrant closer inspection. The p-value can be obtained using that same dip.test() function; also see Table 1 of Hartigan & Hartigan (1985) for an indication of critical values.
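
To get a feel for the dip statistic and its p-value, here is a minimal sketch that calls dip.test() directly on two simulated samples. It assumes the {diptest} package is installed and is skipped otherwise. The simulated data and variable names are illustrative only and are not part of this package.

```r
# Simulate one clearly unimodal and one clearly bimodal sample.
set.seed(1)
unimodal <- rnorm(200)
bimodal  <- c(rnorm(100, mean = -3), rnorm(100, mean = 3))

# Assumption: {diptest} is available; if not, nothing is computed.
if (requireNamespace("diptest", quietly = TRUE)) {
  dipUni <- diptest::dip.test(unimodal)
  dipBi  <- diptest::dip.test(bimodal)

  # The dip statistic (D) is larger for the bimodal sample.
  print(c(unimodal = unname(dipUni$statistic),
          bimodal  = unname(dipBi$statistic)))

  # The p-values of the test for unimodality.
  print(c(unimodal = dipUni$p.value,
          bimodal  = dipBi$p.value))
}
```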

References

Hartigan, J. A.; Hartigan, P. M. The Dip Test of Unimodality. Ann. Statist. 13 (1985), no. 1, 70-84. doi:10.1214/aos/1176346577. https://projecteuclid.org/euclid.aos/1176346577.

Examples


### Simplest example with default settings
descr(mtcars$mpg);

### Also requesting a histogram and boxplot
descr(mtcars$mpg, histogram=TRUE, boxplot=TRUE);

### To show the output as Rmd Partial in the viewer
rosetta::rosettaDescr_partial(
  rosetta::descr(
    mtcars$mpg
  )
);

### Multiple variables, including one factor
rosetta::rosettaDescr_partial(
  rosetta::descr(
    iris
  )
);
