This function provides a number of descriptives about your data, similar to what SPSS's DESCRIPTIVES (often called with DESCR) does.
descr(
x,
items = names(x),
varLabels = NULL,
mean = TRUE,
meanCI = TRUE,
median = TRUE,
mode = TRUE,
var = TRUE,
sd = TRUE,
se = FALSE,
min = TRUE,
max = TRUE,
q1 = FALSE,
q3 = FALSE,
IQR = FALSE,
skewness = TRUE,
kurtosis = TRUE,
dip = TRUE,
totalN = TRUE,
missingN = TRUE,
validN = TRUE,
histogram = FALSE,
boxplot = FALSE,
digits = 2,
errorOnFactor = FALSE,
convertFactor = FALSE,
maxModes = 1,
maxPlotCols = 4,
t = FALSE,
headingLevel = 3,
conf.level = 0.95,
quantileType = 2
)
rosettaDescr_partial(
x,
digits = attr(x, "digits"),
show = attr(x, "show"),
headingLevel = attr(x, "headingLevel"),
maxPlotCols = attr(x, "maxPlotCols"),
echoPartial = FALSE,
partialFile = NULL,
quiet = TRUE,
...
)
# S3 method for rosettaDescr
knit_print(
x,
digits = attr(x, "digits"),
show = attr(x, "show"),
headingLevel = attr(x, "headingLevel"),
maxPlotCols = attr(x, "maxPlotCols"),
echoPartial = FALSE,
partialFile = NULL,
quiet = TRUE,
...
)
# S3 method for rosettaDescr
print(
x,
digits = attr(x, "digits"),
show = attr(x, "show"),
maxPlotCols = attr(x, "maxPlotCols"),
headingLevel = attr(x, "headingLevel"),
forceKnitrOutput = FALSE,
...
)
A list of dataframes with the requested values.
The object to print (i.e. as produced by descr).
Optionally, if x is a data frame, the variable names for which to produce the descriptives.
Optionally, a named vector with 'pretty labels' to show for the variables. This has to be a vector of the same length as items, and if it is not a named vector with the names corresponding to the items, it has to be in the same order.
Whether to compute the mean, its confidence interval, the median, and/or the mode (all logical, so TRUE or FALSE).
Whether to compute the variance, standard deviation, and standard error (all logical, so TRUE or FALSE).
Whether to compute the minimum, maximum, first and third quartile, and inter-quartile range (all logical, so TRUE or FALSE).
Whether to compute the skewness, kurtosis and dip test (all logical, so TRUE or FALSE).
Whether to show the total sample size, the number of missing values, and the number of valid (i.e. non-missing) values (all logical, so TRUE or FALSE).
Whether to show a histogram and/or boxplot.
The number of digits to round the results to when showing them.
If errorOnFactor is TRUE, factors throw an error. If not, and convertFactor is TRUE, they will be converted to numeric values using as.numeric(as.character(x)), and then the same output will be generated as for numeric variables. If convertFactor is FALSE, a frequency table will be produced.
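The conversion goes through as.character() because calling as.numeric() directly on a factor returns the internal level codes rather than the labelled values. A minimal base R sketch of the difference; the factor f is a made-up illustration, not data from the package:

```r
# Hypothetical factor whose labels happen to be numbers
f <- factor(c("10", "20", "20", "5"))

# Direct conversion returns the internal level codes
# (the levels sort as text: "10", "20", "5")
codes <- as.numeric(f)

# Going through character first recovers the labelled values
values <- as.numeric(as.character(f))

print(codes)   # 1 2 2 3
print(values)  # 10 20 20 5
```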
Maximum number of modes to display: displays "multi" if more than this number of modes is found.
The maximum number of columns when plotting multiple histograms and/or boxplots.
Whether to transpose the dataframes when printing them to the screen (this is easier for users relying on screen readers). Note: this functionality has not yet been implemented!
The number of hashes to print in front of the headings when printing while knitting.
Confidence level of the confidence interval around the mean in the central tendency measures.
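To make the role of conf.level concrete, the sketch below computes a confidence interval for a mean in base R. It assumes a two-sided interval based on the t distribution; this page does not state which method descr uses internally, so treat it as an illustration rather than a description of the package's computation.

```r
x <- mtcars$mpg
conf.level <- 0.95

# Number of valid observations, mean, and standard error of the mean
n <- sum(!is.na(x))
m <- mean(x, na.rm = TRUE)
se <- sd(x, na.rm = TRUE) / sqrt(n)

# Two-sided critical value from the t distribution (assumed method)
crit <- qt(1 - (1 - conf.level) / 2, df = n - 1)

ci <- c(lower = m - crit * se, upper = m + crit * se)
print(round(ci, 2))
```

Raising conf.level (for example to 0.99) increases the critical value and therefore widens the interval.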
The type of quantiles to be used to compute the interquartile range (IQR). See quantile for more information.
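The quantile type can change the reported quartiles, especially in small samples. The following base R sketch compares type 2 (the default in the usage above) with type 7 (the default of quantile() itself); it only uses stats::quantile() and stats::IQR(), not descr:

```r
x <- mtcars$mpg

# Type 2 averages the two neighbouring observations at discontinuities
q_type2 <- quantile(x, probs = c(0.25, 0.75), type = 2)

# Type 7 interpolates linearly and is the default of base R's quantile()
q_type7 <- quantile(x, probs = c(0.25, 0.75), type = 7)

# IQR() accepts the same type argument
iqr_type2 <- IQR(x, type = 2)
iqr_type7 <- IQR(x, type = 7)

print(rbind(type2 = q_type2, type7 = q_type7))
print(c(type2 = iqr_type2, type7 = iqr_type7))
```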
A vector of elements to show in the results, based on the arguments that activate/deactivate the descriptives (from mean to validN).
Whether to show the executed code in the R Markdown partial (TRUE) or not (FALSE).
This can be used to specify a custom partial file. The file will have object x available.
Passed on to knitr::knit(): whether it should be chatty (FALSE) or quiet (TRUE).
Any additional arguments are passed to the default print method by the print method, and to rmdpartials::partial() when knitting an R Markdown partial.
Force knitr output.
Gjalt-Jorn Peters
Maintainer: Gjalt-Jorn Peters gjalt-jorn@userfriendlyscience.com
Note that R (of course) has many similar functions, such as summary() and psych::describe() in the excellent psych package.
Hartigans' Dip Test may be unfamiliar to users; it is a measure of uni- vs. multimodality, computed by the dip.test() function from the {diptest} package. Depending on the sample size, values over .025 can be seen as mildly indicative of multimodality, while values over .05 probably warrant closer inspection (the p-value can be obtained using that dip.test() function from {diptest}; also see Table 1 of Hartigan & Hartigan (1985) for an indication as to critical values).
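As a rough illustration of obtaining the p-value mentioned above, the sketch below calls {diptest} directly. It assumes that package is installed and is skipped otherwise; diptest::dip() is used here only to show the bare statistic and is not mentioned on this page.

```r
# Only runs when the {diptest} package is available
if (requireNamespace("diptest", quietly = TRUE)) {
  # The dip statistic on its own
  dipValue <- diptest::dip(mtcars$mpg)

  # The full test, which also returns a p-value
  dipResult <- diptest::dip.test(mtcars$mpg)

  print(dipValue)
  print(dipResult$p.value)
}
```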
Hartigan, J. A.; Hartigan, P. M. The Dip Test of Unimodality. Ann. Statist. 13 (1985), no. 1, 70--84. doi:10.1214/aos/1176346577. https://projecteuclid.org/euclid.aos/1176346577.
summary, psych::describe()
### Simplest example with default settings
descr(mtcars$mpg);
### Also requesting a histogram and boxplot
descr(mtcars$mpg, histogram=TRUE, boxplot=TRUE);
### To show the output as Rmd Partial in the viewer
rosetta::rosettaDescr_partial(
rosetta::descr(
mtcars$mpg
)
);
### Multiple variables, including one factor
rosetta::rosettaDescr_partial(
rosetta::descr(
iris
)
);
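The items and varLabels arguments described earlier can be combined to restrict the output to a few variables and give them readable names. This is a sketch that assumes the rosetta package is installed; the label texts are made up for illustration.

```r
# Only runs when the rosetta package is available
if (requireNamespace("rosetta", quietly = TRUE)) {
  res <- rosetta::descr(
    mtcars,
    # Restrict the descriptives to two variables of the data frame
    items = c("mpg", "hp"),
    # Named vector: names match the items, values are the labels to show
    varLabels = c(mpg = "Miles per gallon", hp = "Horsepower")
  )
  print(res)
}
```

Because the labels are named, their order does not have to match the order of items.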