skewness: Univariate and Multivariate Skewness and Kurtosis

Description

The function skewness computes the univariate sample or population skewness and conducts Mardia's test for multivariate skewness, while the function kurtosis computes the univariate sample or population (excess) kurtosis or the multivariate (excess) kurtosis and conducts Mardia's test for multivariate kurtosis. By default, skewness computes the univariate sample skewness or the multivariate skewness, and kurtosis computes the univariate sample excess kurtosis or the multivariate excess kurtosis.

Usage

skewness(data, ..., sample = TRUE, digits = 2, p.digits,
         as.na = NULL, check = TRUE, output = TRUE)
kurtosis(data, ..., sample = TRUE, center = TRUE, digits = 2, p.digits,
         as.na = NULL, check = TRUE, output = TRUE)

Value

Returns the univariate skewness or kurtosis of data, or an object of class misty.object, which is a list with the following entries:

call: function call
type: type of analysis
data: a numeric vector or data frame specified in data
args: specification of function arguments
result: result table

Arguments

data: a numeric vector or data frame.
...: an expression indicating the variable names in data, e.g., skewness(dat, x1). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.
sample: logical: if TRUE (default), the univariate sample skewness or kurtosis is computed, while the population skewness or kurtosis is computed when sample = FALSE.
center: logical: if TRUE (default), the univariate or multivariate kurtosis is centered so that the expected kurtosis under univariate or multivariate normality is 0. When center = FALSE, the expected kurtosis is 3 under univariate normality and $p(p + 2)$ under multivariate normality of $p$ variables.
digits: an integer value indicating the number of decimal places to be used. Note that this argument is only applied when computing multivariate skewness and kurtosis.
p.digits: an integer value indicating the number of decimal places to be used for displaying the p-values.
as.na: a numeric vector indicating user-defined missing values, i.e., these values are converted to NA before conducting the analysis.
check: logical: if TRUE (default), argument specification is checked.
output: logical: if TRUE (default), output is shown on the console. Note that this argument is only applied when computing multivariate skewness and kurtosis.

Author

Takuya Yanagida takuya.yanagida@univie.ac.at

Details

Univariate Skewness and Kurtosis

Univariate skewness and kurtosis are computed using the same formulas as SAS and SPSS:

Population Skewness $$\sqrt{n}\frac{\sum_{i=1}^{n}(X_i - \bar{X})^3}{(\sum_{i=1}^{n}(X_i - \bar{X})^2)^{3/2}}$$
Sample Skewness $$\frac{n\sqrt{n - 1}}{n-2} \frac{\sum_{i=1}^{n}(X_i - \bar{X})^3}{(\sum_{i=1}^{n}(X_i - \bar{X})^2)^{3/2}}$$
Population Excess Kurtosis $$n\frac{\sum_{i=1}^{n}(X_i - \bar{X})^4}{(\sum_{i=1}^{n}(X_i - \bar{X})^2)^2} - 3$$
Sample Excess Kurtosis $$\frac{n - 1}{(n - 2)(n - 3)}\left[(n + 1)\left(n\frac{\sum_{i=1}^{n}(X_i - \bar{X})^4}{(\sum_{i=1}^{n}(X_i - \bar{X})^2)^2} - 3\right) + 6\right]$$

Note that missing values (NA) are stripped before the computation and that at least 3 observations are needed to compute skewness and at least 4 observations are needed to compute kurtosis.
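
The univariate formulas can be checked by hand with a few lines of base R. The following is a minimal sketch, not the package's implementation: the helper names skew_sketch and kurt_sketch are hypothetical. The sample excess kurtosis is written in the usual SAS/SPSS form, $\frac{n - 1}{(n - 2)(n - 3)}[(n + 1)g_2 + 6]$, where $g_2$ denotes the population excess kurtosis.

```r
# Hypothetical helpers for illustration only (base R, not part of the package)
skew_sketch <- function(x, sample = TRUE) {
  x <- x[!is.na(x)]                      # missing values are stripped
  n <- length(x)
  if (n < 3) return(NA_real_)            # at least 3 observations are needed
  d <- x - mean(x)
  ratio <- sum(d^3) / sum(d^2)^(3 / 2)
  if (sample) n * sqrt(n - 1) / (n - 2) * ratio else sqrt(n) * ratio
}

kurt_sketch <- function(x, sample = TRUE) {
  x <- x[!is.na(x)]                      # missing values are stripped
  n <- length(x)
  if (n < 4) return(NA_real_)            # at least 4 observations are needed
  d <- x - mean(x)
  g2 <- n * sum(d^4) / sum(d^2)^2 - 3    # population excess kurtosis
  if (sample) (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6) else g2
}

# Sample and population versions for one variable
skew_sketch(mtcars$mpg)
skew_sketch(mtcars$mpg, sample = FALSE)
kurt_sketch(mtcars$mpg)
kurt_sketch(mtcars$mpg, sample = FALSE)
```

Both sample versions approach the population versions as n grows, so the choice of the sample argument matters mainly in small samples.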

Multivariate Skewness and Kurtosis

Mardia's multivariate skewness and kurtosis compare the joint distribution of several variables against a multivariate normal distribution. The expected skewness is 0 for a multivariate normal distribution, while the expected kurtosis is $p(p + 2)$ for a multivariate normal distribution of $p$ variables. With the default setting center = TRUE, however, this function centers the multivariate kurtosis at $p(p + 2)$ so that the expected kurtosis under multivariate normality is 0. Multivariate skewness is tested for statistical significance based on the chi-square distribution, and multivariate kurtosis based on the standard normal distribution. If at least one of the tests is statistically significant, the underlying joint population is inferred to be non-normal. Note that non-significance of these statistical tests does not imply multivariate normality.
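
For orientation, the textbook form of Mardia's (1970) statistics can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's own code: the helper name mardia_sketch is hypothetical, incomplete cases are dropped listwise, the covariance matrix uses the divisor n, and no small-sample correction is applied, so results may differ from the function's output in some details.

```r
# Hypothetical helper for illustration only (base R, not part of the package)
mardia_sketch <- function(data, center = TRUE) {
  X <- as.matrix(stats::na.omit(data))   # listwise deletion (an assumption)
  n <- nrow(X)
  p <- ncol(X)
  Xc <- sweep(X, 2, colMeans(X))         # column-centered data
  S <- crossprod(Xc) / n                 # covariance matrix with divisor n
  D <- Xc %*% solve(S) %*% t(Xc)         # entries (x_i - m)' S^-1 (x_j - m)

  b1 <- sum(D^3) / n^2                   # multivariate skewness
  b2 <- sum(diag(D)^2) / n               # multivariate kurtosis (uncentered)

  skew_chi2 <- n * b1 / 6                # chi-square statistic for skewness
  skew_df <- p * (p + 1) * (p + 2) / 6
  kurt_z <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)  # z statistic

  list(skewness = b1,
       skew.chi2 = skew_chi2,
       skew.df = skew_df,
       skew.p = stats::pchisq(skew_chi2, df = skew_df, lower.tail = FALSE),
       kurtosis = if (center) b2 - p * (p + 2) else b2,
       kurt.z = kurt_z,
       kurt.p = 2 * stats::pnorm(abs(kurt_z), lower.tail = FALSE))
}

# Three variables from mtcars, kurtosis centered at p(p + 2) = 15
res <- mardia_sketch(mtcars[, c("mpg", "disp", "hp")])
str(res)
```

With a single variable (p = 1) the sketch reduces to the squared population skewness and the population kurtosis, which gives a quick sanity check against the univariate formulas.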

References

Cain, M. K., Zhang, Z., & Yuan, K.-H. (2017). Univariate and multivariate skewness and kurtosis for measuring nonnormality: Prevalence, influence and estimation. Behavior Research Methods, 49, 1716–1735. https://doi.org/10.3758/s13428-016-0814-1

Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3), 519-530. https://doi.org/10.2307/2334770

Rasch, D., Kubinger, K. D., & Yanagida, T. (2011). Statistics in psychology - Using R and SPSS. John Wiley & Sons.

Revelle, W. (2024). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University, Evanston, Illinois. R package version 2.4.6, https://CRAN.R-project.org/package=psych.

Examples


# Example 1a: Compute univariate sample skewness
skewness(mtcars, mpg)

# Example 1b: Compute univariate sample excess kurtosis
kurtosis(mtcars, mpg)

# Example 2a: Compute multivariate skewness
skewness(mtcars)

# Example 2b: Compute multivariate excess kurtosis
kurtosis(mtcars)
