There are many ways of reporting how two groups differ. Cohen's d statistic is just the difference of the means expressed in terms of the pooled within-group standard deviation, and is insensitive to sample size. r is a universal measure of effect size that is a simple function of d, but is bounded -1 to 1. The t statistic is merely d * sqrt(n)/2 (for equal group sizes) and thus reflects sample size.
$$d = \frac{M_2 - M_1}{S_p}$$
where \(S_p\) is the pooled standard deviation.
$$S_p = \sqrt{\frac{(n_1-1) s_1^2 + (n_2-1) s_2^2}{N}}$$
Cohen's d uses N as the divisor for the pooled sums of squares. Hedges' g uses N-2.
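The difference between the two divisors can be sketched directly from the summary statistics. In this minimal Python sketch (the function name is illustrative, not part of the psych package), the only difference between d and g is whether N or N-2 divides the pooled sums of squares:

```python
def pooled_d(m1, s1, n1, m2, s2, n2, hedges=False):
    """Standardized mean difference from summary statistics.

    hedges=False divides the pooled sums of squares by N (Cohen's d);
    hedges=True divides by N - 2 (Hedges' g).
    """
    ss = (n1 - 1) * s1**2 + (n2 - 1) * s2**2    # pooled sums of squares
    divisor = (n1 + n2 - 2) if hedges else (n1 + n2)
    sp = (ss / divisor) ** 0.5                  # pooled standard deviation
    return (m2 - m1) / sp
```

Because N is larger than N-2, the Cohen's d version has a slightly smaller pooled standard deviation and thus a slightly larger effect size than g for the same data.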
Confidence intervals for Cohen's d are found by converting d to a t, finding the confidence interval for that t, and then converting those limits back to ds. This takes advantage of the uniroot function and the non-centrality parameter of the t distribution.
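The uniroot step can be sketched in Python with scipy's root finder and non-central t distribution. This is a hedged illustration of the same idea for the equal-or-unequal two-sample case, not the psych implementation:

```python
from scipy.stats import nct
from scipy.optimize import brentq

def d_ci(d, n1, n2, alpha=0.05):
    """Approximate CI for a two-sample d via the non-central t distribution.

    Convert d to t, solve for the non-centrality parameters that place the
    observed t at the alpha/2 and 1 - alpha/2 quantiles, then convert those
    non-centrality parameters back to the d metric.
    """
    scale = (1 / n1 + 1 / n2) ** 0.5
    t = d / scale                    # d to t for two samples
    df = n1 + n2 - 2
    # nct.cdf(t, df, nc) is decreasing in nc, so each equation has one root
    lo = brentq(lambda nc: nct.cdf(t, df, nc) - (1 - alpha / 2), -50, 50)
    hi = brentq(lambda nc: nct.cdf(t, df, nc) - alpha / 2, -50, 50)
    return lo * scale, hi * scale
```

For example, with d = 0.79 and two groups of 20, the observed t is about 2.5 and the resulting 95% interval excludes zero.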
The results of cohen.d
may be displayed using the error.dots
function. This will include the labels provided in the dictionary.
In the case of finding the confidence interval (using cohen.d.ci
for a comparison against 0 (the one sample case), specify n1. This will yield a d = t/sqrt(n1) whereas in the case of the difference between two samples, d = 2*t/sqrt(n) (for equal sample sizes n = n1+ n2) or d = t/sqrt(1/n1 + 1/n2) for the case of unequal sample sizes.
Since we find d and then convert this to t, using d2t, the question is how to pool the variances. Until 7/14/21 I was using the total N to estimate t and thus the p values. In response to a query (see news), I switched to using the actual sample sizes (n1 and n2) and then finding t based upon the Hedges' g value. This produces t values that match those reported by t.test with the var.equal = TRUE option.
It is probably useful to comment that the various confidence intervals reported are based upon normal theory and should be interpreted cautiously.
cohen.d.by
will find Cohen's d for groups for each subset of the data defined by group2. The summary of the output produces a simplified listing of the d values for each variable for each group. It may be called directly from cohen.d by using formula input and specifying two grouping variables.
d.robust
follows Algina et al. (2005) to find trimmed means (trim = .2) and Winsorized variances (trim = .2). This is intended to provide a more robust estimate of the effect size.
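A rough Python sketch of the same idea using scipy's trimming utilities. Note that this omits the rescaling constant Algina et al. apply to make the robust estimate comparable to d under normality, and the function name is illustrative:

```python
import numpy as np
from scipy.stats import trim_mean
from scipy.stats.mstats import winsorize

def robust_d(x, y, trim=0.2):
    """Trimmed-mean difference scaled by the pooled Winsorized sd
    (unscaled sketch of an Algina-style robust effect size)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = trim_mean(x, trim), trim_mean(y, trim)     # 20% trimmed means
    wx = winsorize(x, limits=(trim, trim))              # Winsorize both tails
    wy = winsorize(y, limits=(trim, trim))
    ssw = (len(x) - 1) * wx.var(ddof=1) + (len(y) - 1) * wy.var(ddof=1)
    sw = (ssw / (len(x) + len(y) - 2)) ** 0.5           # pooled Winsorized sd
    return (my - mx) / sw
```

Trimming makes the numerator, and Winsorizing the denominator, insensitive to a few extreme scores in either tail.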
m2t
reports Student's t.test for two groups given their means, standard deviations, and sample sizes. This is convenient when checking statistics where those estimates are provided, but the raw data are not available. By default, it gives the pooled estimate of variance, but if pooled is FALSE, it applies Welch's correction.
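The same computation from summary statistics can be sketched in a few lines of Python (an illustrative analogue, not the psych function):

```python
def summary_t(m1, s1, n1, m2, s2, n2, pooled=True):
    """Two-sample t from means, sds, and ns.

    pooled=True gives the classic pooled-variance Student's t;
    pooled=False uses Welch's unpooled standard error instead.
    """
    if pooled:
        sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
        se = (sp2 * (1 / n1 + 1 / n2)) ** 0.5
    else:
        se = (s1**2 / n1 + s2**2 / n2) ** 0.5
    return (m1 - m2) / se
```

When the two standard deviations and sample sizes are equal, the pooled and Welch statistics coincide; they diverge as the variances become unequal.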
The Mahalanobis distance combines the individual ds and weights them by their unique contribution: \(D = \sqrt{d' R^{-1}d}\).
By default, cohen.d
will find the Mahalanobis distance between the two groups (if there is more than one DV). This requires finding the correlation matrix of all of the DVs, and it can fail if that matrix is not invertible because some pairwise correlations do not exist. Setting MD=FALSE will prevent the Mahalanobis calculation.
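The formula itself is a one-liner with numpy, where d is the vector of univariate effect sizes and R the DV correlation matrix (a sketch of the formula, not psych's internals):

```python
import numpy as np

def mahalanobis_D(d, R):
    """D = sqrt(d' R^-1 d). Using solve() avoids forming an explicit
    inverse and raises LinAlgError when R is singular (the situation
    the MD=FALSE option is there to sidestep)."""
    d = np.asarray(d, float)
    return float(np.sqrt(d @ np.linalg.solve(R, d)))
```

With uncorrelated DVs (R equal to the identity), D reduces to the square root of the sum of the squared univariate ds.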
Marco del Giudice (2019) has a very helpful paper discussing how to interpret d and Md in terms of various overlap coefficients. These may be found by the use of the d2OVL
(percent overlap for 1 distribution), d2OVL2
(percent overlap of joint distributions), d2CL
(the common language effect size), and d2U3
(proportion in higher group exceeding median of the lower group).
$$OVL = 2\Phi(-d/2)$$ is the proportion of overlap (and gets smaller the larger the d),
where \(\Phi\) is the cumulative distribution function of the normal distribution.
$$OVL_2 = \frac{OVL}{2-OVL}$$
The proportion of individuals in one group above the median of the other group is U3
$$U_3 = \Phi(d)$$.
The Common Language Effect size
$$CL = \Phi(d/\sqrt{2})$$
These last two get larger as the absolute value of d increases.
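All four overlap statistics need only the normal CDF, so they can be sketched with Python's standard library (the function names are illustrative analogues of d2OVL, d2OVL2, d2U3, and d2CL):

```python
from statistics import NormalDist

phi = NormalDist().cdf  # standard normal CDF (Phi)

def d2OVL(d):   # overlap of one distribution with the other
    return 2 * phi(-abs(d) / 2)

def d2OVL2(d):  # overlap relative to the joint distribution
    ovl = d2OVL(d)
    return ovl / (2 - ovl)

def d2U3(d):    # proportion of higher group above the lower group's median
    return phi(d)

def d2CL(d):    # common language effect size
    return phi(d / 2 ** 0.5)
```

At d = 0 the two groups are indistinguishable: OVL and OVL2 are 1, while U3 and CL are .5; as d grows, the overlap measures shrink and U3 and CL approach 1.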
For graphic displays of Cohen's d and Mahalanobis D, see the scatterHist
examples, or the example from the psychTools::GERAS data set.