meanDiff — meanDiff • rosetta

The meanDiff function compares the means between two groups. It computes Cohen's d, the unbiased estimate of Cohen's d (Hedges' g), and performs a t-test. It also shows the achieved power, and, more usefully, the power to detect small, medium, and large effects.

meanDiff(
  x,
  y = NULL,
  paired = FALSE,
  r.prepost = NULL,
  var.equal = "test",
  conf.level = 0.95,
  plot = FALSE,
  digits = 2,
  envir = parent.frame()
)

# S3 method for meanDiff
print(x, digits = x$digits, powerDigits = x$digits + 2, ...)

# S3 method for meanDiff
pander(x, digits = x$digits, powerDigits = x$digits + 2, ...)

Arguments

x: Dichotomous factor: variable 1; can also be a formula of the form y ~ x, where x must be a factor with two levels (i.e. dichotomous).
y: Numeric vector: variable 2; can be empty if x is a formula.
paired: Logical; whether x & y are dependent (TRUE, paired samples) or independent (FALSE, the default). If x & y are dependent, they must have the same length.
r.prepost: Correlation between the pre- and post-test in the case of a paired samples t-test. This is required to compute Cohen's d using the formula on page 29 of Borenstein et al. (2009). If NULL, the correlation is computed from the provided scores. That estimate will be lower if there is an effect, which leads to an underestimate of the within-groups variance, and therefore of the standard error of Cohen's d, and therefore to confidence intervals that are too narrow (too liberal). In addition, when the data are used to compute the within-groups correlation, random variation also affects that correlation, so the confidence intervals may in practice disagree with the null hypothesis significance testing p-value in either direction (i.e. the p-value may indicate a significant association while the confidence interval contains 0, or the other way around). Therefore, if the test-retest correlation of the relevant measure is known, please provide it here to enable computation of accurate confidence intervals.
var.equal: String; only relevant if x & y are independent; can be "test" (default; test whether x & y have different variances), "no" (assume x & y have different variances; see the Warning below!), or "yes" (assume x & y have the same variance)
conf.level: Confidence level of the confidence intervals (e.g. 0.95 for 95% intervals).
plot: Whether to print a dlvPlot.
digits: With what precision you want the results to print.
envir: The environment in which to search for the variables (useful when calling meanDiff from a function where the vectors are defined in that function's environment).
powerDigits: With what precision you want the power to print.
...: Additional arguments are passed on to the ggplot2::ggplot() print method.
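
To make the role of r.prepost concrete, here is a minimal base R sketch of the paired-samples Cohen's d as given in Borenstein et al. (2009). The helper paired_d() is hypothetical and not part of rosetta, the value 0.9 is made up for illustration, and it is an assumption that meanDiff performs the computation in exactly this way.

```r
# Hypothetical helper (not part of rosetta): paired-samples Cohen's d
# following Borenstein et al. (2009).
paired_d <- function(x1, x2, r = NULL) {
  stopifnot(length(x1) == length(x2))
  n <- length(x1)
  diffs <- x2 - x1
  # Fall back to the sample correlation when no value is supplied
  if (is.null(r)) r <- cor(x1, x2)
  # Within-groups SD recovered from the SD of the difference scores
  s_within <- sd(diffs) / sqrt(2 * (1 - r))
  d <- mean(diffs) / s_within
  d_var <- (1 / n + d^2 / (2 * n)) * 2 * (1 - r)
  list(r = r, d = d, d.var = d_var, d.se = sqrt(d_var))
}

# Built-in paired data: extra sleep for the same 10 subjects under two drugs
x1 <- sleep$extra[sleep$group == 1]
x2 <- sleep$extra[sleep$group == 2]

from_data <- paired_d(x1, x2)           # r estimated from the scores
supplied  <- paired_d(x1, x2, r = 0.9)  # 0.9 is an illustrative, made-up value

print(unlist(from_data))
print(unlist(supplied))
```

The correlation enters both the point estimate and its variance through the factor 2 * (1 - r), which is why the choice between an estimated and an externally known correlation changes the resulting confidence interval.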

Value

An object is returned with the following elements:

variables: Input variables
groups: Levels of the x variable, the dichotomous factor
ci.confidence: Confidence of confidence intervals
digits: Number of digits for output
x: Values of dependent variable in first group
y: Values of dependent variable in second group
type: Type of t-test (independent or dependent, equal variances or not)
n: Sample sizes of the two groups
mean: Means of the two groups
sd: Standard deviations of the two groups
objects: Objects used; the t-test and optionally the test for equal variances
variance: Variance of the difference score
meanDiff: Difference between the means
meanDiff.d: Cohen's d
meanDiff.d.var: Variance of Cohen's d
meanDiff.d.se: Standard error of Cohen's d
meanDiff.J: Correction factor applied to Cohen's d to obtain the unbiased Hedges' g
power: Achieved power with current effect size and sample size
power.small: Power to detect small effects with current sample size
power.medium: Power to detect medium effects with current sample size
power.large: Power to detect large effects with current sample size
meanDiff.g: Hedges' g
meanDiff.g.var: Variance of Hedges' g
meanDiff.g.se: Standard error of Hedges' g
ci.usedZ: Z value used to compute confidence intervals
meanDiff.d.ci.lower: Lower bound of confidence interval around Cohen's d
meanDiff.d.ci.upper: Upper bound of confidence interval around Cohen's d
meanDiff.g.ci.lower: Lower bound of confidence interval around Hedges' g
meanDiff.g.ci.upper: Upper bound of confidence interval around Hedges' g
meanDiff.ci.lower: Lower bound of confidence interval around the raw mean difference
meanDiff.ci.upper: Upper bound of confidence interval around the raw mean difference
t: Student t value for Null Hypothesis Significance Testing
df: Degrees of freedom for t value
p: p-value corresponding to t value
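
The power elements can be approximated with base R alone. The sketch below assumes that "small", "medium", and "large" refer to Cohen's conventional d values of 0.2, 0.5, and 0.8, and it uses stats::power.t.test(); which routine meanDiff calls internally is not stated here, so treat this as an illustration rather than the package's implementation.

```r
# Power of a two-sided, two-sample t-test for conventional effect sizes.
# Assumption: small/medium/large correspond to d = 0.2 / 0.5 / 0.8.
n_per_group  <- 10
effect_sizes <- c(small = 0.2, medium = 0.5, large = 0.8)

power_for <- function(d) {
  # With sd = 1, delta is expressed in standard deviation units (Cohen's d)
  power.t.test(n = n_per_group, delta = d, sd = 1, sig.level = 0.05,
               type = "two.sample", alternative = "two.sided",
               strict = TRUE)$power
}

powers <- sapply(effect_sizes, power_for)
print(round(powers, 4))
```

With 10 observations per group, as in the PlantGrowth examples, even a large effect is detected less than half the time, which is why the power for fixed effect sizes is more informative than the achieved power.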

Details

This function uses the formulae from Borenstein, Hedges, Higgins & Rothstein (2009) (pages 25-32).
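
As an illustration of those formulae for independent groups, the following base R sketch computes Cohen's d, its variance, the correction factor J, and Hedges' g for the first two PlantGrowth groups. The variable names are chosen for this sketch only, and it is an assumption that meanDiff follows exactly these steps internally.

```r
# Independent-groups effect sizes following Borenstein et al. (2009)
dat <- PlantGrowth[1:20, ]
dat$group <- factor(dat$group)
y1 <- dat$weight[dat$group == "ctrl"]
y2 <- dat$weight[dat$group == "trt1"]
n1 <- length(y1)
n2 <- length(y2)

# Pooled standard deviation and Cohen's d
s_pooled <- sqrt(((n1 - 1) * var(y1) + (n2 - 1) * var(y2)) / (n1 + n2 - 2))
d     <- (mean(y1) - mean(y2)) / s_pooled
d_var <- (n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2))

# Small-sample correction J and Hedges' g
J     <- 1 - 3 / (4 * (n1 + n2 - 2) - 1)
g     <- J * d
g_var <- J^2 * d_var

# Normal-approximation confidence intervals at the 95% level
z    <- qnorm(1 - (1 - 0.95) / 2)
d_ci <- d + c(-1, 1) * z * sqrt(d_var)
g_ci <- g + c(-1, 1) * z * sqrt(g_var)

print(round(c(d = d, g = g), 2))
print(round(rbind(d = d_ci, g = g_ci), 2))
```

Rounded to two digits, these values agree with the first example in the Examples section (d = 0.53 with interval [-0.36, 1.42], g = 0.51 with interval [-0.34, 1.36]).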

Warning

Note that when different variances are assumed for the t-test (i.e. the null-hypothesis test), the values of Cohen's d are still based on the assumption that the variance is equal. In this case, the confidence interval might, for example, not contain zero even though the NHST has a non-significant p-value (the reverse can probably happen, too).
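
The following base R sketch shows the kind of decision that var.equal = "test" implies: test the variances first, then pick the pooled or the Welch t-test. It uses stats::var.test() (an F test) and a 0.05 threshold; both are assumptions made for this illustration, since the specific test and cut-off used by meanDiff are not documented here.

```r
# Choose between the pooled and the Welch t-test based on a variance test.
# Assumptions: F test via var.test() and a 0.05 threshold.
dat <- PlantGrowth[1:20, ]
dat$group <- factor(dat$group)

variance_test   <- var.test(weight ~ group, data = dat)
equal_variances <- variance_test$p.value > 0.05

pooled <- t.test(weight ~ group, data = dat, var.equal = TRUE)
welch  <- t.test(weight ~ group, data = dat, var.equal = FALSE)
chosen <- if (equal_variances) pooled else welch

# Compare the degrees of freedom and p-values of both variants
print(c(pooled.df = unname(pooled$parameter),
        welch.df  = unname(welch$parameter)))
print(c(pooled.p = pooled$p.value, welch.p = welch$p.value))
```

Because Cohen's d is based on the pooled standard deviation either way, the Welch p-value and the confidence interval around d rest on different assumptions, which is how the two can disagree.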

References

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. John Wiley & Sons.

Examples


### Create simple dataset
dat <- PlantGrowth[1:20, ]
### Remove third level from group factor
dat$group <- factor(dat$group)
### Compute mean difference and show it
meanDiff(dat$weight ~ dat$group)
#> Input variables:
#> 
#>   group (grouping variable)
#>   weight (dependent variable)
#>   Mean 1 (ctrl) = 5.03, sd = 0.58, n = 10
#>   Mean 2 (trt1)= 4.66, sd = 0.79, n = 10
#> 
#> Independent samples t-test (tested for equal variances, p = .372, so equal variances)
#>   (pooled standard deviation used, 0.7)
#> 
#> 95% confidence intervals:
#>   Absolute mean difference: [-0.28, 1.03] (Absolute mean difference: 0.37)
#>   Cohen's d for difference: [-0.36, 1.42] (Cohen's d point estimate: 0.53)
#>   Hedges g for difference:  [-0.34, 1.36] (Hedges g point estimate:  0.51)
#> 
#> Achieved power for d=0.53: 0.2038 (for small: 0.0708; medium: 0.1851; large: 0.3951)
#> 
#> (secondary information (NHST): t[18] = 1.19, p = .249)

### Look at second treatment
dat <- rbind(PlantGrowth[1:10, ], PlantGrowth[21:30, ])
### Remove third level from group factor
dat$group <- factor(dat$group)
### Compute mean difference and show it
meanDiff(x = dat$group, y = dat$weight)
#> Input variables:
#> 
#>   group (grouping variable)
#>   weight (dependent variable)
#>   Mean 1 (ctrl) = 5.03, sd = 0.58, n = 10
#>   Mean 2 (trt2)= 5.53, sd = 0.44, n = 10
#> 
#> Independent samples t-test (tested for equal variances, p = .424, so equal variances)
#>   (pooled standard deviation used, 0.52)
#> 
#> 95% confidence intervals:
#>   Absolute mean difference: [-0.98, -0.01] (Absolute mean difference: -0.49)
#>   Cohen's d for difference: [-1.88, -0.03] (Cohen's d point estimate: -0.95)
#>   Hedges g for difference:  [-1.8, -0.03] (Hedges g point estimate:  -0.91)
#> 
#> Achieved power for d=-0.95: 0.5238 (for small: 0.0708; medium: 0.1851; large: 0.3951)
#> 
#> (secondary information (NHST): t[18] = -2.13, p = .047)