This function performs a Student's t-test for two independent samples, for paired samples, or for one sample. It is a parametric test for the null hypothesis that the means of two independent samples are equal, or that the mean of one sample is equal to a specified value. The hypothesis can be one- or two-sided.

Unlike the underlying base R function t.test(), this function allows for weighted tests and automatically calculates effect sizes. Cohen's d is returned for larger samples (n > 20), while Hedges' g is returned for smaller samples.

t_test(
  data,
  select = NULL,
  by = NULL,
  weights = NULL,
  paired = FALSE,
  mu = 0,
  alternative = "two.sided"
)

Arguments

data

A data frame.

select

Name(s) of the continuous variable(s) (as character vector) to be used as samples for the test. select can be one of the following:

  • select can be used in combination with by, in which case select is the name of the continuous variable (and by indicates a grouping factor).

  • select can also be a character vector of length two or more (more than two names only apply to kruskal_wallis_test()), in which case the two continuous variables are treated as samples to be compared. by must be NULL in this case (see the sketch following the argument descriptions).

  • If select is of length two and paired = TRUE, the two samples are considered as dependent and a paired test is carried out.

  • If select specifies one variable and by = NULL, a one-sample test is carried out (only applicable for t_test() and wilcoxon_test()).

  • For chi_squared_test(), if select specifies one variable and both by and probabilities are NULL, a one-sample test against given probabilities is automatically conducted, with equal probabilities for each level of select.

by

Name of the variable indicating the groups. Required if select specifies only one variable that contains all samples to be compared in the test. If by is not a factor, it will be coerced to a factor. For chi_squared_test(), if probabilities is provided, by must be NULL.

weights

Name of an (optional) weighting variable to be used for the test.

paired

Logical, whether to compute a paired t-test for dependent samples.

mu

The hypothesized difference in means (for t_test()) or location shift (for wilcoxon_test() and mann_whitney_test()). The default is 0.

alternative

A character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". See ?t.test and ?wilcox.test.
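
A minimal sketch of how these arguments combine, based only on the descriptions above (the calls are not run here, so no output is shown; comparing mpg and hp is purely illustrative):

# two independent (unpaired) samples given as two variables (by = NULL)
t_test(mtcars, c("mpg", "hp"))
# one-sided test by group, against a hypothesized difference in means of -2
t_test(mtcars, "mpg", by = "am", mu = -2, alternative = "greater")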

Value

A data frame with test results. The effect size Cohen's d is returned for larger samples (n > 20), while Hedges' g is returned for smaller samples.

Details

Interpretation of effect sizes is based on rules described in effectsize::interpret_cohens_d() and effectsize::interpret_hedges_g(). Use these functions directly to get other interpretations, by providing the returned effect size (Cohen's d or Hedges' g in this case) as argument, e.g. interpret_cohens_d(0.35, rules = "sawilowsky2009").
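
For illustration (requires the effectsize package; the value 0.35 is just the example from the paragraph above, not a result of t_test()):

# default interpretation rules
effectsize::interpret_cohens_d(0.35)
# interpretation based on Sawilowsky (2009)
effectsize::interpret_cohens_d(0.35, rules = "sawilowsky2009")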

Which test to use

The following table provides an overview of which test to use for different types of data. The choice of test depends on the scale of the outcome variable and the number of samples to compare.

Samples          | Scale of Outcome        | Significance Test
-----------------|-------------------------|--------------------------------
1                | binary / nominal        | chi_squared_test()
1                | continuous, not normal  | wilcoxon_test()
1                | continuous, normal      | t_test()
2, independent   | binary / nominal        | chi_squared_test()
2, independent   | continuous, not normal  | mann_whitney_test()
2, independent   | continuous, normal      | t_test()
2, dependent     | binary (only 2x2)       | chi_squared_test(paired = TRUE)
2, dependent     | continuous, not normal  | wilcoxon_test()
2, dependent     | continuous, normal      | t_test(paired = TRUE)
>2, independent  | continuous, not normal  | kruskal_wallis_test()
>2, independent  | continuous, normal      | datawizard::means_by_group()
>2, dependent    | continuous, not normal  | not yet implemented (1)
>2, dependent    | continuous, normal      | not yet implemented (2)

(1) More than two dependent samples are considered as repeated measurements. For ordinal or non-normally distributed outcomes, these samples are usually tested with friedman.test(), which requires the samples in one variable, the groups to compare in another variable, and a third variable indicating the repeated measurements (subject IDs).
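
A minimal sketch of such a call, using hypothetical long-format data (subject IDs as blocks):

set.seed(123)
# hypothetical long-format data: 5 subjects, each measured on 3 occasions
d <- data.frame(
  subject  = factor(rep(1:5, each = 3)),
  occasion = factor(rep(c("t1", "t2", "t3"), times = 5)),
  outcome  = rnorm(15)
)
# outcome in one variable, groups (occasions) in another, subjects as blocks
friedman.test(outcome ~ occasion | subject, data = d)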

(2) More than two dependent samples are considered as repeated measurements. For normally distributed outcomes, these samples are usually tested using an ANOVA for repeated measurements. A more sophisticated approach would be a linear mixed model.
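
As a rough sketch, re-using the hypothetical long-format data d from the sketch above (the mixed model additionally requires the lme4 package):

# repeated-measures ANOVA with base R
summary(aov(outcome ~ occasion + Error(subject / occasion), data = d))
# linear mixed model with a random intercept per subject
lme4::lmer(outcome ~ occasion + (1 | subject), data = d)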

References

  • Bender, R., Lange, S., Ziegler, A. Wichtige Signifikanztests. Dtsch Med Wochenschr 2007; 132: e24–e25

  • du Prel, J.B., Röhrig, B., Hommel, G., Blettner, M. Auswahl statistischer Testverfahren. Dtsch Arztebl Int 2010; 107(19): 343–8

See also

  • t_test() for parametric t-tests of dependent and independent samples.

  • mann_whitney_test() for non-parametric tests of unpaired (independent) samples.

  • wilcoxon_test() for Wilcoxon signed-rank tests, i.e. non-parametric tests of paired (dependent) samples.

  • kruskal_wallis_test() for non-parametric tests with more than two independent samples.

  • chi_squared_test() for chi-squared tests (two categorical variables, dependent and independent).

Examples

data(sleep)
# one-sample t-test
t_test(sleep, "extra")
#> # One Sample t-test
#> 
#>   Data: extra
#>   Group 1: extra (mean = 1.54)
#>   Alternative hypothesis: true mean is not equal to 0
#> 
#>   t = 3.41, Hedges' g = 0.73 (medium effect), df = 19, p = 0.003
#> 
# base R equivalent
t.test(extra ~ 1, data = sleep)
#> 
#> 	One Sample t-test
#> 
#> data:  extra
#> t = 3.413, df = 19, p-value = 0.002918
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#>  0.5955845 2.4844155
#> sample estimates:
#> mean of x 
#>      1.54 
#> 

# two-sample t-test, by group
t_test(mtcars, "mpg", by = "am")
#> # Welch Two Sample t-test
#> 
#>   Data: mpg by am
#>   Group 1: 0 (n = 19, mean = 17.15)
#>   Group 2: 1 (n = 13, mean = 24.39)
#>   Alternative hypothesis: true difference in means is not equal to 0
#> 
#>   t = -3.77, Cohen's d = -1.48 (large effect), df = 18.3, p = 0.001
#> 
# base R equivalent
t.test(mpg ~ am, data = mtcars)
#> 
#> 	Welch Two Sample t-test
#> 
#> data:  mpg by am
#> t = -3.7671, df = 18.332, p-value = 0.001374
#> alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
#> 95 percent confidence interval:
#>  -11.280194  -3.209684
#> sample estimates:
#> mean in group 0 mean in group 1 
#>        17.14737        24.39231 
#> 

# paired t-test
t_test(mtcars, c("mpg", "hp"), paired = TRUE)
#> # Paired t-test
#> 
#>   Data: mpg and hp (mean difference = -126.60)
#>   Alternative hypothesis: true mean is not equal to 0
#> 
#>   t = -9.76, Cohen's d = -1.73 (large effect), df = 31, p < .001
#> 
# base R equivalent
t.test(mtcars$mpg, mtcars$hp, paired = TRUE)
#> 
#> 	Paired t-test
#> 
#> data:  mtcars$mpg and mtcars$hp
#> t = -9.7647, df = 31, p-value = 5.641e-11
#> alternative hypothesis: true mean difference is not equal to 0
#> 95 percent confidence interval:
#>  -153.0385 -100.1552
#> sample estimates:
#> mean difference 
#>       -126.5969 
#>
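
# weighted two-sample t-test: a sketch only, since the weights argument is not
# demonstrated above; the weighting variable "w" is hypothetical (random
# numbers), so no output is reproduced here
set.seed(42)
d <- mtcars
d$w <- runif(nrow(d), 0.5, 1.5)
t_test(d, "mpg", by = "am", weights = "w")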