Significance Testing Of Differences Between Predictions III: Contrasts And Comparisons For Generalized Linear Models
This vignette is the third in a 4-part series:
Significance Testing of Differences Between Predictions I: Contrasts and Pairwise Comparisons
Significance Testing of Differences Between Predictions III: Contrasts and Comparisons for Generalized Linear Models
Contrasts and comparisons for GLM - logistic regression example
We will now show an example for non-Gaussian models. For GLMs (generalized linear models) with (non-Gaussian) link functions, predict_response() always returns predicted values on the response scale. For example, predicted values for logistic regression models are shown as probabilities.
Summary of most important points:
- Predictions (returned by predict_response()) are usually on the response scale. This also holds for regression models other than linear regression. E.g., predictions for logistic regression are presented as probabilities, and for Poisson regression, the average count of events is returned.
- test_predictions() also returns contrasts and comparisons on the response scale by default. This is usually the most intuitive scale. E.g., for a logistic regression model, contrasts are presented as differences between two probabilities (in percentage points).
- It is possible to return contrasts or comparisons on other scales, too, but this is rarely necessary.
Let’s look at a simple example:
library(ggeffects)
set.seed(1234)
dat <- data.frame(
outcome = rbinom(n = 100, size = 1, prob = 0.35),
x1 = as.factor(sample(1:3, size = 100, TRUE, prob = c(0.5, 0.2, 0.3))),
x2 = rnorm(n = 100, mean = 10, sd = 7),
x3 = as.factor(sample(1:4, size = 100, TRUE, prob = c(0.1, 0.4, 0.2, 0.3)))
)
m <- glm(outcome ~ x1 + x2 + x3, data = dat, family = binomial())
predict_response(m, "x1")
#> # Predicted probabilities of outcome
#>
#> x1 | Predicted | 95% CI
#> ---------------------------
#> 1 | 0.15 | 0.03, 0.49
#> 2 | 0.09 | 0.02, 0.40
#> 3 | 0.22 | 0.05, 0.63
#>
#> Adjusted for:
#> * x2 = 10.29
#> * x3 = 1
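As a rough cross-check of what "response scale" means here, the following sketch uses only base R's predict() on the same simulated data. It assumes that the "Adjusted for" values printed above correspond to the mean of x2 and the reference level of x3; the data frame nd is a hypothetical helper, not part of ggeffects.

```r
# Same simulated data and model as in the example above
set.seed(1234)
dat <- data.frame(
  outcome = rbinom(n = 100, size = 1, prob = 0.35),
  x1 = as.factor(sample(1:3, size = 100, TRUE, prob = c(0.5, 0.2, 0.3))),
  x2 = rnorm(n = 100, mean = 10, sd = 7),
  x3 = as.factor(sample(1:4, size = 100, TRUE, prob = c(0.1, 0.4, 0.2, 0.3)))
)
m <- glm(outcome ~ x1 + x2 + x3, data = dat, family = binomial())

# Hypothetical data grid: all levels of x1, with x2 held at its mean and
# x3 at its reference level (assumed to mirror the "Adjusted for" note)
nd <- data.frame(
  x1 = factor(levels(dat$x1), levels = levels(dat$x1)),
  x2 = mean(dat$x2),
  x3 = factor(levels(dat$x3)[1], levels = levels(dat$x3))
)

# Linear predictor (log-odds) and predicted probabilities
eta <- predict(m, newdata = nd, type = "link")
prob <- predict(m, newdata = nd, type = "response")

# The response scale is the inverse link (here: inverse logit) of the link scale
all.equal(unname(plogis(eta)), unname(prob))
```

If the assumption about the adjusted values holds, prob should be close to the predicted probabilities printed above.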
Contrasts and comparisons for categorical focal terms
Contrasts or comparisons, like predictions (see above), are by default on the response scale, i.e. they are represented as differences between probabilities (in percentage points).
p <- predict_response(m, "x1")
test_predictions(p)
#> # Pairwise comparisons
#>
#> x1 | Contrast | 95% CI | p
#> ------------------------------------
#> 1-2 | 0.05 | -0.09, 0.19 | 0.469
#> 1-3 | -0.07 | -0.25, 0.10 | 0.414
#> 2-3 | -0.13 | -0.35, 0.09 | 0.257
#>
#> Contrasts are presented as probabilities (in %-points).
The difference between the predicted probability of x1 = 1 (14.6%) and x1 = 2 (9.3%) is roughly 5.3 percentage points. This difference is not statistically significant (p = 0.469).
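To make the unit explicit, here is a small arithmetic sketch using the two rounded probabilities quoted above. The variable names are made up for illustration. It contrasts the absolute difference (percentage points, which is what the contrast reports) with the relative difference (percent), which is a different quantity.

```r
# Rounded predicted probabilities for x1 = 1 and x1 = 2, taken from the text
p_x1_1 <- 0.146
p_x1_2 <- 0.093

# Absolute difference, expressed in percentage points
diff_pp <- (p_x1_1 - p_x1_2) * 100

# Relative difference, expressed in percent of the x1 = 2 probability
diff_rel <- (p_x1_1 / p_x1_2 - 1) * 100

round(c(percentage_points = diff_pp, relative_percent = diff_rel), 1)
```

The same pair of predictions differs by about 5.3 percentage points, but by roughly 57% in relative terms, which is why the unit matters when reporting contrasts.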
The scale argument in test_predictions() can be used to return contrasts or comparisons on a different scale. For example, scale = "exp" returns exponentiated contrasts. In the output below, the values correspond to the exponentiated response-scale differences (e.g., exp(0.05) ≈ 1.05).
test_predictions(p, scale = "exp")
#> # Pairwise comparisons
#>
#> x1 | Contrast | 95% CI | p
#> -----------------------------------
#> 1-2 | 1.05 | 0.91, 1.22 | 0.469
#> 1-3 | 0.93 | 0.78, 1.11 | 0.414
#> 2-3 | 0.88 | 0.71, 1.10 | 0.257
#>
#> Contrasts are presented on the exponentiated scale.
Contrasts or comparisons can also be represented on the link-scale, in this case as log-odds. To do so, use scale = "link".
test_predictions(p, scale = "link")
#> # Pairwise comparisons
#>
#> x1 | Contrast | 95% CI | p
#> ------------------------------------
#> 1-2 | 0.51 | -0.79, 1.80 | 0.443
#> 1-3 | -0.50 | -1.55, 0.54 | 0.345
#> 2-3 | -1.01 | -2.38, 0.36 | 0.147
#>
#> Contrasts are presented as log-odds.
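The link-scale contrast can be approximated by hand from the predicted probabilities. The sketch below uses the rounded probabilities for x1 = 1 and x1 = 2 quoted earlier (14.6% and 9.3%) and base R's qlogis(), so the result is approximate; the variable names are hypothetical.

```r
# Rounded predicted probabilities for x1 = 1 and x1 = 2
p_x1_1 <- 0.146
p_x1_2 <- 0.093

# qlogis() is the logit function, log(p / (1 - p)), i.e. the log-odds
log_odds_diff <- qlogis(p_x1_1) - qlogis(p_x1_2)

# Exponentiating a difference in log-odds gives an odds ratio
odds_ratio <- exp(log_odds_diff)

round(c(log_odds_diff = log_odds_diff, odds_ratio = odds_ratio), 2)
```

The difference in log-odds comes out at about 0.51, in line with the 1-2 row above, and corresponds to an odds ratio of about 1.67.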
Contrasts and comparisons for numerical focal terms
For numeric focal variables, where the slopes (linear trends) are estimated, transformed scales (like scale = "exp") are not supported. However, scale = "link" can be used to return untransformed contrasts or comparisons on the link-scale.
test_predictions(m, "x2", scale = "link")
#> # (Average) Linear trend for x2
#>
#> Slope | 95% CI | p
#> ---------------------------
#> -0.07 | -0.14, 0.00 | 0.065
#>
#> Slopes are presented as log-odds.
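Because this model contains no interaction or non-linear terms for x2, the link-scale slope should simply be the model coefficient of x2. The sketch below checks this with base R only. Whether the Wald interval from confint.default() matches the interval printed above exactly is an assumption, not something the output confirms.

```r
# Same simulated data and model as in the example above
set.seed(1234)
dat <- data.frame(
  outcome = rbinom(n = 100, size = 1, prob = 0.35),
  x1 = as.factor(sample(1:3, size = 100, TRUE, prob = c(0.5, 0.2, 0.3))),
  x2 = rnorm(n = 100, mean = 10, sd = 7),
  x3 = as.factor(sample(1:4, size = 100, TRUE, prob = c(0.1, 0.4, 0.2, 0.3)))
)
m <- glm(outcome ~ x1 + x2 + x3, data = dat, family = binomial())

# Coefficient of x2: change in log-odds per one-unit increase in x2
slope_x2 <- coef(m)["x2"]

# Wald confidence interval on the link scale (assumed comparable to the output)
ci_x2 <- confint.default(m)["x2", ]

round(c(slope = unname(slope_x2), ci_x2), 2)
```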
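Choose carefully whether and which back-transformation to use, because it affects the resulting p-values. For instance, the contrast 1-2 above has p = 0.469 on the response scale, but p = 0.443 on the link scale. A detailed overview of transformations can be found in this vignette.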
Contrasts and comparisons for different margin options
Like in predict_response(), the margin argument can be used in test_predictions() to define how to marginalize over the non-focal predictors, i.e. those variables that are not specified in terms. Depending on the type of regression model, this choice matters for comparisons or contrasts, because these are differences between predicted values, and the predicted values themselves depend on how the non-focal predictors are handled.
For linear models, these differences are usually the same, regardless of the margin option. However, for non-Gaussian models, differences between predicted values may differ for the different margin options.
# predictions, using mean/mode for non-focal predictors
p1 <- predict_response(m, "x1")
# predictions, averaged across non-focal predictors
p2 <- predict_response(m, "x1", margin = "empirical")
p1
#> # Predicted probabilities of outcome
#>
#> x1 | Predicted | 95% CI
#> ---------------------------
#> 1 | 0.15 | 0.03, 0.49
#> 2 | 0.09 | 0.02, 0.40
#> 3 | 0.22 | 0.05, 0.63
#>
#> Adjusted for:
#> * x2 = 10.29
#> * x3 = 1
p2
#> # Average predicted probabilities of outcome
#>
#> x1 | Predicted | 95% CI
#> ---------------------------
#> 1 | 0.24 | 0.13, 0.38
#> 2 | 0.16 | 0.06, 0.36
#> 3 | 0.34 | 0.18, 0.53
# differences between predicted values
diff(p1$predicted)
#> [1] -0.05258416 0.12700886
diff(p2$predicted)
#> [1] -0.07906904 0.18124204
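The reason the differences change is the non-linear inverse link. The following base R sketch uses made-up numbers (shift, eta) purely for illustration; it does not use the model above. It shows that the same shift on the logit scale produces different probability differences depending on the baseline, and that predicting at an averaged linear predictor is not the same as averaging predictions.

```r
# The same shift in log-odds, applied at two different baselines
shift <- 0.5
d_low <- plogis(-2 + shift) - plogis(-2)  # baseline probability around 0.12
d_mid <- plogis(0 + shift) - plogis(0)    # baseline probability of 0.5
round(c(low_baseline = d_low, mid_baseline = d_mid), 3)

# Hypothetical linear predictors for four observations
eta <- c(-3, -1, 0, 2)

# Prediction at the averaged linear predictor versus average of predictions
at_average <- plogis(mean(eta))
averaged <- mean(plogis(eta))
round(c(at_average = at_average, averaged = averaged), 3)
```

With an identity link both comparisons would agree, which is why the margin option rarely changes contrasts in linear models.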
Consequently, test_predictions() needs to know which margin option to use: either specify the margin argument when a model and the terms argument are provided, or pass the corresponding ggeffects object returned by predict_response().
# contrast refers to predictions, using mean/mode for non-focal predictors
test_predictions(m, "x1")
#> # Pairwise comparisons
#>
#> x1 | Contrast | 95% CI | p
#> ------------------------------------
#> 1-2 | 0.05 | -0.09, 0.19 | 0.469
#> 1-3 | -0.07 | -0.25, 0.10 | 0.414
#> 2-3 | -0.13 | -0.35, 0.09 | 0.257
#>
#> Contrasts are presented as probabilities (in %-points).
# contrast refers to predictions, averaged across non-focal predictors
test_predictions(m, "x1", margin = "empirical")
#> # Pairwise comparisons
#>
#> x1 | Contrast | 95% CI | p
#> ------------------------------------
#> 1-2 | 0.08 | -0.11, 0.27 | 0.417
#> 1-3 | -0.10 | -0.31, 0.11 | 0.353
#> 2-3 | -0.18 | -0.41, 0.05 | 0.125
#>
#> Contrasts are presented as probabilities (in %-points).
# or
test_predictions(p2)
#> # Pairwise comparisons
#>
#> x1 | Contrast | 95% CI | p
#> ------------------------------------
#> 1-2 | 0.08 | -0.11, 0.27 | 0.417
#> 1-3 | -0.10 | -0.31, 0.11 | 0.353
#> 2-3 | -0.18 | -0.41, 0.05 | 0.125
#>
#> Contrasts are presented as probabilities (in %-points).
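For intuition, the averaged predictions can be sketched by hand in base R. This assumes that margin = "empirical" averages predictions over the observed sample, with x1 set to each of its levels in turn for every observation; that reading of the option is an assumption here, and the helper names (avg_pred, nd) are hypothetical.

```r
# Same simulated data and model as in the example above
set.seed(1234)
dat <- data.frame(
  outcome = rbinom(n = 100, size = 1, prob = 0.35),
  x1 = as.factor(sample(1:3, size = 100, TRUE, prob = c(0.5, 0.2, 0.3))),
  x2 = rnorm(n = 100, mean = 10, sd = 7),
  x3 = as.factor(sample(1:4, size = 100, TRUE, prob = c(0.1, 0.4, 0.2, 0.3)))
)
m <- glm(outcome ~ x1 + x2 + x3, data = dat, family = binomial())

# For each level of x1: set x1 to that level for all observations, keep the
# observed x2 and x3, predict probabilities, and average them
avg_pred <- sapply(levels(dat$x1), function(lv) {
  nd <- dat
  nd$x1 <- factor(lv, levels = levels(dat$x1))
  mean(predict(m, newdata = nd, type = "response"))
})

avg_pred
# Differences between adjacent levels (2 vs. 1, 3 vs. 2)
diff(avg_pred)
```

If the assumption holds, avg_pred should be close to the averaged predictions in p2, and the differences close to the contrasts shown above (with the sign depending on the direction of the comparison).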