Compute Goodness-of-fit measures for various regression models, including mixed and Bayesian regression models.
cod(x)

r2(x, ...)

# S3 method for lme
r2(x, n = NULL, ...)

# S3 method for stanreg
r2(x, loo = FALSE, ...)

# S3 method for brmsfit
r2(x, loo = FALSE, ...)
Fitted model of class
Currently not used.
r2(), depending on the model, returns:
For linear models, the r-squared and adjusted r-squared values.
For mixed models, the marginal and conditional r-squared values.
For glm objects, Cox & Snell's and Nagelkerke's pseudo r-squared values.
For stanreg objects, the Bayesian version of r-squared is computed, calling
If loo = TRUE, for stanreg objects a LOO-adjusted version of r-squared is returned.
Models that are not currently supported return
cod() returns the D Coefficient of Discrimination, also known as Tjur's R-squared value.
For linear models, the r-squared and adjusted r-squared values are returned,
as provided by the
For mixed models (from lme4 or glmmTMB) marginal and conditional r-squared values are calculated, based on Nakagawa et al. 2017. The distributional variance (or observation-level variance) is based on lognormal approximation,
For lme-models, an r-squared approximation computed from the correlation between the fitted and observed values, as suggested by Byrnes (2008), is returned, together with a simplified version of the Omega-squared value, 1 - (residual variance / response variance) (Xu 2003; Nakagawa and Schielzeth 2013), unless n is given. If n is given, for lme-models pseudo r-squared measures based on the variances of the random intercept (tau 00, between-group variance) and random slope (tau 11, random-slope variance) are returned, as well as the r-squared statistics proposed by Snijders and Bosker (2012) and the Omega-squared value, 1 - (residual variance full model / residual variance null model), as suggested by Xu (2003).
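The two measures returned when n is not given can be sketched by hand. This is a minimal illustration, not the package's own code: it assumes the nlme package and its Orthodont data, takes the squared correlation of fitted and observed values as the r-squared approximation, and uses the variance of the innermost-level residuals as the "residual variance".

```r
library(nlme)

# Random-intercept model on the Orthodont data shipped with nlme
fit <- lme(distance ~ age + Sex, random = ~ 1 | Subject, data = Orthodont)

observed <- Orthodont$distance
fitted_values <- as.numeric(fitted(fit))   # fitted values at the innermost level
resid_values <- as.numeric(residuals(fit)) # residuals at the innermost level

# r-squared approximation: squared correlation of fitted and observed values
r2_approx <- cor(fitted_values, observed)^2

# simplified Omega-squared: 1 - (residual variance / response variance)
omega2_simple <- 1 - var(resid_values) / var(observed)

c(r2_approx = r2_approx, omega2_simple = omega2_simple)
```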
For generalized linear models, Cox & Snell's and Nagelkerke's pseudo r-squared values are returned.
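As a rough illustration of what these two pseudo r-squared values measure, they can be computed from the log-likelihoods of the fitted model and an intercept-only model. The sketch below uses only base R and the built-in mtcars data; the model and variable names are chosen for the example and do not come from the package.

```r
# Logistic regression on a binary outcome (am: 0 = automatic, 1 = manual)
fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))
null_fit <- update(fit, . ~ 1)  # intercept-only model

n <- nobs(fit)
ll_full <- as.numeric(logLik(fit))
ll_null <- as.numeric(logLik(null_fit))

# Cox & Snell: 1 - (L_null / L_full)^(2 / n)
cox_snell <- 1 - exp(2 * (ll_null - ll_full) / n)

# Nagelkerke: Cox & Snell rescaled by its maximum attainable value
nagelkerke <- cox_snell / (1 - exp(2 * ll_null / n))

c(cox_snell = cox_snell, nagelkerke = nagelkerke)
```

Because Cox & Snell's value cannot reach 1 for discrete outcomes, Nagelkerke's version divides by the upper bound so that the result spans the full 0 to 1 range.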
The ("unadjusted") r-squared value and its standard error for stanreg objects are robust measures, i.e. the median is used to compute r-squared, and the median absolute deviation as the measure of variability. If loo = TRUE, a LOO-adjusted r-squared is calculated, which comes conceptually closer to an adjusted r-squared measure.
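The robust summary described above can be sketched on a vector of posterior r-squared draws. The draws below are simulated stand-ins (a hypothetical r2_draws vector), not output from a fitted stanreg model, so the example only shows how the two summaries are formed.

```r
set.seed(123)

# Hypothetical stand-in for posterior draws of r-squared (one value per draw)
r2_draws <- rbeta(4000, shape1 = 30, shape2 = 70)

# Robust point estimate: the median of the draws
r2_point <- median(r2_draws)

# Robust variability: median absolute deviation
# (stats::mad() scales by 1.4826 by default)
r2_spread <- mad(r2_draws)

c(r2 = r2_point, mad = r2_spread)
```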
This method calculates the Coefficient of Discrimination for generalized linear (mixed) models for binary data. It is an alternative to other pseudo r-squared values like Nagelkerke's R2 or Cox-Snell R2. The Coefficient of Discrimination can be read like any other (pseudo) r-squared value.
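Tjur's D is the difference between the mean predicted probability among the observed events and the mean predicted probability among the observed non-events. A minimal base-R sketch, using the built-in mtcars data rather than the package's example data:

```r
# Logistic regression on a binary outcome (am: 0 = automatic, 1 = manual)
fit <- glm(am ~ wt, data = mtcars, family = binomial(link = "logit"))

# Predicted probabilities for each observation
p_hat <- fitted(fit)

# Tjur's D: mean predicted probability for y = 1 minus that for y = 0
tjur_d <- mean(p_hat[mtcars$am == 1]) - mean(p_hat[mtcars$am == 0])
tjur_d
```

A value near 0 means the model barely separates the two outcome groups; a value near 1 means the predicted probabilities separate them almost perfectly.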
For mixed models, the marginal r-squared considers only the variance
of the fixed effects, while the conditional r-squared takes both
the fixed and random effects into account.
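For a Gaussian model with a single random intercept, the two quantities can be sketched from the variance components directly. This is a simplified illustration assuming lme4 and its sleepstudy data; models with random slopes or non-Gaussian families need the fuller treatment in Nakagawa et al. (2017), which this sketch does not attempt.

```r
library(lme4)

# Random-intercept model (simpler than a random-slope model on purpose)
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Variance explained by the fixed effects: variance of the fixed-effects predictions
var_fixed <- var(as.vector(model.matrix(fit) %*% fixef(fit)))

# Random-intercept and residual variances from the fitted model
vc <- as.data.frame(VarCorr(fit))
var_random <- vc$vcov[vc$grp == "Subject"]
var_resid <- vc$vcov[vc$grp == "Residual"]

total <- var_fixed + var_random + var_resid
r2_marginal <- var_fixed / total                   # fixed effects only
r2_conditional <- (var_fixed + var_random) / total # fixed and random effects

c(marginal = r2_marginal, conditional = r2_conditional)
```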
If n is given, the Pseudo-R2 statistic is the proportion of explained variance in the random effect after adding covariates or predictors to the model, or in short: the proportion of the explained variance in the random effect of the full (conditional) model x compared to the null (unconditional) model n.
The Omega-squared statistic, if n is given, is 1 - the proportion of the residual variance of the full model compared to the null model's residual variance, or in short: the proportion of the residual variation explained by the covariates.
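Both statistics compare variance components of a full model with those of a null model. The sketch below is an illustration under assumptions, not the package's implementation: it uses nlme with the Orthodont data, a hypothetical helper tau00() that reads the random-intercept variance from VarCorr(), and the squared sigma of each fit as its residual variance.

```r
library(nlme)

# Null (unconditional) model and full (conditional) model
null_fit <- lme(distance ~ 1, random = ~ 1 | Subject, data = Orthodont)
full_fit <- lme(distance ~ age + Sex, random = ~ 1 | Subject, data = Orthodont)

# Hypothetical helper: random-intercept variance (tau 00) of an lme fit
tau00 <- function(m) as.numeric(VarCorr(m)["(Intercept)", "Variance"])

# Pseudo-R2: proportion of random-intercept variance explained by the predictors
pseudo_r2_tau00 <- 1 - tau00(full_fit) / tau00(null_fit)

# Omega-squared: 1 - (residual variance full model / residual variance null model)
omega2 <- 1 - full_fit$sigma^2 / null_fit$sigma^2

c(pseudo_r2_tau00 = pseudo_r2_tau00, omega2 = omega2)
```

The pseudo-R2 for tau 00 can turn negative when adding a within-group predictor increases the estimated between-group variance, so a negative value is not necessarily a computational error.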
An alternative way to assess the "goodness-of-fit" is to compare the ICC of the null model with the ICC of the full model (see
Bolker B et al. (2017): GLMM FAQ
Byrnes, J. 2008. Re: Coefficient of determination (R^2) when using lme() (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q2/000713.html)
Kwok OM, Underhill AT, Berry JW, Luo W, Elliott TR, Yoon M. 2008. Analyzing Longitudinal Data with Multilevel Models: An Example with Individuals Living with Lower Extremity Intra-Articular Fractures. Rehabilitation Psychology 53(3): 370-86. doi: 10.1037/a0012765
Nakagawa S, Schielzeth H. 2013. A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution, 4(2):133-142. doi: 10.1111/j.2041-210x.2012.00261.x
Nakagawa S, Johnson P, Schielzeth H. 2017. The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded. J. R. Soc. Interface 14. doi: 10.1098/rsif.2017.0213
Rabe-Hesketh S, Skrondal A. 2012. Multilevel and longitudinal modeling using Stata. 3rd ed. College Station, Tex: Stata Press Publication
Raudenbush SW, Bryk AS. 2002. Hierarchical linear models: applications and data analysis methods. 2nd ed. Thousand Oaks: Sage Publications
Snijders TAB, Bosker RJ. 2012. Multilevel analysis: an introduction to basic and advanced multilevel modeling. 2nd ed. Los Angeles: Sage
Xu, R. 2003. Measuring explained variation in linear mixed effects models. Statist. Med. 22:3527-3541. doi: 10.1002/sim.1572
Tjur T. 2009. Coefficients of determination in logistic regression models - a new proposal: The coefficient of discrimination. The American Statistician, 63(4): 366-372
data(efc)

# Tjur's R-squared value
efc$services <- ifelse(efc$tot_sc_e > 0, 1, 0)
fit <- glm(services ~ neg_c_7 + c161sex + e42dep,
           data = efc, family = binomial(link = "logit"))
cod(fit)
#>
#> R-Squared for (Generalized) Linear (Mixed) Model
#>
#> Tjur's D: 0.023
#>

library(lme4)
fit <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
r2(fit)
#>
#> R-Squared for (Generalized) Linear (Mixed) Model
#>
#> Family : gaussian (identity)
#> Formula: ~Days | Subject Reaction ~ Days NA
#>
#> Marginal R2: 0.279
#> Conditional R2: 0.799
#>

fit <- lm(barthtot ~ c160age + c12hour, data = efc)
r2(fit)
#>
#> R-Squared for (Generalized) Linear (Mixed) Model
#>
#> R-squared: 0.256
#> adjusted R-squared: 0.254
#>

# Pseudo-R-squared values
fit <- glm(services ~ neg_c_7 + c161sex + e42dep,
           data = efc, family = binomial(link = "logit"))
r2(fit)
#>
#> R-Squared for (Generalized) Linear (Mixed) Model
#>
#> Cox & Snell's R-squared: 0.023
#> Nagelkerke's R-squared: 0.030
#>