This function extracts coefficients (and, optionally, standard errors and p-values) from fitted model objects that are stored in a list-variable of a (nested) data frame, and spreads the coefficients into new columns.
spread_coef(data, model.column, model.term, se, p.val, append = TRUE)
A (nested) data frame with a list-variable that contains fitted model objects (see 'Details').
Name or index of the list-variable that contains the fitted model objects.
Optional, name of a model term. If specified, only this model term (including its p-value) will be extracted from each model and added as a new column.
Logical, if TRUE, standard errors for estimates will also be extracted.
Logical, if TRUE, p-values for estimates will also be extracted.
Logical, if TRUE (default), this function returns data with new columns for the model coefficients; else, a new data frame with the model coefficients only is returned.
A data frame with one column for each coefficient of the models that are stored in the list-variable of data; or, if model.term is given, a data frame with that term's estimate.
If se = TRUE or p.val = TRUE, the returned data frame also contains columns for the coefficients' standard errors and p-values, respectively.
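The quantities mentioned above can be read off a fitted model in base R. The following is only a minimal sketch of that idea, not the internals of spread_coef(): it uses the built-in mtcars data, and the ".se" and ".p" column suffixes are a naming choice made for this illustration, not necessarily the names the function produces.

```r
# Fit one linear model on the built-in mtcars data (illustrative only).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# summary()$coefficients is a matrix with one row per term and the
# columns "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
cf <- summary(fit)$coefficients

# Collect the three quantities as named vectors, one element per term.
estimates  <- cf[, "Estimate"]
std.errors <- cf[, "Std. Error"]
p.values   <- cf[, "Pr(>|t|)"]

# Flatten everything into a single one-row data frame. The suffixes
# ".se" and ".p" are assumptions of this sketch.
one.row <- as.data.frame(t(c(
  estimates,
  setNames(std.errors, paste0(names(std.errors), ".se")),
  setNames(p.values, paste0(names(p.values), ".p"))
)))
names(one.row)
```

Each model then contributes exactly one row, which is what makes it possible to place many models side by side in one data frame.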
If append = TRUE, the columns are appended to data, i.e. data is also returned.
This function requires a (nested) data frame (e.g. created by the nest-function of the tidyr-package), where several fitted models are saved in a list-variable (see 'Examples'). Since the models stored in such a list-variable are typically fit with an identical formula, all models have the same dependent and independent variables and only differ in the subsets of data they were fit to. The function then extracts all coefficients from each model and saves each estimate in a new column. The result is a data frame where each row is a model, with each of the model's coefficients in its own column.
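The reshaping described above can be sketched in base R. This is a hedged illustration of the idea only, not the implementation of spread_coef(): a plain list of models stands in for the list-variable, and the built-in mtcars data and the formula are chosen purely for demonstration.

```r
# Split the built-in mtcars data into one subset per number of cylinders.
subsets <- split(mtcars, mtcars$cyl)

# Fit the same formula to every subset, giving a list of models
# (this plays the role of the list-variable in a nested data frame).
models <- lapply(subsets, function(d) lm(mpg ~ wt + hp, data = d))

# Extract the coefficients of each model and stack them row-wise:
# one row per model, one column per coefficient.
coef.wide <- as.data.frame(do.call(rbind, lapply(models, coef)))
coef.wide
```

Because every model shares the same formula, every coefficient vector has the same names, so the rows line up without any further matching.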
if (require("dplyr") && require("tidyr") && require("purrr")) {
  data(efc)

  # create nested data frame, grouped by dependency (e42dep)
  # and fit linear model for each group. These models are
  # stored in the list variable "models".
  model.data <- efc %>%
    filter(!is.na(e42dep)) %>%
    group_by(e42dep) %>%
    nest() %>%
    mutate(
      models = map(data, ~lm(neg_c_7 ~ c12hour + c172code, data = .x))
    )

  # spread coefficients, so we can easily access and compare the
  # coefficients over all models. arguments `se` and `p.val` default
  # to `FALSE`, when `model.term` is not specified
  spread_coef(model.data, models)
  spread_coef(model.data, models, se = TRUE)

  # select only specific model term. `se` and `p.val` default to `TRUE`
  spread_coef(model.data, models, c12hour)

  # spread_coef can be used directly within a pipe-chain
  efc %>%
    filter(!is.na(e42dep)) %>%
    group_by(e42dep) %>%
    nest() %>%
    mutate(
      models = map(data, ~lm(neg_c_7 ~ c12hour + c172code, data = .x))
    ) %>%
    spread_coef(models)

  # spread_coef() makes it easy to generate bootstrapped
  # confidence intervals, using the 'bootstrap()' and 'boot_ci()'
  # functions from the 'sjstats' package, which creates nested
  # data frames of bootstrap replicates
  if (require("sjstats")) {
    efc %>%
      # generate bootstrap replicates
      bootstrap(100) %>%
      # apply lm to all bootstrapped data sets
      mutate(
        models = map(strap, ~lm(neg_c_7 ~ e42dep + c161sex + c172code, data = .x))
      ) %>%
      # spread model coefficient for all 100 models
      spread_coef(models, se = FALSE, p.val = FALSE) %>%
      # compute the CI for all bootstrapped model coefficients
      boot_ci(e42dep, c161sex, c172code)
  }
}
#> Loading required package: tidyr
#>
#> Attaching package: ‘tidyr’
#> The following object is masked from ‘package:sjmisc’:
#>
#> replace_na
#> Loading required package: sjstats
#>       term    conf.low conf.high
#> 1   e42dep  1.32208605  1.808307
#> 2  c161sex -0.02021748  1.244307
#> 3 c172code -0.18926654  0.684883
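To show why a wide table of coefficients is convenient for bootstrapping, here is a base R sketch that computes plain percentile intervals from such a table. It is an assumption-laden illustration: it uses the built-in mtcars data rather than efc, and the percentile method shown here may differ from the interval method that boot_ci() applies.

```r
set.seed(123)  # make the resampling reproducible

# Draw 200 bootstrap replicates of mtcars and fit the same model to each.
# replicate() returns coefficients in columns, so transpose to get
# one row per replicate and one column per coefficient.
boot.coefs <- t(replicate(200, {
  idx <- sample(nrow(mtcars), replace = TRUE)
  coef(lm(mpg ~ wt + hp, data = mtcars[idx, ]))
}))

# Percentile intervals are simply column-wise quantiles of that table.
ci <- t(apply(boot.coefs, 2, quantile, probs = c(0.025, 0.975)))
colnames(ci) <- c("conf.low", "conf.high")
ci
```

With one column per coefficient, any column-wise summary (quantiles, means, standard deviations) can be applied to the replicates without further reshaping.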