Compute nonparametric bootstrap estimate, standard error, confidence intervals and p-value for a vector of bootstrap replicate estimates.
boot_ci(data, ..., method = c("dist", "quantile"), ci.lvl = 0.95)
boot_se(data, ...)
boot_p(data, ...)
boot_est(data, ...)
A data frame that contains the vector with bootstrapped estimates, or directly the vector (see 'Examples').
Optional, unquoted names of variables with bootstrapped estimates.
Required if data is a data frame (and not a vector) and only selected
variables from data should be processed. You may also use functions
like : or tidyselect's select_helpers().
Character vector, indicating if confidence intervals should be
based on bootstrap standard error, multiplied by the value of the
quantile function of the t-distribution (default), or on sample
quantiles of the bootstrapped values. See 'Details' in boot_ci().
May be abbreviated.
Numeric, the level of the confidence intervals.
A data frame with either bootstrap estimate, standard error, the lower and upper confidence intervals or the p-value for all bootstrapped estimates.
The methods require one or more vectors of bootstrap replicate estimates as input.
boot_est() returns the bootstrapped estimate, simply by computing the
mean value of all bootstrap estimates. boot_se() computes the
nonparametric bootstrap standard error by calculating the standard
deviation of the input vector.
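The two definitions above amount to a mean and a standard deviation over the replicate vector. A minimal base-R sketch, assuming a simulated vector standing in for real bootstrap replicates (the names x, est and se are illustrative only, and the package functions return data frames rather than bare numbers):

```r
# Simulated stand-in for a vector of bootstrap replicate estimates
# (hypothetical data, not taken from the efc example)
set.seed(123)
x <- rnorm(100, mean = 1.5, sd = 0.12)

# Bootstrap estimate: mean of the replicate estimates
est <- mean(x)

# Bootstrap standard error: standard deviation of the replicate estimates
se <- sd(x)

c(estimate = est, std.err = se)
```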
boot_ci() uses the mean value of the input vector and its standard
error to calculate the lower and upper confidence limits. For
method = "dist" (the default), it assumes a t-distribution of the
bootstrap replicate estimates:
mean(x) +/- qt(.975, df = length(x) - 1) * sd(x).
For method = "quantile", 95% sample quantiles are used to compute
the confidence intervals: quantile(x, probs = c(.025, .975)).
Use ci.lvl to change the level of the confidence interval.
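Both interval types can be reproduced by hand in base R. The sketch below assumes a simulated replicate vector x and a 90% level. Generalizing the tail probabilities to (1 - ci.lvl) / 2 and (1 + ci.lvl) / 2 is an assumption that matches the 95% formulas given above, not a confirmed detail of the implementation:

```r
# Simulated stand-in for a vector of bootstrap replicate estimates
set.seed(123)
x <- rnorm(100, mean = 1.5, sd = 0.12)
ci.lvl <- 0.90

# Tail probabilities implied by the confidence level (assumed generalization)
probs <- c((1 - ci.lvl) / 2, (1 + ci.lvl) / 2)

# method = "dist": mean +/- t-quantile * bootstrap standard error
ci_dist <- mean(x) + qt(probs, df = length(x) - 1) * sd(x)

# method = "quantile": sample quantiles of the replicate estimates
ci_quant <- unname(quantile(x, probs = probs))

rbind(dist = ci_dist, quantile = ci_quant)
```

The "dist" interval is symmetric around mean(x) by construction, whereas the quantile interval follows the shape of the replicate distribution and can be asymmetric.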
P-values from boot_p() are also based on t-statistics, assuming a
normal distribution.
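The text does not spell out the p-value formula. One plausible reading, shown here purely as an assumption, divides the bootstrap estimate by the bootstrap standard error and looks up a two-sided p-value in a t-distribution with length(x) - 1 degrees of freedom. The vector x is simulated, and t_stat and p_value are illustrative names:

```r
# Simulated stand-in for a vector of bootstrap replicate estimates
set.seed(123)
x <- rnorm(100, mean = 0.43, sd = 0.3)

# t-statistic: bootstrap estimate divided by bootstrap standard error
# (assumed form; not confirmed by the description above)
t_stat <- mean(x) / sd(x)

# Two-sided p-value from the t-distribution
p_value <- 2 * pt(abs(t_stat), df = length(x) - 1, lower.tail = FALSE)
p_value
```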
Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Statist. Med. 2000; 19:1141-1164
bootstrap() to generate nonparametric bootstrap samples.
library(dplyr)
library(purrr)
data(efc)
bs <- bootstrap(efc, 100)
# now run models for each bootstrapped sample
bs$models <- map(bs$strap, ~lm(neg_c_7 ~ e42dep + c161sex, data = .x))
# extract coefficient "dependency" and "gender" from each model
bs$dependency <- map_dbl(bs$models, ~coef(.x)[2])
bs$gender <- map_dbl(bs$models, ~coef(.x)[3])
# get bootstrapped confidence intervals
boot_ci(bs$dependency)
#> term conf.low conf.high
#> 1 x 1.303605 1.776894
# compare with model fit
fit <- lm(neg_c_7 ~ e42dep + c161sex, data = efc)
confint(fit)[2, ]
#> 2.5 % 97.5 %
#> 1.292945 1.796430
# alternative function calls.
boot_ci(bs$dependency)
#> term conf.low conf.high
#> 1 x 1.303605 1.776894
boot_ci(bs, dependency)
#> term conf.low conf.high
#> 1 dependency 1.303605 1.776894
boot_ci(bs, dependency, gender)
#> term conf.low conf.high
#> 1 dependency 1.3036054 1.776894
#> 2 gender -0.1608034 1.028561
boot_ci(bs, dependency, gender, method = "q")
#> term conf.low conf.high
#> 1 dependency 1.3005524 1.737904
#> 2 gender -0.1447726 1.003599
# compare coefficients
mean(bs$dependency)
#> [1] 1.54025
boot_est(bs$dependency)
#> term estimate
#> 1 x 1.54025
coef(fit)[2]
#> e42dep
#> 1.544687
# bootstrap() and boot_ci() work fine within pipe-chains
efc %>%
  bootstrap(100) %>%
  mutate(
    models = map(strap, ~lm(neg_c_7 ~ e42dep + c161sex, data = .x)),
    dependency = map_dbl(models, ~coef(.x)[2])
  ) %>%
  boot_ci(dependency)
#> term conf.low conf.high
#> 1 dependency 1.280782 1.802551
# check p-value
boot_p(bs$gender)
#> term p.value
#> 1 x 0.1508658
summary(fit)$coefficients[3, ]
#> Estimate Std. Error t value Pr(>|t|)
#> 0.4339069 0.2818786 1.5393398 0.1240780
if (FALSE) {
  # 'spread_coef()' from the 'sjmisc'-package makes it easy to generate
  # bootstrapped statistics like confidence intervals or p-values
  library(dplyr)
  library(sjmisc)
  efc %>%
    # generate bootstrap replicates
    bootstrap(100) %>%
    # apply lm to all bootstrapped data sets
    mutate(
      models = map(strap, ~lm(neg_c_7 ~ e42dep + c161sex + c172code, data = .x))
    ) %>%
    # spread model coefficient for all 100 models
    spread_coef(models) %>%
    # compute the CI for all bootstrapped model coefficients
    boot_ci(e42dep, c161sex, c172code)

  # or...
  efc %>%
    # generate bootstrap replicates
    bootstrap(100) %>%
    # apply lm to all bootstrapped data sets
    mutate(
      models = map(strap, ~lm(neg_c_7 ~ e42dep + c161sex + c172code, data = .x))
    ) %>%
    # spread model coefficient for all 100 models
    spread_coef(models, append = FALSE) %>%
    # compute the CI for all bootstrapped model coefficients
    boot_ci()
}