Compute the nonparametric bootstrap estimate, standard error, confidence intervals and p-value for a vector of bootstrap replicate estimates.

boot_ci(data, ..., method = c("dist", "quantile"), ci.lvl = 0.95)

boot_se(data, ...)

boot_p(data, ...)

boot_est(data, ...)

Arguments

data

A data frame that contains the vector with bootstrapped estimates, or the vector itself (see 'Examples').

...

Optional, unquoted names of variables with bootstrapped estimates. Required if data is a data frame (not a vector) and only selected variables from data should be processed. You may also use functions like : or tidyselect's select_helpers().

method

Character vector, indicating whether confidence intervals should be based on the bootstrap standard error, multiplied by the value of the quantile function of the t-distribution ("dist", the default), or on sample quantiles of the bootstrapped values ("quantile"). See 'Details'. May be abbreviated.

ci.lvl

Numeric, the level of the confidence intervals. Defaults to 0.95.
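
As a rough illustration of how a confidence level translates into the tail probabilities of a two-sided interval, consider the sketch below. The helper ci_probs() is hypothetical and not part of the package; it only assumes that ci.lvl is split evenly between both tails.

```r
# Hypothetical helper (not part of the package): map a confidence level
# to the lower and upper tail probabilities of a two-sided interval.
ci_probs <- function(ci.lvl = 0.95) {
  alpha <- 1 - ci.lvl
  c(lower = alpha / 2, upper = 1 - alpha / 2)
}

ci_probs(0.95)  # tail probabilities 0.025 and 0.975
ci_probs(0.90)  # tail probabilities 0.05 and 0.95

# Corresponding t multiplier for 100 replicates (df = 99), as it would
# enter a t-based interval
t_mult <- qt(ci_probs(0.95)[["upper"]], df = 99)
```

A wider level moves both probabilities toward the tails and increases the t multiplier, so the resulting interval becomes wider.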

Value

A data frame with the bootstrap estimate, the standard error, the lower and upper confidence limits, or the p-value for each bootstrapped estimate, depending on the function called.

Details

The methods require one or more vectors of bootstrap replicate estimates as input.

  • boot_est() returns the bootstrapped estimate, simply by computing the mean value of all bootstrap estimates.

  • boot_se() computes the nonparametric bootstrap standard error by calculating the standard deviation of the input vector.

  • boot_ci() uses the mean value of the input vector and its standard error to calculate the lower and upper confidence limits. For method = "dist" (the default), it assumes a t-distribution of the bootstrap estimate replicates: mean(x) +/- qt(.975, df = length(x) - 1) * sd(x). For method = "quantile", it uses 95% sample quantiles: quantile(x, probs = c(.025, .975)). Use ci.lvl to change the level of the confidence interval.

  • P-values from boot_p() are also based on t-statistics, assuming a normal distribution.
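
The quantities described above can be reproduced by hand in base R. The following is a minimal sketch, not the package's implementation: the vector x is simulated and merely stands in for real bootstrap replicates, and the p-value formula (a two-sided t-test of mean(x) / sd(x) against zero) is an assumption, since the text only states that p-values are based on t-statistics.

```r
set.seed(123)
# Simulated stand-in for a vector of bootstrap replicate estimates
x <- rnorm(100, mean = 1.5, sd = 0.12)

est <- mean(x)  # bootstrap estimate: mean of the replicates
se  <- sd(x)    # bootstrap standard error: sd of the replicates

# method = "dist": mean(x) +/- qt(.975, df = length(x) - 1) * sd(x)
ci_dist <- est + c(-1, 1) * qt(0.975, df = length(x) - 1) * se

# method = "quantile": 95% sample quantiles of the replicates
ci_quant <- quantile(x, probs = c(0.025, 0.975))

# Assumed p-value: two-sided test of the t-statistic est / se against zero
t_stat <- est / se
p_val  <- 2 * pt(abs(t_stat), df = length(x) - 1, lower.tail = FALSE)
```

With roughly symmetric replicates, ci_dist and ci_quant tend to be close; they can differ noticeably when the distribution of the replicates is skewed.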

References

Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Statist. Med. 2000; 19:1141-1164

See also

bootstrap() to generate nonparametric bootstrap samples.

Examples

library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union
library(purrr)
#> 
#> Attaching package: ‘purrr’
#> The following object is masked from ‘package:sjmisc’:
#> 
#>     is_empty
data(efc)
bs <- bootstrap(efc, 100)

# now run models for each bootstrapped sample
bs$models <- map(bs$strap, ~lm(neg_c_7 ~ e42dep + c161sex, data = .x))

# extract coefficient "dependency" and "gender" from each model
bs$dependency <- map_dbl(bs$models, ~coef(.x)[2])
bs$gender <- map_dbl(bs$models, ~coef(.x)[3])

# get bootstrapped confidence intervals
boot_ci(bs$dependency)
#>   term conf.low conf.high
#> 1    x 1.303605  1.776894

# compare with model fit
fit <- lm(neg_c_7 ~ e42dep + c161sex, data = efc)
confint(fit)[2, ]
#>    2.5 %   97.5 % 
#> 1.292945 1.796430 

# alternative function calls.
boot_ci(bs$dependency)
#>   term conf.low conf.high
#> 1    x 1.303605  1.776894
boot_ci(bs, dependency)
#>         term conf.low conf.high
#> 1 dependency 1.303605  1.776894
boot_ci(bs, dependency, gender)
#>         term   conf.low conf.high
#> 1 dependency  1.3036054  1.776894
#> 2     gender -0.1608034  1.028561
boot_ci(bs, dependency, gender, method = "q")
#>         term   conf.low conf.high
#> 1 dependency  1.3005524  1.737904
#> 2     gender -0.1447726  1.003599


# compare coefficients
mean(bs$dependency)
#> [1] 1.54025
boot_est(bs$dependency)
#>   term estimate
#> 1    x  1.54025
coef(fit)[2]
#>   e42dep 
#> 1.544687 


# bootstrap() and boot_ci() work fine within pipe-chains
efc %>%
  bootstrap(100) %>%
  mutate(
    models = map(strap, ~lm(neg_c_7 ~ e42dep + c161sex, data = .x)),
    dependency = map_dbl(models, ~coef(.x)[2])
  ) %>%
  boot_ci(dependency)
#>         term conf.low conf.high
#> 1 dependency 1.280782  1.802551

# check p-value
boot_p(bs$gender)
#>   term   p.value
#> 1    x 0.1508658
summary(fit)$coefficients[3, ]
#>   Estimate Std. Error    t value   Pr(>|t|) 
#>  0.4339069  0.2818786  1.5393398  0.1240780 

if (FALSE) {
# 'spread_coef()' from the 'sjmisc'-package makes it easy to generate
# bootstrapped statistics like confidence intervals or p-values
library(dplyr)
library(sjmisc)
efc %>%
  # generate bootstrap replicates
  bootstrap(100) %>%
  # apply lm to all bootstrapped data sets
  mutate(
    models = map(strap, ~lm(neg_c_7 ~ e42dep + c161sex + c172code, data = .x))
  ) %>%
  # spread model coefficient for all 100 models
  spread_coef(models) %>%
  # compute the CI for all bootstrapped model coefficients
  boot_ci(e42dep, c161sex, c172code)

# or...
efc %>%
  # generate bootstrap replicates
  bootstrap(100) %>%
  # apply lm to all bootstrapped data sets
  mutate(
    models = map(strap, ~lm(neg_c_7 ~ e42dep + c161sex + c172code, data = .x))
  ) %>%
  # spread model coefficient for all 100 models
  spread_coef(models, append = FALSE) %>%
  # compute the CI for all bootstrapped model coefficients
  boot_ci()
}