Test and training error from model cross-validation

cv_error() computes the root mean squared error (RMSE) of a linear model fitted to k-fold cross-validated training and test data. cv_compare() does the same for several formulas at once, calling cv_error() for each formula.

cv_error(data, formula, k = 5)

cv_compare(data, formulas, k = 5)

Arguments

data: A data frame.
formula: A formula describing the linear model that is fitted to the training data and evaluated on the test data.
k: The number of folds for the k-fold cross-validation.
formulas: A list of formulas, each describing a linear model that is fitted to the training data and evaluated on the test data.

Value

A data frame with one row per formula and the columns model, train.error and test.error, which hold the root mean squared errors for the training and test data.

Details

cv_error() first generates k cross-validated training-test pairs using crossv_kfold(). It then fits the linear model described in formula to each training set and uses the fitted models to compute predictions for the corresponding test sets. The training error is the mean of the RMSE values of all trained models; the test error is a single RMSE computed from the pooled residuals of all test sets.
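The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the names cv_error_sketch() and cv_compare_sketch() are hypothetical, the folds are drawn with sample() rather than crossv_kfold(), the response is assumed to be a plain (untransformed) variable on the left-hand side of the formula, and the built-in mtcars data stands in for real data.

```r
# Hypothetical helper that mirrors the steps described in Details.
# Assumption: the response is a plain variable name, e.g. mpg ~ wt + hp.
cv_error_sketch <- function(data, formula, k = 5) {
  # Randomly assign each row to one of k folds of near-equal size.
  fold <- sample(rep(seq_len(k), length.out = nrow(data)))
  response <- all.vars(formula)[1]

  train_rmse <- numeric(k)
  test_resid <- vector("list", k)

  for (i in seq_len(k)) {
    train <- data[fold != i, , drop = FALSE]
    test <- data[fold == i, , drop = FALSE]

    # Fit the linear model to the training part of this fold.
    fit <- lm(formula, data = train)

    # RMSE of this model on its own training data.
    train_rmse[i] <- sqrt(mean(residuals(fit)^2))

    # Residuals of this model on the held-out rows.
    test_resid[[i]] <- test[[response]] - predict(fit, newdata = test)
  }

  data.frame(
    model = paste(deparse(formula), collapse = " "),
    # Training error: mean of the per-fold training RMSEs.
    train.error = mean(train_rmse),
    # Test error: one RMSE over the pooled test residuals.
    test.error = sqrt(mean(unlist(test_resid)^2, na.rm = TRUE)),
    stringsAsFactors = FALSE
  )
}

# Hypothetical counterpart to cv_compare(): one call per formula,
# with the one-row results stacked into a single data frame.
cv_compare_sketch <- function(data, formulas, k = 5) {
  do.call(rbind, lapply(formulas, function(f) cv_error_sketch(data, f, k = k)))
}

set.seed(1)
cv_error_sketch(mtcars, mpg ~ wt + hp)
cv_compare_sketch(mtcars, list(mpg ~ wt, mpg ~ wt + hp))
```

Pooling the test residuals before taking the square root means that every held-out observation contributes equally to the test error, whereas the training error averages k separate RMSE values.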

Examples

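# Load the example data set
data(efc)

# Training and test error for a single model.
# Folds are assigned at random, so the values differ between runs.
cv_error(efc, neg_c_7 ~ barthtot + c161sex)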
#>                          model train.error test.error
#> 1 neg_c_7 ~ barthtot + c161sex      3.4813     3.5537

# Training and test errors for several candidate models at once
cv_compare(efc, formulas = list(
  neg_c_7 ~ barthtot + c161sex,
  neg_c_7 ~ barthtot + c161sex + e42dep,
  neg_c_7 ~ barthtot + c12hour
))
#>                                   model train.error test.error
#> 1          neg_c_7 ~ barthtot + c161sex      3.4853     3.5787
#> 2 neg_c_7 ~ barthtot + c161sex + e42dep      3.4485     3.5648
#> 3          neg_c_7 ~ barthtot + c12hour      3.5657     3.1543