Computes the mean, standard deviation (SD) and standard error (SE) of dv for each sub-group indicated by grp.

means_by_group(
  x,
  dv,
  grp,
  weights = NULL,
  digits = 2,
  out = c("txt", "viewer", "browser"),
  encoding = "UTF-8",
  file = NULL
)

grpmean(
  x,
  dv,
  grp,
  weights = NULL,
  digits = 2,
  out = c("txt", "viewer", "browser"),
  encoding = "UTF-8",
  file = NULL
)

Arguments

x

A (grouped) data frame.

dv

Name of the dependent variable, for which the mean value, grouped by grp, is computed.

grp

Factor with the cross-classifying variable, where dv is grouped into the categories represented by grp. Numeric vectors are coerced to factors.

weights

Name of the variable in x that holds the weights applied to all observations. Default is NULL, so no weights are used.

digits

Numeric, number of digits after the decimal point when rounding estimates and values.

out

Character vector, indicating whether the results should be printed to the console (out = "txt"), shown as an HTML table in the viewer pane (out = "viewer") or in the browser (out = "browser"), or plotted (out = "plot", only applies to certain functions). May be abbreviated.

encoding

Character vector, indicating the charset encoding used for variable and value labels. Default is "UTF-8". Only used when out is not "txt".

file

Destination file, if the output should be saved to a file. Only used when out is not "txt".

Value

For non-grouped data frames, means_by_group() returns a data frame with the following columns: term, mean, N, std.dev, std.error and p.value. For grouped data frames, it returns a list of such data frames.

Details

This function performs a one-way ANOVA with dv as dependent and grp as independent variable, by calling lm(dv ~ as.factor(grp)). Then contrast is called to get p-values for each sub-group. The p-values indicate whether each group mean differs significantly from the total mean.
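
The descriptive part and the model fit described here can be approximated with base R alone. The following is a minimal sketch, not the package's internal code: it uses the built-in iris data, assumes the standard error is computed as SD / sqrt(N), and leaves out the contrast step that produces the per-group p-values. All object names (grp_mean, grp_sd, grp_n, grp_se, fit, fit_summary, r2, f_stat) are chosen for illustration only.

```r
# Per-group descriptives with base R (sjstats is not required for this sketch)
grp_mean <- tapply(iris$Sepal.Width, iris$Species, mean)
grp_sd   <- tapply(iris$Sepal.Width, iris$Species, sd)
grp_n    <- tapply(iris$Sepal.Width, iris$Species, length)
grp_se   <- grp_sd / sqrt(grp_n)  # assumption: SE = SD / sqrt(N)

# One-way ANOVA expressed as a linear model with a factor predictor
fit <- lm(Sepal.Width ~ as.factor(Species), data = iris)
fit_summary <- summary(fit)

# Overall model statistics, comparable to the "Anova:" line in the output
r2     <- fit_summary$r.squared
f_stat <- unname(fit_summary$fstatistic["value"])

round(cbind(Mean = grp_mean, N = grp_n, SD = grp_sd, SE = grp_se), 2)
round(c(R2 = r2, F = f_stat), 3)
```

For iris, the group means, R2 and F statistic from this sketch should agree with the means_by_group(iris, Sepal.Width, Species) output shown in the Examples.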

Examples

data(efc)
means_by_group(efc, c12hour, e42dep)
#> 
#> # Grouped Means for average number of hours of care per week by elder's dependency
#> 
#> Category             |  Mean |   N |    SD |   SE |      p
#> ----------------------------------------------------------
#> independent          |  9.91 |  66 |  8.01 | 0.99 | < .001
#> slightly dependent   | 17.54 | 225 | 17.74 | 1.18 | < .001
#> moderately dependent | 34.52 | 306 | 41.54 | 2.37 |  0.983
#> severely dependent   | 75.90 | 304 | 61.72 | 3.54 | < .001
#> Total                | 42.44 | 901 | 50.82 | 1.69 |       
#> 
#> Anova: R2=0.245; adj.R2=0.242; F=96.908; p=0.000

data(iris)
means_by_group(iris, Sepal.Width, Species)
#> 
#> # Grouped Means for Sepal.Width by Species
#> 
#> Category   | Mean |   N |   SD |   SE |      p
#> ----------------------------------------------
#> setosa     | 3.43 |  50 | 0.38 | 0.05 | < .001
#> versicolor | 2.77 |  50 | 0.31 | 0.04 | < .001
#> virginica  | 2.97 |  50 | 0.32 | 0.05 |  0.035
#> Total      | 3.06 | 150 | 0.44 | 0.04 |       
#> 
#> Anova: R2=0.401; adj.R2=0.393; F=49.160; p=0.000

# also works for grouped data frames
if (require("dplyr")) {
  efc %>%
    group_by(c172code) %>%
    means_by_group(c12hour, e42dep)
}
#> 
#> Grouped by:
#> carer's level of education: low level of education
#> 
#> # Grouped Means for average number of hours of care per week by elder's dependency
#> 
#> Category             |  Mean |   N |    SD |   SE |      p
#> ----------------------------------------------------------
#> independent          | 16.33 |  12 | 10.74 | 3.10 |  0.024
#> slightly dependent   | 15.38 |  42 |  9.55 | 1.47 | < .001
#> moderately dependent | 42.05 |  61 | 46.53 | 5.96 |  0.696
#> severely dependent   | 85.52 |  65 | 56.42 | 7.00 | < .001
#> Total                | 49.81 | 180 | 52.24 | 3.89 |       
#> 
#> Anova: R2=0.307; adj.R2=0.295; F=25.955; p=0.000
#> 
#> 
#> Grouped by:
#> carer's level of education: intermediate level of education
#> 
#> # Grouped Means for average number of hours of care per week by elder's dependency
#> 
#> Category             |  Mean |   N |    SD |   SE |      p
#> ----------------------------------------------------------
#> independent          |  7.96 |  45 |  3.91 | 0.58 | < .001
#> slightly dependent   | 17.12 | 135 | 16.52 | 1.42 | < .001
#> moderately dependent | 33.55 | 163 | 41.05 | 3.22 |  0.753
#> severely dependent   | 79.71 | 163 | 63.13 | 4.94 | < .001
#> Total                | 41.76 | 506 | 51.42 | 2.29 |       
#> 
#> Anova: R2=0.284; adj.R2=0.280; F=66.374; p=0.000
#> 
#> 
#> Grouped by:
#> carer's level of education: high level of education
#> 
#> # Grouped Means for average number of hours of care per week by elder's dependency
#> 
#> Category             |  Mean |   N |    SD |   SE |      p
#> ----------------------------------------------------------
#> independent          | 15.20 |   5 | 18.43 | 8.24 |  0.363
#> slightly dependent   | 18.08 |  39 | 12.98 | 2.08 |  0.146
#> moderately dependent | 28.42 |  62 | 35.64 | 4.53 |  0.670
#> severely dependent   | 63.38 |  50 | 62.69 | 8.87 | < .001
#> Total                | 36.62 | 156 | 46.38 | 3.71 |       
#> 
#> Anova: R2=0.167; adj.R2=0.151; F=10.155; p=0.000
#> 
#> 

# weighting
efc$weight <- abs(rnorm(n = nrow(efc), mean = 1, sd = .5))
means_by_group(efc, c12hour, e42dep, weights = weight)
#> 
#> # Grouped Means for average number of hours of care per week by elder's dependency
#> 
#> Category             |  Mean |   N |    SD |   SE |      p
#> ----------------------------------------------------------
#> independent          | 10.11 |  69 |  8.18 | 1.01 | < .001
#> slightly dependent   | 18.45 | 218 | 20.62 | 1.37 | < .001
#> moderately dependent | 35.08 | 312 | 42.33 | 2.42 |  0.874
#> severely dependent   | 75.07 | 307 | 62.00 | 3.56 | < .001
#> Total                | 42.73 | 901 | 51.19 | 1.71 |       
#> 
#> Anova: R2=0.228; adj.R2=0.226; F=88.462; p=0.000