This function prints basic descriptive statistics, including variable labels.

descr(
  x,
  ...,
  max.length = NULL,
  weights = NULL,
  show = "all",
  out = c("txt", "viewer", "browser"),
  encoding = "UTF-8",
  file = NULL
)

Arguments

x

A vector or a data frame. May also be a grouped data frame (see 'Note' and 'Examples').

...

Optional, unquoted names of variables to select for further processing. Required if x is a data frame (not a vector) and only selected variables from x should be processed. You may also use functions like : or tidyselect's select-helpers. See 'Examples' or the package vignette.

max.length

Numeric, indicating the maximum length of variable labels in the output. If variable labels are longer than max.length, they will be shortened to the last whole word within the first max.length chars.
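
The truncation rule can be sketched in base R. The helper shorten_label() below is hypothetical and not part of the package; it only illustrates cutting a label at the last whole word within the first max.length characters and appending "...", as the shortened labels appear in the 'Examples' output. The actual implementation may differ in details.

```r
# Hypothetical helper, for illustration only (not the package's code)
shorten_label <- function(label, max.length) {
  # labels that already fit are returned unchanged
  if (nchar(label) <= max.length) return(label)
  head <- substr(label, 1, max.length)
  next_char <- substr(label, max.length + 1, max.length + 1)
  # if the cut falls inside a word, drop the partial word at the end
  if (next_char != " " && grepl(" ", head, fixed = TRUE)) {
    head <- sub("\\s+\\S*$", "", head)
  }
  paste0(trimws(head), "...")
}

shorten_label("average number of hours of care per week", 20)
# "average number of..."
shorten_label("short", 20)
# "short"
```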

weights

Bare name, or name as string, of a variable in x that contains the weights to apply to all observations. Default is NULL, so no weights are used.
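
As a rough illustration of what weighting observations means, the base R sketch below compares an unweighted and a weighted mean on hypothetical toy data. It uses stats::weighted.mean() and is not meant to show how descr() applies weights internally.

```r
# Hypothetical toy data (not from 'efc')
x <- c(1, 2, 3, 4)
w <- c(1, 1, 1, 3)   # the last observation counts three times as much

unweighted <- mean(x)              # (1 + 2 + 3 + 4) / 4 = 2.5
weighted   <- weighted.mean(x, w)  # (1 + 2 + 3 + 4 * 3) / 6 = 3

print(c(unweighted = unweighted, weighted = weighted))
```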

show

Character vector, indicating which information (columns) describing the data should be returned. May be one or more of "type", "label", "n", "NA.prc", "mean", "sd", "se", "md", "trimmed", "range", "iqr", "skew". There are two shortcuts: show = "all" (default) shows all information, show = "short" shows only n, missing percentage, mean and standard deviation.
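
To make the column names more concrete, the following base R sketch computes several of these statistics for a hypothetical toy vector. It assumes the usual definitions (for example, se as sd / sqrt(n), and NA.prc as the percentage of missing values among all observations); rounding and exact formulas used by descr() may differ.

```r
# Hypothetical toy vector with one missing value (not from 'efc')
x <- c(2, 4, 4, 5, 7, NA, 9, 10)

valid <- x[!is.na(x)]
n <- length(valid)                  # number of non-missing observations

stats <- data.frame(
  n      = n,
  NA.prc = round(100 * mean(is.na(x)), 2),   # percentage of missing values
  mean   = round(mean(valid), 2),
  sd     = round(sd(valid), 2),
  se     = round(sd(valid) / sqrt(n), 2),    # standard error of the mean
  md     = median(valid),
  iqr    = IQR(valid)                        # interquartile range
)
print(stats)
```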

out

Character vector, indicating whether the results should be printed to the console (out = "txt") or shown as an HTML table in the viewer pane (out = "viewer") or browser (out = "browser").

encoding

Character vector, indicating the charset encoding used for variable and value labels. Default is "UTF-8". Only used when out is not "txt".

file

Destination file, if the output should be saved to a file. Only used when out is not "txt".

Value

A data frame with basic descriptive statistics.

Note

x may also be a grouped data frame (see group_by) with up to two grouping variables. In that case, a descriptive table is created for each subgroup.
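
The idea of one table per subgroup can be sketched in base R with split() and lapply(). The data frame and column names below are hypothetical, and the sketch only mimics the per-group structure, not the actual output of descr().

```r
# Hypothetical toy data (not from 'efc')
df <- data.frame(
  sex   = c("male", "male", "female", "female", "female"),
  score = c(1, 3, 2, 4, 6)
)

# one small summary table per subgroup
per_group <- lapply(split(df$score, df$sex), function(v) {
  data.frame(n = length(v), mean = mean(v), sd = sd(v))
})

per_group$female
per_group$male
```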

Examples

library(sjmisc)
data(efc)
descr(efc, e17age, c160age)
#> 
#> ## Basic descriptive statistics
#> 
#>      var    type      label   n NA.prc  mean    sd   se md trimmed       range
#>   e17age numeric elder' age 891   1.87 79.12  8.09 0.27 79   79.05 38 (65-103)
#>  c160age numeric carer' age 901   0.77 53.46 13.35 0.44 54   53.68  71 (18-89)
#>  iqr  skew
#>   12  0.06
#>   19 -0.14

efc$weights <- abs(rnorm(nrow(efc), 1, .3))
descr(efc, c12hour, barthtot, weights = weights)
#> 
#> ## Basic descriptive statistics
#> 
#>       var    type                                    label   n NA.prc  mean
#>   c12hour numeric average number of hours of care per week 888   0.66 42.02
#>  barthtot numeric                Total score BARTHEL INDEX 872   2.75 64.65
#>     sd   se       range   iqr  skew
#>  50.58 1.68 164 (4-168) 32.75  1.66
#>  29.81 1.00 100 (0-100) 45.00 -0.73

library(dplyr)
efc %>% select(e42dep, e15relat, c172code) %>% descr()
#> 
#> ## Basic descriptive statistics
#> 
#>       var    type                      label   n NA.prc mean   sd   se md
#>    e42dep numeric         elder's dependency 901   0.77 2.94 0.94 0.03  3
#>  e15relat numeric      relationship to elder 901   0.77 2.85 2.08 0.07  2
#>  c172code numeric carer's level of education 842   7.27 1.97 0.63 0.02  2
#>  trimmed   range iqr  skew
#>     3.02 3 (1-4)   2 -0.42
#>     2.44 7 (1-8)   2  1.56
#>     1.96 2 (1-3)   0  0.02

# show just a few elements
efc %>% select(e42dep, e15relat, c172code) %>% descr(show = "short")
#> 
#> ## Basic descriptive statistics
#> 
#>       var   n NA.prc mean   sd
#>    e42dep 901   0.77 2.94 0.94
#>  e15relat 901   0.77 2.85 2.08
#>  c172code 842   7.27 1.97 0.63

# with grouped data frames
efc %>%
  group_by(e16sex) %>%
  select(e16sex, e42dep, e15relat, c172code) %>%
  descr()
#> 
#> ## Basic descriptive statistics
#> 
#> 
#> Grouped by: male
#> 
#>       var    type                      label   n NA.prc mean   sd   se md
#>    e42dep numeric         elder's dependency 295   0.34 2.92 0.93 0.05  3
#>  e15relat numeric      relationship to elder 296   0.00 2.32 1.93 0.11  2
#>  c172code numeric carer's level of education 279   5.74 1.87 0.65 0.04  2
#>  trimmed   range iqr  skew
#>     3.00 3 (1-4)   2 -0.44
#>     1.86 7 (1-8)   1  1.98
#>     1.84 2 (1-3)   1  0.14
#> 
#> 
#> Grouped by: female
#> 
#>       var    type                      label   n NA.prc mean   sd   se md
#>    e42dep numeric         elder's dependency 605   0.00 2.95 0.94 0.04  3
#>  e15relat numeric      relationship to elder 604   0.17 3.11 2.11 0.09  2
#>  c172code numeric carer's level of education 562   7.11 2.02 0.61 0.03  2
#>  trimmed   range iqr  skew
#>     3.03 3 (1-4)   2 -0.42
#>     2.74 7 (1-8)   2  1.45
#>     2.03 2 (1-3)   0 -0.01

# you can also select variables inside 'descr()'
efc %>%
  group_by(e16sex, c172code) %>%
  descr(e16sex, c172code, e17age, c160age)
#> New names:
#> * c172code -> c172code...1
#> * e16sex -> e16sex...2
#> * e16sex -> e16sex...3
#> * c172code -> c172code...4
#> * e16sex -> e16sex...5
#> * ...
#> 
#> ## Basic descriptive statistics
#> 
#>           var    type                      label   n NA.prc  mean    sd   se md
#>  c172code...1 numeric carer's level of education 842   7.27  1.97  0.63 0.02  2
#>    e16sex...2 numeric             elder's gender 901   0.77  1.67  0.47 0.02  2
#>    e16sex...3 numeric             elder's gender 901   0.77  1.67  0.47 0.02  2
#>  c172code...4 numeric carer's level of education 842   7.27  1.97  0.63 0.02  2
#>    e16sex...5 numeric             elder's gender 901   0.77  1.67  0.47 0.02  2
#>  c172code...6 numeric carer's level of education 842   7.27  1.97  0.63 0.02  2
#>        e17age numeric                 elder' age 891   1.87 79.12  8.09 0.27 79
#>    e16sex...8 numeric             elder's gender 901   0.77  1.67  0.47 0.02  2
#>  c172code...9 numeric carer's level of education 842   7.27  1.97  0.63 0.02  2
#>       c160age numeric                 carer' age 901   0.77 53.46 13.35 0.44 54
#>  trimmed       range iqr  skew
#>     1.96     2 (1-3)   0  0.02
#>     1.71     1 (1-2)   1 -0.73
#>     1.71     1 (1-2)   1 -0.73
#>     1.96     2 (1-3)   0  0.02
#>     1.71     1 (1-2)   1 -0.73
#>     1.96     2 (1-3)   0  0.02
#>    79.05 38 (65-103)  12  0.06
#>     1.71     1 (1-2)   1 -0.73
#>     1.96     2 (1-3)   0  0.02
#>    53.68  71 (18-89)  19 -0.14

# or even use select-helpers
descr(efc, contains("cop"), max.length = 20)
#> 
#> ## Basic descriptive statistics
#> 
#>      var    type                   label   n NA.prc mean   sd   se md trimmed
#>  c82cop1 numeric do you feel you cope... 901   0.77 3.12 0.58 0.02  3    3.15
#>  c83cop2 numeric          do you find... 902   0.66 2.02 0.72 0.02  2    1.98
#>  c84cop3 numeric      does caregiving... 902   0.66 1.63 0.87 0.03  1    1.47
#>  c85cop4 numeric does caregiving have... 898   1.10 1.77 0.87 0.03  2    1.63
#>  c86cop5 numeric      does caregiving... 902   0.66 1.39 0.67 0.02  1    1.26
#>  c87cop6 numeric      does caregiving... 900   0.88 1.29 0.64 0.02  1    1.13
#>  c88cop7 numeric  do you feel trapped... 900   0.88 1.92 0.91 0.03  2    1.80
#>  c89cop8 numeric          do you feel... 901   0.77 2.16 1.04 0.03  2    2.08
#>  c90cop9 numeric          do you feel... 888   2.20 2.93 0.96 0.03  3    3.02
#>    range iqr  skew
#>  3 (1-4)   0 -0.12
#>  3 (1-4)   0  0.65
#>  3 (1-4)   1  1.31
#>  3 (1-4)   1  1.06
#>  3 (1-4)   1  1.77
#>  3 (1-4)   0  2.43
#>  3 (1-4)   1  0.83
#>  3 (1-4)   2  0.32
#>  3 (1-4)   2 -0.45