prop() calculates the proportion of a value or category in a variable. props() does the same, but allows multiple logical conditions in one statement. Both are similar to mean() with logical predicates; however, prop() and props() also work with grouped data frames.

prop(data, ..., weights = NULL, na.rm = TRUE, digits = 4)

props(data, ..., na.rm = TRUE, digits = 4)

Arguments

data

A data frame. May also be a grouped data frame (see 'Examples').

...

One or more value pairs of comparisons (logical predicates). Put variable names on the left-hand side and values to match on the right-hand side. Expressions may be quoted or unquoted. See 'Examples'.

weights

Vector of weights applied to all observations (prop() only). Must have the same length as the input vector. Default is NULL, so no weights are used.

na.rm

Logical, whether to remove NA values from the vector before the proportion is calculated. na.rm = FALSE gives the raw proportion of a value in a vector (missing values stay in the denominator); na.rm = TRUE gives the valid proportion (missing values are excluded).

digits

Number of decimal places for returned values.
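
The difference between the raw and the valid proportion controlled by na.rm can be sketched in base R. This is only an illustration using mean() with a logical predicate on a hypothetical toy vector; it is not the package's internal code.

```r
# Toy vector with one missing value (hypothetical data, not taken from efc)
x <- c(1, 2, 1, NA, 3, 1)

# Valid proportion: the NA is dropped, so the denominator is 5 (3 / 5 = 0.6)
valid <- mean(x == 1, na.rm = TRUE)

# Raw proportion: the NA stays in the denominator, which is 6 (3 / 6 = 0.5)
raw <- sum(x == 1, na.rm = TRUE) / length(x)

valid
raw
```

The gap between the two values grows with the share of missing values in the variable.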

Value

For one condition, a numeric value with the proportion of the values inside a vector. For more than one condition, a data frame with one column of conditions and one column with proportions. For grouped data frames, returns a data frame with one column per group with grouping categories, followed by one column with proportions per condition.
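
The grouped case described above can be sketched as follows. This is a minimal illustration, assuming the dplyr package is installed (it is used here only for select() and group_by()) and that the efc sample data contains the variables c161sex and c172code. Output is omitted because it was not verified for this page.

```r
library(sjstats)
library(dplyr) # assumption: dplyr is available for select() and group_by()

data(efc)

# One grouping variable, one condition:
# expected to return one row per category of e16sex
by_sex <- efc |>
  group_by(e16sex) |>
  prop(e42dep > 2)

# Two grouping variables, two conditions:
# expected to return one proportion column per condition
by_carer <- efc |>
  select(e42dep, c161sex, c172code, e16sex) |>
  group_by(c161sex, c172code) |>
  prop(e42dep > 2, e16sex == 1)

by_sex
by_carer
```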

Details

prop() only allows one logical statement per comparison, while props() allows multiple logical statements per comparison. However, prop() supports weighting of variables before calculating proportions, and its comparisons may also be quoted. Hence, prop() also accepts comparisons that are passed as a character vector (see 'Examples').

Examples

data(efc)

# proportion of value 1 in e42dep
prop(efc, e42dep == 1)
#> [1] 0.0733

# expression may also be completely quoted
prop(efc, "e42dep == 1")
#> [1] 0.0733

# use "props()" for multiple logical statements
props(efc, e17age > 70 & e17age < 80)
#> [1] 0.3199

# proportion of value 1 in e42dep, and of all values greater
# than 2 in e42dep, including missing values. Returns a data frame.
prop(efc, e42dep == 1, e42dep > 2, na.rm = FALSE)
#>   condition   prop
#> 1 e42dep==1 0.0727
#> 2  e42dep>2 0.6718

# for factors or character vectors, use quoted or unquoted values
library(datawizard)
#> 
#> Attaching package: ‘datawizard’
#> The following object is masked from ‘package:sjstats’:
#> 
#>     means_by_group
# convert numeric to factor, using labels as factor levels
efc$e16sex <- to_factor(efc$e16sex)
efc$n4pstu <- to_factor(efc$n4pstu)

# get proportion of female older persons
prop(efc, e16sex == female)
#> [1] 0.6715

# get proportion of male older persons
prop(efc, e16sex == "male")
#> [1] 0.3285

# "props()" needs quotes around non-numeric factor levels
props(efc,
  e17age > 70 & e17age < 80,
  n4pstu == 'Care Level 1' | n4pstu == 'Care Level 3'
)
#>                               condition   prop
#> 1                   e17age>70&e17age<80 0.3199
#> 2 n4pstu==CareLevel1|n4pstu==CareLevel3 0.3137

# also works with pipe-chains
efc |> prop(e17age > 70)
#> [1] 0.8092
# e16sex was converted to a factor above, so comparing it with 1 matches nothing
efc |> prop(e17age > 70, e16sex == 1)
#>   condition   prop
#> 1 e17age>70 0.8092
#> 2 e16sex==1 0.0000