prop() calculates the proportion of a value or category in a variable. props() does the same, but allows multiple logical conditions in one statement. Both are similar to mean() with logical predicates; however, prop() and props() also work with grouped data frames.
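The comparison with mean() can be illustrated in base R. This is a minimal sketch on a made-up vector `x` (hypothetical data, not part of the efc sample) and is not taken from the package source; it only shows why the mean of a logical predicate is a proportion.

```r
# Toy vector with one missing value (hypothetical data, not from efc)
x <- c(1, 2, 1, 3, NA, 1, 2, 4)

# x == 1 is a logical vector; its mean is the share of TRUE values.
# na.rm = TRUE drops the NA, so the denominator is the 7 valid values.
p <- round(mean(x == 1, na.rm = TRUE), 4)
p
#> [1] 0.4286
```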
prop(data, ..., weights = NULL, na.rm = TRUE, digits = 4)
props(data, ..., na.rm = TRUE, digits = 4)
data
A data frame. May also be a grouped data frame (see 'Examples').
...
One or more value pairs of comparisons (logical predicates). Put variable names on the left-hand side and values to match on the right-hand side. Expressions may be quoted or unquoted. See 'Examples'.
weights
Vector of weights that will be applied to weight all observations. Must be a vector of the same length as the input vector. Default is NULL, so no weights are used.
na.rm
Logical, whether to remove NA values from the vector when the proportion is calculated. na.rm = FALSE gives you the raw percentage of a value in a vector, na.rm = TRUE the valid percentage.
digits
Number of digits for returned values.
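The difference between the raw and the valid percentage can be sketched in base R. The vector `x` below is made up for illustration. The sketch assumes the raw percentage divides by all observations, including missing values, and the valid percentage divides by non-missing observations only. That reading matches the example output (0.0727 with na.rm = FALSE versus 0.0733 by default), but it is not taken from the package source.

```r
# Toy vector with one missing value (hypothetical data, not from efc)
x <- c(1, 2, 1, 3, NA, 1, 2, 4)

# Number of observations equal to 1 (the NA comparison is ignored)
hits <- sum(x == 1, na.rm = TRUE)

# Raw proportion: denominator counts every observation, NA included
raw_prop <- round(hits / length(x), 4)         # 3 / 8

# Valid proportion: denominator counts only non-missing observations
valid_prop <- round(hits / sum(!is.na(x)), 4)  # 3 / 7

c(raw = raw_prop, valid = valid_prop)
```

The raw proportion can never exceed the valid one, because missing values only enlarge the denominator.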
For one condition, a numeric value with the proportion of the values inside a vector. For more than one condition, a data frame with one column of conditions and one column with proportions. For grouped data frames, returns a data frame with one column per group with grouping categories, followed by one column with proportions per condition.
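The per-group return shape can be approximated in base R. This is a rough sketch on a hypothetical data frame `df` using aggregate(); the column names and layout that prop() returns for grouped data frames may differ.

```r
# Hypothetical data: two groups, one missing score
df <- data.frame(
  group = c("a", "a", "a", "b", "b", "b", "b"),
  score = c(1, 3, 4, 2, 5, NA, 1)
)

# One row per group; the formula interface drops rows with NA by default,
# so each proportion is computed on the valid values within its group
res <- aggregate(
  score ~ group,
  data = df,
  FUN = function(v) round(mean(v > 2), 4)
)
res
```

Here the `score` column of `res` holds the proportion of values greater than 2 within each group.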
prop() only allows one logical statement per comparison, while props() allows multiple logical statements per comparison. However, prop() supports weighting of variables before calculating proportions, and comparisons may also be quoted. Hence, prop() also processes comparisons that are passed as a character vector (see 'Examples').
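How a comparison passed as a character string could be evaluated against a data frame can be sketched in base R with parse() and eval(). The data frame `df` and the string `cond` are hypothetical, and this illustrates the general technique only. It is not a description of how prop() is implemented.

```r
# Hypothetical data with one missing value
df <- data.frame(score = c(1, 3, 4, 2, 5, NA, 1))

# A comparison supplied as a character string
cond <- "score > 2"

# Turn the string into an unevaluated call, then evaluate it
# with the data frame's columns in scope
expr <- parse(text = cond)[[1]]
hits <- eval(expr, envir = df)

# Proportion of valid observations that satisfy the condition
p <- round(mean(hits, na.rm = TRUE), 4)
p
#> [1] 0.5
```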
data(efc)
# proportion of value 1 in e42dep
prop(efc, e42dep == 1)
#> [1] 0.0733
# expression may also be completely quoted
prop(efc, "e42dep == 1")
#> [1] 0.0733
# use "props()" for multiple logical statements
props(efc, e17age > 70 & e17age < 80)
#> [1] 0.3199
# proportion of value 1 in e42dep, and all values greater
# than 2 in e42dep, including missing values. will return a data frame
prop(efc, e42dep == 1, e42dep > 2, na.rm = FALSE)
#> condition prop
#> 1 e42dep==1 0.0727
#> 2 e42dep>2 0.6718
# for factors or character vectors, use quoted or unquoted values
library(datawizard)
#>
#> Attaching package: ‘datawizard’
#> The following object is masked from ‘package:sjstats’:
#>
#> means_by_group
# convert numeric to factor, using labels as factor levels
efc$e16sex <- to_factor(efc$e16sex)
efc$n4pstu <- to_factor(efc$n4pstu)
# get proportion of female older persons
prop(efc, e16sex == female)
#> [1] 0.6715
# get proportion of male older persons
prop(efc, e16sex == "male")
#> [1] 0.3285
# "props()" needs quotes around non-numeric factor levels
props(efc,
  e17age > 70 & e17age < 80,
  n4pstu == 'Care Level 1' | n4pstu == 'Care Level 3'
)
#> condition prop
#> 1 e17age>70&e17age<80 0.3199
#> 2 n4pstu==CareLevel1|n4pstu==CareLevel3 0.3137
# also works with pipe-chains
efc |> prop(e17age > 70)
#> [1] 0.8092
efc |> prop(e17age > 70, e16sex == 1)
#> condition prop
#> 1 e17age>70 0.8092
#> 2 e16sex==1 0.0000