Weighted statistics for variables

weighted_sd(), weighted_se(), weighted_mean() and weighted_median() compute weighted standard deviation, standard error, mean or median for a variable or for all variables of a data frame. survey_median() computes the median for a variable in a survey-design (see svydesign). weighted_correlation() computes a weighted correlation for a two-sided alternative hypothesis.
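To make the quantities concrete, here is a minimal base-R sketch of a weighted mean, standard deviation and median. The helper names (`sketch_wmean`, `sketch_wsd`, `sketch_wmedian`) are hypothetical, and the formulas are common textbook definitions assumed for illustration; they are not necessarily the ones this package uses internally.

```r
# Hypothetical helpers; names and formulas are assumptions for illustration.
sketch_wmean <- function(x, w) sum(w * x) / sum(w)

# Weighted variance with a frequency-weight style denominator, sum(w) - 1.
sketch_wsd <- function(x, w) {
  m <- sketch_wmean(x, w)
  sqrt(sum(w * (x - m)^2) / (sum(w) - 1))
}

# Weighted median: smallest value whose cumulative weight reaches half the total.
sketch_wmedian <- function(x, w) {
  o <- order(x)
  cw <- cumsum(w[o])
  x[o][which(cw >= sum(w) / 2)[1]]
}

x <- c(2, 4, 4, 7, 10)
w <- c(1, 2, 1, 3, 1)

sketch_wmean(x, w)    # agrees with stats::weighted.mean(x, w)
sketch_wsd(x, w)      # agrees with sd(rep(x, times = w)) for integer weights
sketch_wmedian(x, w)
```

With integer weights, the first two helpers behave as if each observation were repeated `w` times. Other conventions exist, in particular for the variance denominator and for how ties are handled in the median.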

Weighted tests

weighted_ttest() computes a weighted t-test, while weighted_mannwhitney() computes a weighted Mann-Whitney-U test or a Kruskal-Wallis test (for more than two groups). weighted_chisqtest() computes a weighted Chi-squared test for contingency tables.

survey_median(x, design)

weighted_chisqtest(data, ...)

# S3 method for default
weighted_chisqtest(data, x, y, weights, ...)

# S3 method for formula
weighted_chisqtest(formula, data, ...)

weighted_correlation(data, ...)

# S3 method for default
weighted_correlation(data, x, y, weights, ci.lvl = 0.95, ...)

# S3 method for formula
weighted_correlation(formula, data, ci.lvl = 0.95, ...)

weighted_mean(x, weights = NULL)

weighted_median(x, weights = NULL)

weighted_mannwhitney(data, ...)

# S3 method for default
weighted_mannwhitney(data, x, grp, weights, ...)

# S3 method for formula
weighted_mannwhitney(formula, data, ...)

weighted_sd(x, weights = NULL)

wtd_sd(x, weights = NULL)

weighted_se(x, weights = NULL)

weighted_ttest(data, ...)

# S3 method for default
weighted_ttest(
  data,
  x,
  y = NULL,
  weights,
  mu = 0,
  paired = FALSE,
  ci.lvl = 0.95,
  alternative = c("two.sided", "less", "greater"),
  ...
)

# S3 method for formula
weighted_ttest(
  formula,
  data,
  mu = 0,
  paired = FALSE,
  ci.lvl = 0.95,
  alternative = c("two.sided", "less", "greater"),
  ...
)

## Arguments

x
(Numeric) vector or a data frame. For survey_median(), weighted_ttest(), weighted_mannwhitney() and weighted_chisqtest(), the bare (unquoted) variable name, or a character vector with the variable name.

design
An object of class svydesign, providing a specification of the survey design.

data
A data frame.

...
For weighted_ttest() and weighted_mannwhitney(), currently not used. For weighted_chisqtest(), further arguments passed down to chisq.test.

y
Optional, bare (unquoted) variable name, or a character vector with the variable name.

weights
Bare (unquoted) variable name, or a character vector with the variable name of the numeric vector of weights. If weights = NULL, the unweighted statistic is reported.

formula
A formula of the form lhs ~ rhs1 + rhs2, where lhs is a numeric variable giving the data values, rhs1 is a factor with two levels giving the corresponding groups, and rhs2 is a variable with weights.

ci.lvl
Confidence level of the interval.

grp
Bare (unquoted) name of the cross-classifying variable, where x is grouped into the categories represented by grp, or a character vector with the variable name.

mu
A number indicating the true value of the mean (or the difference in means if you are performing a two-sample test).

paired
Logical, whether to compute a paired t-test.

alternative
A character string specifying the alternative hypothesis; must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.
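As a small illustration of the formula interface, the following base-R sketch shows how a formula of the form lhs ~ rhs1 + rhs2 can be split into the value, grouping and weights variables. It only demonstrates the naming convention using `all.vars()`; the list element names are made up here, and this is not the package's own parsing code.

```r
# Illustration only: map lhs ~ rhs1 + rhs2 onto value, group and weights.
# The element names of `parts` are hypothetical labels for this sketch.
f <- e17age ~ e16sex + weight

vars <- all.vars(f)  # variable names in order of appearance
parts <- list(value = vars[1], group = vars[2], weights = vars[3])
str(parts)
```

Because the weights variable is identified purely by its position on the right-hand side, the order of the terms matters in this reading of the formula.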

## Value

The weighted (test) statistic.

## Note

weighted_chisqtest() is a convenient wrapper for crosstable_statistics. For a weighted one-way Anova, use means_by_group() with the weights argument.
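The idea behind a weighted Chi-squared test for a contingency table can be sketched in base R by building a weighted cross-tabulation with `xtabs()` and passing it to `chisq.test()`. The toy data and variable names below are made up, and whether the package rounds or otherwise adjusts the weighted cell counts is not shown here, so treat this as an assumption-laden sketch rather than the wrapper's implementation.

```r
# Toy data; variable names and values are made up for this sketch.
d <- data.frame(
  a = rep(c("yes", "no"), times = c(60, 40)),
  b = rep(c("low", "high", "low", "high"), times = c(35, 25, 15, 25)),
  w = rep(c(0.5, 1.5), times = 50)
)

# xtabs() sums the weights within each cell, giving a weighted table.
tab <- xtabs(w ~ a + b, data = d)

# Pearson Chi-squared test on the weighted table, without continuity correction.
res <- chisq.test(tab, correct = FALSE)
res$statistic
res$parameter  # degrees of freedom
```

With all weights equal to 1, the weighted table reduces to `table(d$a, d$b)` and the test is the ordinary unweighted one.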

weighted_ttest() assumes unequal variance between the two groups.
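An unequal-variance (Welch) test statistic built from weighted group summaries can be sketched as follows. The helper `sketch_welch()` is hypothetical, and using the raw group sizes as n together with a `sum(w) - 1` variance denominator are assumptions of this sketch, not a statement about the package's internals.

```r
# Hypothetical helper: Welch's t statistic from weighted group summaries.
# Using the raw group sizes as n is an assumption of this sketch.
sketch_welch <- function(x, y, wx = rep(1, length(x)), wy = rep(1, length(y))) {
  wvar <- function(v, w) {
    m <- sum(w * v) / sum(w)
    sum(w * (v - m)^2) / (sum(w) - 1)
  }
  mx <- sum(wx * x) / sum(wx)
  my <- sum(wy * y) / sum(wy)
  sx2 <- wvar(x, wx) / length(x)  # squared standard error, group x
  sy2 <- wvar(y, wy) / length(y)  # squared standard error, group y
  tval <- (mx - my) / sqrt(sx2 + sy2)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (sx2 + sy2)^2 / (sx2^2 / (length(x) - 1) + sy2^2 / (length(y) - 1))
  c(t = tval, df = df)
}

x <- c(5.1, 4.8, 6.0, 5.5, 5.9, 4.7)
y <- c(6.2, 6.8, 5.9, 7.1, 6.5)
sketch_welch(x, y)  # with unit weights this matches t.test(x, y)
```

Because the two group variances are never pooled, the degrees of freedom are generally non-integer and smaller than in the equal-variance test.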

## Examples

# weighted sd and se ----
weighted_sd(rnorm(n = 100, mean = 3), runif(n = 100))
#> [1] 0.8498705

data(efc)
weighted_sd(efc[, 1:3], runif(n = nrow(efc)))
#>    c12hour   e15relat     e16sex
#> 51.7876181  2.0540843  0.4699551
weighted_se(efc[, 1:3], runif(n = nrow(efc)))
#>    c12hour   e15relat     e16sex
#> 1.66065784 0.06942749 0.01562877

# survey_median ----
# median for variables from weighted survey designs
if (require("survey")) {
  data(nhanes_sample)

  des <- svydesign(
    id = ~SDMVPSU,
    strat = ~SDMVSTRA,
    weights = ~WTINT2YR,
    nest = TRUE,
    data = nhanes_sample
  )

  survey_median(total, des)
  survey_median("total", des)
}
#> $total
#>      0.5
#> [1,]   6
#>
#> attr(,"hasci")
#> [1] FALSE
#> attr(,"class")
#> [1] "newsvyquantile"

# weighted t-test ----
efc$weight <- abs(rnorm(nrow(efc), 1, .3))
weighted_ttest(efc, e17age, weights = weight)
#>
#> One Sample t-test (two.sided)
#> # t=292.68  df=890  p-value=0.000
#>
#>   mean of e17age: 79.189 [78.658, 79.720]
#>
weighted_ttest(efc, e17age, c160age, weights = weight)
#>
#> Two-Sample t-test (two.sided)
#>
#> # comparison between e17age and c160age
#> # t=49.92  df=1469  p-value=0.000
#>
#>   mean of e17age    : 79.187
#>   mean of c160age   : 53.208
#>   difference of mean: 25.980 [24.959  27.001]
#>
weighted_ttest(e17age ~ e16sex + weight, efc)
#>
#> Two-Sample t-test (two.sided)
#>
#> # comparison of e17age by e16sex
#> # t=-7.46  df=604  p-value=0.000
#>
#>   mean in group [1] male  : 76.401
#>   mean in group [2] female: 80.518
#>   difference of mean      : -4.117 [-5.201  -3.034]
#>

# weighted Mann-Whitney-U-test ----
weighted_mannwhitney(c12hour ~ c161sex + weight, efc)
#>
#> # Weighted Mann-Whitney-U test
#>
#>   comparison of c12hour by c161sex
#>   Chisq=3.26  df=899  p-value=0.001
#>

# weighted Chi-squared-test ----
weighted_chisqtest(efc, c161sex, e16sex, weights = weight, correct = FALSE)
#>
#> # Measure of Association for Contingency Tables
#>
#>    Chi-squared: 2.0566
#>            Phi: 0.0479
#>             df: 1
#>        p-value: 0.152
#>   Observations: 895
weighted_chisqtest(c172code ~ c161sex + weight, efc)
#>
#> # Measure of Association for Contingency Tables
#>
#>    Chi-squared: 4.8005
#>     Cramer's V: 0.0758
#>             df: 2
#>        p-value: 0.091
#>   Observations: 835

# weighted Chi-squared-test for given probabilities ----
weighted_chisqtest(c172code ~ weight, efc, p = c(.33, .33, .34))
#>
#> # Weighted chi-squared test for given probabilities
#>
#>    Chi-squared: 280.2808
#>             df: 2
#>        p-value: < .001***
#>   Observations: 908