Duplicated value labels in variables may cause trouble when saving labelled data or computing cross tabs (cf. sjmisc::flat_table() or sjPlot::plot_xtab()). tidy_labels() repairs duplicated value labels by suffixing them with the associated value.
tidy_labels(x, ..., sep = "_", remove = FALSE)
x: A vector or data frame.
...: Optional, unquoted names of variables that should be selected for further processing. Required if x is a data frame (and not a vector) and only selected variables from x should be processed. You may also use functions like : or tidyselect's select-helpers. See 'Examples'.
sep: String used to separate the suffixed value from the old label when creating the new value label.
remove: Logical. If TRUE, the original, duplicated value label is replaced by the value (i.e. the value is not appended as a suffix to the value label, but becomes the value label itself). The sep argument is ignored in that case.
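The effect of sep and remove can be illustrated with a small base-R sketch. It is not the package's implementation: repair_labels() is a hypothetical helper written only for illustration, and it assumes value labels are stored as a named vector whose names are the labels and whose elements are the values.

```r
# Hypothetical helper (not part of any package): make duplicated labels unique
# by suffixing them with their value, or by replacing them with the value.
repair_labels <- function(labels, sep = "_", remove = FALSE) {
  nm <- names(labels)
  # TRUE for every occurrence of a label that appears more than once
  dup <- nm %in% nm[duplicated(nm)]
  nm[dup] <- if (remove) {
    # the value itself becomes the label; sep is not used
    as.character(labels[dup])
  } else {
    # old label + separator + value
    paste(nm[dup], labels[dup], sep = sep)
  }
  names(labels) <- nm
  labels
}

labs <- c("low" = 1, ".." = 2, ".." = 3, ".." = 4, "high" = 5)

names(repair_labels(labs))                 # "low" ".._2" ".._3" ".._4" "high"
names(repair_labels(labs, sep = "."))      # "low" "...2" "...3" "...4" "high"
names(repair_labels(labs, remove = TRUE))  # "low" "2" "3" "4" "high"
```

Labels that are already unique ("low", "high") are left untouched in this sketch; only the repeated ".." entries change.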
x, with "repaired" (unique) value labels for each variable.
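if (require("sjmisc")) {
  # set_labels() and tidy_labels() come from sjlabelled; frq() comes from sjmisc
  library(sjlabelled)

  set.seed(123)
  # values 2, 3 and 4 all share the label ".."
  x <- set_labels(
    sample(1:5, size = 20, replace = TRUE),
    labels = c("low" = 1, ".." = 2, ".." = 3, ".." = 4, "high" = 5)
  )
  frq(x)

  # default: suffix duplicated labels with "_" and the value
  z <- tidy_labels(x)
  frq(z)

  # use "." as separator instead
  z <- tidy_labels(x, sep = ".")
  frq(z)

  # replace duplicated labels by the value itself
  z <- tidy_labels(x, remove = TRUE)
  frq(z)
}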
#> x <integer>
#> # total N=20 valid N=20 mean=2.80 sd=1.32
#>
#> Value | Label | N | Raw % | Valid % | Cum. %
#> --------------------------------------------
#> 1 | low | 4 | 20 | 20 | 20
#> 2 | 2 | 4 | 20 | 20 | 40
#> 3 | 3 | 7 | 35 | 35 | 75
#> 4 | 4 | 2 | 10 | 10 | 85
#> 5 | high | 3 | 15 | 15 | 100
#> <NA> | <NA> | 0 | 0 | <NA> | <NA>
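For a data frame, the variables to repair are named, unquoted, in the ... argument. The following is a minimal sketch of that usage, assuming the sjlabelled package (which provides set_labels(), get_labels() and tidy_labels()) is installed; the data frame dat and its columns id, a and b are made up for illustration.

```r
if (requireNamespace("sjlabelled", quietly = TRUE)) {
  library(sjlabelled)

  # same duplicated labels as in the vector example
  labs <- c("low" = 1, ".." = 2, ".." = 3, ".." = 4, "high" = 5)

  # illustrative data: two labelled columns with identical value labels
  dat <- data.frame(id = 1:5)
  dat$a <- set_labels(c(1L, 2L, 3L, 4L, 5L), labels = labs)
  dat$b <- set_labels(c(5L, 4L, 3L, 2L, 1L), labels = labs)

  # repair only column "a"; column "b" is not selected and keeps its labels
  res <- tidy_labels(dat, a)

  print(get_labels(res$a))
  print(get_labels(res$b))
}
```

Calling tidy_labels(dat) without naming any variables should process every column of the data frame.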