This function adds labels as an attribute (named "labels") to a variable or vector x, or to a set of variables in a data frame or list object. One use case is the sjPlot package, which supports labelled data and automatically uses the labels for axes and legends in plots and for tables. val_labels() is intended for use within pipe-workflows and has a tidyverse-consistent syntax, including support for quasi-quotation (see 'Examples').

set_labels(x, ..., labels, force.labels = FALSE, force.values = TRUE,
  drop.na = TRUE)

val_labels(x, ..., force.labels = FALSE, force.values = TRUE,
  drop.na = TRUE)

Arguments

x

A vector or data frame.

...

For set_labels(): optional, unquoted names of variables that should be selected for further processing. Required if x is a data frame (and not a vector) and only selected variables from x should be processed. You may also use functions like : or tidyselect's select_helpers.

For val_labels(): pairs of named vectors, where the name equals the name of the variable that should be labelled, and the value is the vector of new value labels. val_labels() also supports quasi-quotation (see 'Examples').

labels

(Named) character vector of labels that will be added to x as "labels" or "value.labels" attribute.

  • if labels is not a named vector, its length must equal the value range of x, i.e. if x has values from 1 to 3, labels should have a length of 3;

  • if the length of labels differs from the number of unique values of x, a warning is given. You can still add missing labels with the force.labels or force.values arguments; see 'Note'.

  • if labels is a named vector, value labels will be set accordingly, even if x has a different length of unique values. See 'Note' and 'Examples'.

  • if x is a data frame, labels may also be a list of (named) character vectors;

  • if labels is a list, it must have the same length as number of columns of x;

  • if labels is a vector and x is a data frame, labels will be applied to each column of x.

Use labels = "" to remove labels-attribute from x.
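The list and removal cases above can be sketched as follows. This is only an illustration, assuming the sjlabelled package is installed; the data frame df and its columns a and b are made up for the example and do not come from the package.

```r
library(sjlabelled)

# hypothetical example data: two columns with different value ranges
df <- data.frame(a = c(1, 2, 3, 1), b = c(1, 2, 1, 2))

# labels as a list: one label vector per column, in column order
df <- set_labels(df, labels = list(c("low", "mid", "high"), c("no", "yes")))
get_labels(df)

# labels = "" removes the "labels" attribute from a vector again
a <- set_labels(df$a, labels = "")
attr(a, "labels")
```

Passing a single vector instead of a list would apply the same labels to every selected column, which is only sensible when the columns share a value range.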

force.labels

Logical; if TRUE, all labels are added as value label attribute, even if x has fewer unique values than the length of labels or if x has a smaller range than the length of labels. See 'Examples'. This argument is ignored if labels is a named vector.

force.values

Logical; if TRUE (default) and labels has fewer elements than x has unique values, the additional values not covered by labels are added as labels as well. See 'Examples'. This argument is ignored if labels is a named vector.

drop.na

Logical, whether existing value labels of tagged NA values (see tagged_na) should be removed (drop.na = TRUE, the default) or preserved (drop.na = FALSE). See get_na for more details on tagged NA values.

Value

x with value label attributes, or with label attributes removed if labels = "". If x is a data frame, the complete data frame x is returned, with labels removed from or added to the variables specified in ...; if ... is not specified, this applies to all variables in the data frame.

Note

  • if labels is a named vector, force.labels and force.values will be ignored, and only values defined in labels will be labelled;

  • if x has fewer unique values than labels, redundant labels will be dropped, see force.labels;

  • if x has more unique values than labels, only matching values will be labelled, other values remain unlabelled, see force.values;

If you only want to change some of the value labels, use add_labels instead. Furthermore, see 'Note' in get_labels.
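As a rough sketch of that difference, assuming the sjlabelled package is installed, add_labels() can extend an existing set of value labels where set_labels() would replace it. The vector x below is made up for illustration.

```r
library(sjlabelled)

# hypothetical example vector with three labelled values
x <- set_labels(c(1, 2, 3, 2, 1), labels = c("low", "mid", "high"))

# add a label for value 4; the existing labels are kept
x <- add_labels(x, labels = c("very high" = 4))
get_labels(x, values = "p")
```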

See also

See vignette Labelled Data and the sjlabelled-Package for more details; set_label to manually set variable labels or get_label to get variable labels; add_labels to add additional value labels without replacing the existing ones.

Examples

library(sjmisc)
dummy <- sample(1:4, 40, replace = TRUE)
frq(dummy)
#>
#> x <integer>
#> # total N=40  valid N=40  mean=2.75  sd=1.10
#>
#>   val frq raw.prc valid.prc cum.prc
#>     1   8      20        20      20
#>     2   6      15        15      35
#>     3  14      35        35      70
#>     4  12      30        30     100
#>  <NA>   0       0        NA      NA
#>

dummy <- set_labels(dummy, labels = c("very low", "low", "mid", "hi"))
frq(dummy)
#>
#> x <integer>
#> # total N=40  valid N=40  mean=2.75  sd=1.10
#>
#>  val    label frq raw.prc valid.prc cum.prc
#>    1 very low   8      20        20      20
#>    2      low   6      15        15      35
#>    3      mid  14      35        35      70
#>    4       hi  12      30        30     100
#>   NA       NA   0       0        NA      NA
#>

# assign labels with named vector
dummy <- sample(1:4, 40, replace = TRUE)
dummy <- set_labels(dummy, labels = c("very low" = 1, "very high" = 4))
frq(dummy)
#>
#> x <integer>
#> # total N=40  valid N=40  mean=2.48  sd=1.04
#>
#>  val     label frq raw.prc valid.prc cum.prc
#>    1  very low   8    20.0      20.0    20.0
#>    2         2  13    32.5      32.5    52.5
#>    3         3  11    27.5      27.5    80.0
#>    4 very high   8    20.0      20.0   100.0
#>   NA        NA   0     0.0        NA      NA
#>

# force using all labels, even if not all labels
# have associated values in vector
x <- c(2, 2, 3, 3, 2)
# only two value labels
x <- set_labels(x, labels = c("1", "2", "3"))
#> More labels than values of "x". Using first 2 labels.
x
#> [1] 2 2 3 3 2
#> attr(,"labels")
#> 1 2
#> 2 3
frq(x)
#>
#> x <numeric>
#> # total N=5  valid N=5  mean=2.40  sd=0.55
#>
#>  val label frq raw.prc valid.prc cum.prc
#>    2     1   3      60        60      60
#>    3     2   2      40        40     100
#>   NA    NA   0       0        NA      NA
#>

# all three value labels
x <- set_labels(x, labels = c("1", "2", "3"), force.labels = TRUE)
x
#> [1] 2 2 3 3 2
#> attr(,"labels")
#> 1 2 3
#> 1 2 3
frq(x)
#>
#> x <numeric>
#> # total N=5  valid N=5  mean=2.40  sd=0.55
#>
#>  val label frq raw.prc valid.prc cum.prc
#>    1     1   0       0         0       0
#>    2     2   3      60        60      60
#>    3     3   2      40        40     100
#>   NA    NA   0       0        NA      NA
#>

# create vector
x <- c(1, 2, 3, 2, 4, NA)
# add fewer labels than values
x <- set_labels(x, labels = c("yes", "maybe", "no"), force.values = FALSE)
#> "x" has more values than "labels", hence not all values are labelled.
x
#> [1]  1  2  3  2  4 NA
#> attr(,"labels")
#>   yes maybe    no
#>     1     2     3

# add all necessary labels
x <- set_labels(x, labels = c("yes", "maybe", "no"), force.values = TRUE)
#> More values in "x" than length of "labels". Additional values were added to labels.
x
#> [1]  1  2  3  2  4 NA
#> attr(,"labels")
#>   yes maybe    no     4
#>     1     2     3     4

# set labels and missings
x <- c(1, 1, 1, 2, 2, -2, 3, 3, 3, 3, 3, 9)
x <- set_labels(x, labels = c("Refused", "One", "Two", "Three", "Missing"))
x
#>  [1]  1  1  1  2  2 -2  3  3  3  3  3  9
#> attr(,"labels")
#> Refused     One     Two   Three Missing
#>      -2       1       2       3       9
set_na(x, na = c(-2, 9))
#>  [1]  1  1  1  2  2 NA  3  3  3  3  3 NA
#> attr(,"labels")
#>   One   Two Three
#>     1     2     3

library(haven)
x <- labelled(
  c(1:3, tagged_na("a", "c", "z"), 4:1),
  c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
    "Refused" = tagged_na("a"), "Not home" = tagged_na("z"))
)
# get current NA values
x
#> <Labelled double>
#>  [1]     1     2     3 NA(a) NA(c) NA(z)     4     3     2     1
#>
#> Labels:
#>  value        label
#>      1    Agreement
#>      4 Disagreement
#>  NA(c)        First
#>  NA(a)      Refused
#>  NA(z)     Not home
get_na(x)
#>    First  Refused Not home
#>       NA       NA       NA

# lose value labels from tagged NA by default, if not specified
set_labels(x, labels = c("New Three" = 3))
#> <Labelled double>
#>  [1]     1     2     3 NA(a) NA(c) NA(z)     4     3     2     1
#>
#> Labels:
#>  value     label
#>      3 New Three

# do not drop na
set_labels(x, labels = c("New Three" = 3), drop.na = FALSE)
#> <Labelled double>
#>  [1]     1     2     3 NA(a) NA(c) NA(z)     4     3     2     1
#>
#> Labels:
#>  value     label
#>      3 New Three
#>  NA(c)     First
#>  NA(a)   Refused
#>  NA(z)  Not home

# set labels via named vector,
# not using all possible values
data(efc)
get_labels(efc$e42dep)
#> [1] "independent"          "slightly dependent"   "moderately dependent"
#> [4] "severely dependent"

x <- set_labels(
  efc$e42dep,
  labels = c(`independent` = 1, `severe dependency` = 2, `missing value` = 9)
)
get_labels(x, values = "p")
#> [1] "[1] independent"       "[2] severe dependency" "[9] missing value"
get_labels(x, values = "p", non.labelled = TRUE)
#> [1] "[1] independent"       "[2] severe dependency" "[3] 3"
#> [4] "[4] 4"                 "[9] missing value"

# labels can also be set for tagged NA values
# create numeric vector
x <- c(1, 2, 3, 4)
# set 2 and 3 as missing, which will automatically be set as
# tagged NA by 'set_na()'
x <- set_na(x, na = c(2, 3))
x
#> [1]  1 NA NA  4

# set label via named vector just for tagged NA(3)
set_labels(x, labels = c(`New Value` = tagged_na("3")))
#> [1]  1 NA NA  4
#> attr(,"labels")
#> New Value
#>        NA

# setting same value labels to multiple vectors
dummies <- data.frame(
  dummy1 = sample(1:4, 40, replace = TRUE),
  dummy2 = sample(1:4, 40, replace = TRUE),
  dummy3 = sample(1:4, 40, replace = TRUE)
)

# and set same value labels for two of three variables
test <- set_labels(
  dummies, dummy1, dummy2,
  labels = c("very low", "low", "mid", "hi")
)
# see result...
get_labels(test)
#> $dummy1
#> [1] "very low" "low" "mid" "hi"
#>
#> $dummy2
#> [1] "very low" "low" "mid" "hi"
#>
#> $dummy3
#> NULL
#>

# using quasi-quotation
library(rlang)
x1 <- "dummy1"
x2 <- c("so low", "rather low", "mid", "very hi")
dummies %>%
  val_labels(
    !!x1 := c("really low", "low", "a bit mid", "hi"),
    dummy3 = !!x2
  ) %>%
  get_labels()
#> $dummy1
#> [1] "really low" "low" "a bit mid" "hi"
#>
#> $dummy2
#> NULL
#>
#> $dummy3
#> [1] "so low" "rather low" "mid" "very hi"
#>

# ... and named vectors to explicitly set value labels
x2 <- c("so low" = 4, "rather low" = 3, "mid" = 2, "very hi" = 1)
dummies %>%
  val_labels(
    !!x1 := c("really low" = 1, "low" = 3, "a bit mid" = 2, "hi" = 4),
    dummy3 = !!x2
  ) %>%
  get_labels(values = "p")
#> $dummy1
#> [1] "[1] really low" "[2] a bit mid" "[3] low" "[4] hi"
#>
#> $dummy2
#> NULL
#>
#> $dummy3
#> [1] "[1] very hi" "[2] mid" "[3] rather low" "[4] so low"
#>