This function replaces (tagged) NAs of a variable, data frame
or list of variables with value.
replace_na(x, ..., value, na.label = NULL, tagged.na = NULL)
x: A vector or data frame.
...: Optional, unquoted names of variables that should be selected for
further processing. Required if x is a data frame (and not a vector)
and only selected variables from x should be processed.
You may also use the colon operator (:) or tidyselect's select helpers.
See 'Examples' or the package vignette.
value: Value that will replace the NAs.
na.label: Optional character vector, used to label the former NA value
(i.e. adding a labels attribute for value to x).
tagged.na: Optional single character, specifies a tagged_na value
that will be replaced by value. This makes it possible
to replace only specific NA values of x.
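The effect of na.label can be pictured with a small base-R sketch. It is not the function's implementation. It assumes that value labels are stored as a named vector in the "labels" attribute (the convention used by haven), and the vector x and its labels are made up for illustration.

```r
# Illustrative vector with two missing values and haven-style value labels.
x <- c(1, 2, NA, 4, NA)
attr(x, "labels") <- c(
  "independent" = 1,
  "slightly dependent" = 2,
  "moderately dependent" = 3,
  "severely dependent" = 4
)

# Replace the missing values; subassignment keeps the existing attributes.
x[is.na(x)] <- 99

# Register a label for the replacement value, as na.label is described to do.
attr(x, "labels") <- c(attr(x, "labels"), "former NA" = 99)

attr(x, "labels")
```

The existing labels stay in place and one additional entry describes the replacement value.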
Returns x, where NAs are replaced with value. If x is a data frame,
the complete data frame x is returned, with NAs replaced for the
variables specified in ...; if ... is not specified, the replacement
applies to all variables in the data frame.
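The data frame behaviour described above can be sketched in base R. The helper replace_na_cols() is hypothetical and exists only for this illustration; it takes quoted column names, whereas the documented function selects variables through unquoted names in the ... argument.

```r
# Hypothetical helper (not part of any package): replace NA only in the
# chosen columns and return the complete data frame.
replace_na_cols <- function(df, cols = names(df), value) {
  for (col in cols) {
    df[[col]][is.na(df[[col]])] <- value
  }
  df
}

dummy <- data.frame(
  v1 = c(1, NA, 3),
  v2 = c(NA, 2, NA),
  v3 = c(4, 5, NA)
)

# Only v2 and v3 are processed; v1 keeps its NA.
out <- replace_na_cols(dummy, cols = c("v2", "v3"), value = 99)

# Without a selection, every column is processed.
out_all <- replace_na_cols(dummy, value = 99)
```

In both calls the full data frame comes back, which mirrors the return value described above.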
While regular NA values can only be replaced all at once with
a single value, tagged_na allows you to differentiate
between qualitatively different kinds of NAs.
Tagged NAs work exactly like regular R missing values
except that they store one additional byte of information: a tag,
which is usually a letter ("a" to "z") or character number ("0" to "9").
This makes it possible to replace only specific NA values, while
other NA values are preserved.
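A minimal sketch of how tagged NAs behave, using the haven functions tagged_na(), na_tag() and is_tagged_na() directly instead of replace_na(). It assumes the haven package is installed; the vector x is made up for illustration.

```r
library(haven)

# A double vector with two tagged NAs and one regular NA.
x <- c(1, tagged_na("a"), 3, tagged_na("z"), NA)

# All three count as missing for ordinary R functions.
is.na(x)

# The tag is only visible through haven's helpers.
tags_before <- na_tag(x)

# Replace only the NA tagged "a"; the "z"-tagged and regular NA are untouched.
x[is_tagged_na(x, "a")] <- 99

tags_after <- na_tag(x)
```

After the replacement, x still contains two missing values, and only one of them carries a tag.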
Value and variable label attributes are preserved.
library(sjlabelled)
data(efc)
table(efc$e42dep, useNA = "always")
#>
#> 1 2 3 4 <NA>
#> 66 225 306 304 7
table(replace_na(efc$e42dep, value = 99), useNA = "always")
#>
#> 1 2 3 4 99 <NA>
#> 66 225 306 304 7 0
# the original labels
get_labels(replace_na(efc$e42dep, value = 99))
#> [1] "independent" "slightly dependent" "moderately dependent"
#> [4] "severely dependent"
# NA becomes "99", and is labelled as "former NA"
get_labels(
  replace_na(efc$e42dep, value = 99, na.label = "former NA"),
  values = "p"
)
#> [1] "[1] independent" "[2] slightly dependent"
#> [3] "[3] moderately dependent" "[4] severely dependent"
#> [5] "[99] former NA"
dummy <- data.frame(
  v1 = efc$c82cop1,
  v2 = efc$c83cop2,
  v3 = efc$c84cop3
)
# show original distribution
lapply(dummy, table, useNA = "always")
#> $v1
#>
#> 1 2 3 4 <NA>
#> 3 97 591 210 7
#>
#> $v2
#>
#> 1 2 3 4 <NA>
#> 186 547 130 39 6
#>
#> $v3
#>
#> 1 2 3 4 <NA>
#> 516 252 82 52 6
#>
# show variables, NA's replaced with 99
lapply(replace_na(dummy, v2, v3, value = 99), table, useNA = "always")
#> $v1
#>
#> 1 2 3 4 <NA>
#> 3 97 591 210 7
#>
#> $v2
#>
#> 1 2 3 4 99 <NA>
#> 186 547 130 39 6 0
#>
#> $v3
#>
#> 1 2 3 4 99 <NA>
#> 516 252 82 52 6 0
#>
if (require("haven")) {
  x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
                c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
                  "Refused" = tagged_na("a"), "Not home" = tagged_na("z")))
  # get current NA values
  x
  get_na(x)
  # replace only the NA, which is tagged as NA(c)
  replace_na(x, value = 2, tagged.na = "c")
  get_na(replace_na(x, value = 2, tagged.na = "c"))
  table(x)
  table(replace_na(x, value = 2, tagged.na = "c"))
  # tagged NA also works for non-labelled class
  # init vector
  x <- c(1, 2, 3, 4)
  # set values 2 and 3 as tagged NA
  x <- set_na(x, na = c(2, 3), as.tag = TRUE)
  # see result
  x
  # now replace only NA tagged with 2 with value 5
  replace_na(x, value = 5, tagged.na = "2")
}
#> [1] 1 5 NA 4