Replace NA with specific values

This function replaces (tagged) NA values in a vector, data frame, or list of variables with value.

replace_na(x, ..., value, na.label = NULL, tagged.na = NULL)

Arguments

x: A vector or data frame.
...: Optional, unquoted names of variables that should be selected for further processing. Required if x is a data frame (rather than a vector) and only selected variables from x should be processed. You may also use functions like : or tidyselect's select-helpers. See 'Examples' or the package vignette.
value: Value that will replace the NA's.
na.label: Optional character vector, used to label the former NA value (i.e. adding a labels attribute for value to x).
tagged.na: Optional single character that specifies a tagged_na value to be replaced by value. This makes it possible to replace only specific NA values of x.

Value

x, where NA's are replaced with value. If x is a data frame, the complete data frame x is returned, with NA's replaced in the variables specified in ...; if ... is not specified, all variables in the data frame are processed.

Details

While regular NA values can only be replaced as a whole with a single value, tagged_na makes it possible to differentiate between qualitatively different kinds of NA. Tagged NAs work exactly like regular R missing values except that they store one additional byte of information: a tag, which is usually a letter ("a" to "z") or a digit character ("0" to "9"). This makes it possible to replace only specific NA values while other NA values are preserved.
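
The idea behind tagged NAs can be sketched with the haven package directly. This is only an illustration of how tags distinguish missing values, assuming haven is installed; it is not a description of how replace_na() is implemented internally. The variable names (x, tags, only_a) are made up for this example.

```r
library(haven)

# a numeric vector with two tagged NAs and one regular NA
x <- c(1, tagged_na("a"), tagged_na("z"), NA)

# all three behave like ordinary missing values
is.na(x)
#> [1] FALSE  TRUE  TRUE  TRUE

# the tag is the extra information that tells them apart
tags <- na_tag(x)
tags
#> [1] NA  "a" "z" NA

# select only the NA tagged "a" ...
only_a <- is_tagged_na(x, "a")
only_a
#> [1] FALSE  TRUE FALSE FALSE

# ... and replace it, leaving NA(z) and the regular NA untouched
x[only_a] <- 99
```

With replace_na(), the same selective replacement is requested through the tagged.na argument instead of manual indexing.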

Note

Value and variable label attributes are preserved.

Examples

library(sjlabelled)
data(efc)
table(efc$e42dep, useNA = "always")
#> 
#>    1    2    3    4 <NA> 
#>   66  225  306  304    7 
table(replace_na(efc$e42dep, value = 99), useNA = "always")
#> 
#>    1    2    3    4   99 <NA> 
#>   66  225  306  304    7    0 

# the original labels
get_labels(replace_na(efc$e42dep, value = 99))
#> [1] "independent"          "slightly dependent"   "moderately dependent"
#> [4] "severely dependent"  
# NA becomes "99", and is labelled as "former NA"
get_labels(
  replace_na(efc$e42dep, value = 99, na.label = "former NA"),
  values = "p"
)
#> [1] "[1] independent"          "[2] slightly dependent"  
#> [3] "[3] moderately dependent" "[4] severely dependent"  
#> [5] "[99] former NA"          

dummy <- data.frame(
  v1 = efc$c82cop1,
  v2 = efc$c83cop2,
  v3 = efc$c84cop3
)
# show original distribution
lapply(dummy, table, useNA = "always")
#> $v1
#> 
#>    1    2    3    4 <NA> 
#>    3   97  591  210    7 
#> 
#> $v2
#> 
#>    1    2    3    4 <NA> 
#>  186  547  130   39    6 
#> 
#> $v3
#> 
#>    1    2    3    4 <NA> 
#>  516  252   82   52    6 
#> 
# show variables, NA's replaced with 99
lapply(replace_na(dummy, v2, v3, value = 99), table, useNA = "always")
#> $v1
#> 
#>    1    2    3    4 <NA> 
#>    3   97  591  210    7 
#> 
#> $v2
#> 
#>    1    2    3    4   99 <NA> 
#>  186  547  130   39    6    0 
#> 
#> $v3
#> 
#>    1    2    3    4   99 <NA> 
#>  516  252   82   52    6    0 
#> 

if (require("haven")) {
  x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
                c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
                  "Refused" = tagged_na("a"), "Not home" = tagged_na("z")))
  # get current NA values
  x
  get_na(x)

  # replace only the NA that is tagged as NA(c)
  replace_na(x, value = 2, tagged.na = "c")
  get_na(replace_na(x, value = 2, tagged.na = "c"))

  table(x)
  table(replace_na(x, value = 2, tagged.na = "c"))

  # tagged NA also works for non-labelled class
  # init vector
  x <- c(1, 2, 3, 4)
  # set values 2 and 3 as tagged NA
  x <- set_na(x, na = c(2, 3), as.tag = TRUE)
  # see result
  x
  # now replace only NA tagged with 2 with value 5
  replace_na(x, value = 5, tagged.na = "2")
}
#> [1]  1  5 NA  4
