Add or replace data frame columns — add

add_columns() combines two or more data frames, but unlike cbind() or dplyr::bind_cols(), it binds data as the last columns of the resulting data frame (i.e., after the columns of the data frames specified in ...). This can be useful in a "pipe" workflow, where a data frame returned by a previous function should be appended at the end of another data frame that is passed to add_columns().

replace_columns() replaces all columns in data with identically named columns in ..., and adds remaining (non-duplicated) columns from ... to data.

add_id() simply adds an ID column to the data frame, with values from 1 to nrow(data). For grouped data frames, the values run from 1 to the group size within each group. See 'Examples'.

add_columns(data, ..., replace = TRUE)

replace_columns(data, ..., add.unique = TRUE)

add_id(data, var = "ID")

Arguments

data: A data frame. For add_columns(), data is bound after the data frames specified in .... For replace_columns(), duplicated columns in data are replaced by columns in ....
...: More data frames to combine or, for replace_columns(), more data frames with columns that should replace columns in data.
replace: Logical, if TRUE (default), columns in ... with identical names in data will replace the columns in data. The order of columns after replacing is preserved.
add.unique: Logical, if TRUE (default), remaining columns in ... that did not replace any column in data are appended as new columns to data.
var: Name of the new ID variable.

Value

For add_columns(), a data frame, where columns of data are appended after columns of ....
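
As a rough illustration of the resulting column order, the following base R sketch mimics the case where data and ... share no column names. It is not the implementation of add_columns(); the data frames d1 and d2 are made up for this example, and the equivalence to cbind() is an assumption based on the output shown under 'Examples'.

```r
# Hypothetical toy data; no column names overlap
d1 <- data.frame(a = 1:3, b = 4:6)
d2 <- data.frame(c = 7:9)

# Assumption: with unique names, add_columns(d1, d2) orders columns
# like cbind(d2, d1), i.e. the columns of `data` (d1) come last
res <- cbind(d2, d1)
names(res)
```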

For replace_columns(), a data frame where columns in data will be replaced by identically named columns in ..., and remaining columns from ... will be appended to data (if add.unique = TRUE).
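
The replacement logic described above can be sketched in base R as follows. The helper replace_cols_sketch() and its argument names are hypothetical and only approximate the documented behavior for a single data frame in ...; the actual function may differ in details such as attribute handling.

```r
# Hypothetical helper, not part of the package: replaces same-named
# columns in `data` and optionally appends the remaining ones
replace_cols_sketch <- function(data, new, add_unique = TRUE) {
  shared <- intersect(names(new), names(data))
  extra <- setdiff(names(new), names(data))
  # overwrite in place, so the column order of `data` is kept
  for (nm in shared) data[[nm]] <- new[[nm]]
  # append columns that did not replace anything
  if (add_unique) for (nm in extra) data[[nm]] <- new[[nm]]
  data
}

d <- data.frame(a = 1:2, b = 3:4)
n <- data.frame(b = c(30, 40), c = 5:6)
res_all <- replace_cols_sketch(d, n)
res_shared <- replace_cols_sketch(d, n, add_unique = FALSE)
names(res_all)     # "a" "b" "c"
names(res_shared)  # "a" "b"
```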

For add_id(), the data frame with a new column of ID numbers. This column is always the first column in the returned data frame.
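
To make the difference between ungrouped and grouped IDs concrete, here is a minimal base R sketch. It does not call add_id(); the data frame df and the grouping variable g are invented for illustration, and ave() merely stands in for whatever the package does internally.

```r
# Hypothetical toy data with a grouping variable `g`
df <- data.frame(g = c("a", "b", "a", "b", "a"), x = 1:5)

# Ungrouped: IDs run from 1 to nrow(df), placed as first column
ungrouped <- data.frame(ID = seq_len(nrow(df)), df)

# Grouped: IDs restart at 1 within each level of `g`
grouped <- data.frame(ID = ave(seq_len(nrow(df)), df$g, FUN = seq_along), df)
grouped$ID  # 1 1 2 2 3
```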

Note

For add_columns(), by default, columns in data that have the same names as columns in one of the data frames in ... will be dropped (i.e., variables with identical names in ... will replace existing variables in data). Use replace = FALSE to keep all columns. Identical column names will then be renamed to ensure unique column names (which happens by default when using dplyr::bind_cols()). Replaced columns are not moved to the end of the data frame; rather, the original order of columns is preserved.

Examples

data(efc)
d1 <- efc[, 1:3]
d2 <- efc[, 4:6]

library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union
head(bind_cols(d1, d2))
#>   c12hour e15relat e16sex e17age e42dep c82cop1
#> 1      16        2      2     83      3       3
#> 2     148        2      2     88      3       3
#> 3      70        1      2     82      3       2
#> 4     168        1      2     67      4       4
#> 5     168        2      2     84      4       3
#> 6      16        2      2     85      4       2
add_columns(d1, d2) %>% head()
#>   e17age e42dep c82cop1 c12hour e15relat e16sex
#> 1     83      3       3      16        2      2
#> 2     88      3       3     148        2      2
#> 3     82      3       2      70        1      2
#> 4     67      4       4     168        1      2
#> 5     84      4       3     168        2      2
#> 6     85      4       2      16        2      2

d1 <- efc[, 1:3]
d2 <- efc[, 2:6]

add_columns(d1, d2, replace = TRUE) %>% head()
#>   c12hour e15relat e17age e42dep c82cop1 e16sex
#> 1      16        2     83      3       3      2
#> 2     148        2     88      3       3      2
#> 3      70        1     82      3       2      2
#> 4     168        1     67      4       4      2
#> 5     168        2     84      4       3      2
#> 6      16        2     85      4       2      2
add_columns(d1, d2, replace = FALSE) %>% head()
#> New names:
#> • `e15relat` -> `e15relat...1`
#> • `e16sex` -> `e16sex...2`
#> • `e15relat` -> `e15relat...7`
#> • `e16sex` -> `e16sex...8`
#>   e15relat...1 e16sex...2 e17age e42dep c82cop1 c12hour e15relat...7 e16sex...8
#> 1            2          2     83      3       3      16            2          2
#> 2            2          2     88      3       3     148            2          2
#> 3            1          2     82      3       2      70            1          2
#> 4            1          2     67      4       4     168            1          2
#> 5            2          2     84      4       3     168            2          2
#> 6            2          2     85      4       2      16            2          2

# use case: we take the original data frame, select specific
# variables and do some transformations or recodings
# (standardization in this example) and add the new, transformed
# variables *to the end* of the original data frame
efc %>%
  select(e17age, c160age) %>%
  std() %>%
  add_columns(efc) %>%
  head()
#>   c12hour e15relat e16sex e17age e42dep c82cop1 c83cop2 c84cop3 c85cop4 c86cop5
#> 1      16        2      2     83      3       3       2       2       2       1
#> 2     148        2      2     88      3       3       3       3       3       4
#> 3      70        1      2     82      3       2       2       1       4       1
#> 4     168        1      2     67      4       4       1       3       1       1
#> 5     168        2      2     84      4       3       2       1       2       2
#> 6      16        2      2     85      4       2       2       3       3       3
#>   c87cop6 c88cop7 c89cop8 c90cop9 c160age c161sex c172code c175empl barthtot
#> 1       1       2       3       3      56       2        2        1       75
#> 2       1       3       2       2      54       2        2        1       75
#> 3       1       1       4       3      80       1        1        0       35
#> 4       1       1       2       4      69       1        2        0        0
#> 5       2       1       4       4      47       2        2        0       25
#> 6       2       2       1       1      56       1        2        1       60
#>   neg_c_7 pos_v_4 quol_5 resttotn tot_sc_e n4pstu nur_pst   e17age_z
#> 1      12      12     14        0        4      0      NA  0.4791992
#> 2      20      11     10        4        0      0      NA  1.0969170
#> 3      11      13      7        0        1      2       2  0.3556557
#> 4      10      15     12        2        0      3       3 -1.4974976
#> 5      12      15     19        2        1      2       2  0.6027428
#> 6      19       9      8        1        3      2       2  0.7262864
#>     c160age_z
#> 1  0.19002509
#> 2  0.04023278
#> 3  1.98753275
#> 4  1.16367507
#> 5 -0.48404028
#> 6  0.19002509

# new variables with same name will overwrite old variables
# in "efc". order of columns is not changed.
efc %>%
  select(e16sex, e42dep) %>%
  to_factor() %>%
  add_columns(efc) %>%
  head()
#>   c12hour e15relat e16sex e17age e42dep c82cop1 c83cop2 c84cop3 c85cop4 c86cop5
#> 1      16        2      2     83      3       3       2       2       2       1
#> 2     148        2      2     88      3       3       3       3       3       4
#> 3      70        1      2     82      3       2       2       1       4       1
#> 4     168        1      2     67      4       4       1       3       1       1
#> 5     168        2      2     84      4       3       2       1       2       2
#> 6      16        2      2     85      4       2       2       3       3       3
#>   c87cop6 c88cop7 c89cop8 c90cop9 c160age c161sex c172code c175empl barthtot
#> 1       1       2       3       3      56       2        2        1       75
#> 2       1       3       2       2      54       2        2        1       75
#> 3       1       1       4       3      80       1        1        0       35
#> 4       1       1       2       4      69       1        2        0        0
#> 5       2       1       4       4      47       2        2        0       25
#> 6       2       2       1       1      56       1        2        1       60
#>   neg_c_7 pos_v_4 quol_5 resttotn tot_sc_e n4pstu nur_pst
#> 1      12      12     14        0        4      0      NA
#> 2      20      11     10        4        0      0      NA
#> 3      11      13      7        0        1      2       2
#> 4      10      15     12        2        0      3       3
#> 5      12      15     19        2        1      2       2
#> 6      19       9      8        1        3      2       2

# keep both old and new variables, automatically
# rename variables with identical name
efc %>%
  select(e16sex, e42dep) %>%
  to_factor() %>%
  add_columns(efc, replace = FALSE) %>%
  head()
#> New names:
#> • `e16sex` -> `e16sex...3`
#> • `e42dep` -> `e42dep...5`
#> • `e16sex` -> `e16sex...27`
#> • `e42dep` -> `e42dep...28`
#>   c12hour e15relat e16sex...3 e17age e42dep...5 c82cop1 c83cop2 c84cop3 c85cop4
#> 1      16        2          2     83          3       3       2       2       2
#> 2     148        2          2     88          3       3       3       3       3
#> 3      70        1          2     82          3       2       2       1       4
#> 4     168        1          2     67          4       4       1       3       1
#> 5     168        2          2     84          4       3       2       1       2
#> 6      16        2          2     85          4       2       2       3       3
#>   c86cop5 c87cop6 c88cop7 c89cop8 c90cop9 c160age c161sex c172code c175empl
#> 1       1       1       2       3       3      56       2        2        1
#> 2       4       1       3       2       2      54       2        2        1
#> 3       1       1       1       4       3      80       1        1        0
#> 4       1       1       1       2       4      69       1        2        0
#> 5       2       2       1       4       4      47       2        2        0
#> 6       3       2       2       1       1      56       1        2        1
#>   barthtot neg_c_7 pos_v_4 quol_5 resttotn tot_sc_e n4pstu nur_pst e16sex...27
#> 1       75      12      12     14        0        4      0      NA           2
#> 2       75      20      11     10        4        0      0      NA           2
#> 3       35      11      13      7        0        1      2       2           2
#> 4        0      10      15     12        2        0      3       3           2
#> 5       25      12      15     19        2        1      2       2           2
#> 6       60      19       9      8        1        3      2       2           2
#>   e42dep...28
#> 1           3
#> 2           3
#> 3           3
#> 4           4
#> 5           4
#> 6           4

# create sample data frames
d1 <- efc[, 1:10]
d2 <- efc[, 2:3]
d3 <- efc[, 7:8]
d4 <- efc[, 10:12]

# show original
head(d1)
#>   c12hour e15relat e16sex e17age e42dep c82cop1 c83cop2 c84cop3 c85cop4 c86cop5
#> 1      16        2      2     83      3       3       2       2       2       1
#> 2     148        2      2     88      3       3       3       3       3       4
#> 3      70        1      2     82      3       2       2       1       4       1
#> 4     168        1      2     67      4       4       1       3       1       1
#> 5     168        2      2     84      4       3       2       1       2       2
#> 6      16        2      2     85      4       2       2       3       3       3

library(sjlabelled)
#> 
#> Attaching package: ‘sjlabelled’
#> The following object is masked from ‘package:dplyr’:
#> 
#>     as_label
# slightly change variables, to see effect
d2 <- as_label(d2)
d3 <- as_label(d3)

# replace duplicated columns, append remaining
replace_columns(d1, d2, d3, d4) %>% head()
#>   c12hour       e15relat e16sex e17age e42dep c82cop1   c83cop2   c84cop3
#> 1      16          child female     83      3       3 Sometimes Sometimes
#> 2     148          child female     88      3       3     Often     Often
#> 3      70 spouse/partner female     82      3       2 Sometimes     Never
#> 4     168 spouse/partner female     67      4       4     Never     Often
#> 5     168          child female     84      4       3 Sometimes     Never
#> 6      16          child female     85      4       2 Sometimes     Often
#>   c85cop4 c86cop5 c87cop6 c88cop7
#> 1       2       1       1       2
#> 2       3       4       1       3
#> 3       4       1       1       1
#> 4       1       1       1       1
#> 5       2       2       2       1
#> 6       3       3       2       2

# replace duplicated columns, omit remaining
replace_columns(d1, d2, d3, d4, add.unique = FALSE) %>% head()
#>   c12hour       e15relat e16sex e17age e42dep c82cop1   c83cop2   c84cop3
#> 1      16          child female     83      3       3 Sometimes Sometimes
#> 2     148          child female     88      3       3     Often     Often
#> 3      70 spouse/partner female     82      3       2 Sometimes     Never
#> 4     168 spouse/partner female     67      4       4     Never     Often
#> 5     168          child female     84      4       3 Sometimes     Never
#> 6      16          child female     85      4       2 Sometimes     Often
#>   c85cop4 c86cop5
#> 1       2       1
#> 2       3       4
#> 3       4       1
#> 4       1       1
#> 5       2       2
#> 6       3       3

# add ID to dataset
library(dplyr)
data(mtcars)
add_id(mtcars)
#>                     ID  mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4            1 21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag        2 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710           3 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive       4 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout    5 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
#> Valiant              6 18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
#> Duster 360           7 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
#> Merc 240D            8 24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#> Merc 230             9 22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
#> Merc 280            10 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
#> Merc 280C           11 17.8   6 167.6 123 3.92 3.440 18.90  1  0    4    4
#> Merc 450SE          12 16.4   8 275.8 180 3.07 4.070 17.40  0  0    3    3
#> Merc 450SL          13 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
#> Merc 450SLC         14 15.2   8 275.8 180 3.07 3.780 18.00  0  0    3    3
#> Cadillac Fleetwood  15 10.4   8 472.0 205 2.93 5.250 17.98  0  0    3    4
#> Lincoln Continental 16 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
#> Chrysler Imperial   17 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
#> Fiat 128            18 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
#> Honda Civic         19 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
#> Toyota Corolla      20 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
#> Toyota Corona       21 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
#> Dodge Challenger    22 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
#> AMC Javelin         23 15.2   8 304.0 150 3.15 3.435 17.30  0  0    3    2
#> Camaro Z28          24 13.3   8 350.0 245 3.73 3.840 15.41  0  0    3    4
#> Pontiac Firebird    25 19.2   8 400.0 175 3.08 3.845 17.05  0  0    3    2
#> Fiat X1-9           26 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
#> Porsche 914-2       27 26.0   4 120.3  91 4.43 2.140 16.70  0  1    5    2
#> Lotus Europa        28 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
#> Ford Pantera L      29 15.8   8 351.0 264 4.22 3.170 14.50  0  1    5    4
#> Ferrari Dino        30 19.7   6 145.0 175 3.62 2.770 15.50  0  1    5    6
#> Maserati Bora       31 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
#> Volvo 142E          32 21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2

mtcars %>%
  group_by(gear) %>%
  add_id() %>%
  arrange(gear, ID) %>%
  print(n = 100)
#> # A tibble: 32 × 12
#>       ID   mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
#>    <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1     1  21.4     6 258     110  3.08  3.22  19.4     1     0     3     1
#>  2     2  18.7     8 360     175  3.15  3.44  17.0     0     0     3     2
#>  3     3  18.1     6 225     105  2.76  3.46  20.2     1     0     3     1
#>  4     4  14.3     8 360     245  3.21  3.57  15.8     0     0     3     4
#>  5     5  16.4     8 276.    180  3.07  4.07  17.4     0     0     3     3
#>  6     6  17.3     8 276.    180  3.07  3.73  17.6     0     0     3     3
#>  7     7  15.2     8 276.    180  3.07  3.78  18       0     0     3     3
#>  8     8  10.4     8 472     205  2.93  5.25  18.0     0     0     3     4
#>  9     9  10.4     8 460     215  3     5.42  17.8     0     0     3     4
#> 10    10  14.7     8 440     230  3.23  5.34  17.4     0     0     3     4
#> 11    11  21.5     4 120.     97  3.7   2.46  20.0     1     0     3     1
#> 12    12  15.5     8 318     150  2.76  3.52  16.9     0     0     3     2
#> 13    13  15.2     8 304     150  3.15  3.44  17.3     0     0     3     2
#> 14    14  13.3     8 350     245  3.73  3.84  15.4     0     0     3     4
#> 15    15  19.2     8 400     175  3.08  3.84  17.0     0     0     3     2
#> 16     1  21       6 160     110  3.9   2.62  16.5     0     1     4     4
#> 17     2  21       6 160     110  3.9   2.88  17.0     0     1     4     4
#> 18     3  22.8     4 108      93  3.85  2.32  18.6     1     1     4     1
#> 19     4  24.4     4 147.     62  3.69  3.19  20       1     0     4     2
#> 20     5  22.8     4 141.     95  3.92  3.15  22.9     1     0     4     2
#> 21     6  19.2     6 168.    123  3.92  3.44  18.3     1     0     4     4
#> 22     7  17.8     6 168.    123  3.92  3.44  18.9     1     0     4     4
#> 23     8  32.4     4  78.7    66  4.08  2.2   19.5     1     1     4     1
#> 24     9  30.4     4  75.7    52  4.93  1.62  18.5     1     1     4     2
#> 25    10  33.9     4  71.1    65  4.22  1.84  19.9     1     1     4     1
#> 26    11  27.3     4  79      66  4.08  1.94  18.9     1     1     4     1
#> 27    12  21.4     4 121     109  4.11  2.78  18.6     1     1     4     2
#> 28     1  26       4 120.     91  4.43  2.14  16.7     0     1     5     2
#> 29     2  30.4     4  95.1   113  3.77  1.51  16.9     1     1     5     2
#> 30     3  15.8     8 351     264  4.22  3.17  14.5     0     1     5     4
#> 31     4  19.7     6 145     175  3.62  2.77  15.5     0     1     5     6
#> 32     5  15       8 301     335  3.54  3.57  14.6     0     1     5     8