Introduction

sjmisc-package

Data and Variable Transformation Functions

Recode and Data Transformation Functions

rec() rec_if()

Recode variables

recode_to() recode_to_if()

Recode variable categories into new values

dicho() dicho_if()

Dichotomize variables

group_var() group_var_if() group_labels() group_labels_if()

Recode numeric variables into equal-ranged groups

group_str()

Group near elements of string vectors

split_var() split_var_if()

Split numeric variables into smaller groups

std() std_if() center() center_if()

Standardize and center variables

Data Frame Transformation Functions

merge_imputations()

Merges multiple imputed data frames into a single data frame

rotate_df()

Rotate a data frame

Adding, Replacing and Removing Variables or Cases from Data Frames

add_columns() replace_columns() add_id()

Add or replace data frame columns

add_rows() merge_df()

Merge labelled data frames

add_variables() add_case()

Add variables or cases to data frames

move_columns()

Move columns to other positions in a data frame

empty_cols() empty_rows() remove_empty_cols() remove_empty_rows()

Return or remove variables or observations that are completely missing

remove_var() remove_cols()

Remove variables from a data frame

var_rename() rename_variables() rename_columns()

Rename variables

Variable Transformation

numeric_to_factor()

Convert numeric vectors into factors with associated value labels

reexports

Objects exported from other packages

to_value()

Convert factors to numeric variables

Creating New Variables

de_mean()

Compute group-meaned and de-meaned variables

to_dummy()

Split (categorical) vectors into dummy variables

Variable Utility Functions

ref_lvl()

Change reference level of (numeric) factors

round_num()

Round numeric variables in a data frame

shorten_string()

Shorten character strings

zap_inf()

Convert infinite or NaN values into regular NA

Missing Values

replace_na()

Replace NA with specific values

set_na_if()

Replace specific values in vector with NA

Finding Variables and Values

find_var() find_in_data()

Find variable by name or label

str_contains()

Check if string contains pattern

str_start() str_end()

Find start and end index of pattern in string

str_find()

Find partial matching and close distance elements in strings

Descriptive and Summary Statistics

count_na()

Frequency table of tagged NA values

descr()

Basic descriptive statistics

flat_table()

Flat (proportional) tables

frq()

Frequency table of labelled variables

row_count() col_count()

Count row or column indices

row_sums() row_means() total_mean()

Row sums and means for data frames

Properties of Data Frames or Variables

all_na()

Check if vector only has NA values

empty_cols() empty_rows() remove_empty_cols() remove_empty_rows()

Return or remove variables or observations that are completely missing

has_na() incomplete_cases() complete_cases() complete_vars() incomplete_vars()

Check if variables or cases have missing / infinite values

Check Variables

is_crossed() is_nested() is_cross_classified()

Check whether two factors are crossed or nested

is_empty()

Check whether string, list or vector is empty

is_even() is_odd()

Check whether value is even or odd

is_float() is_whole()

Check if a variable is of (non-integer) double type or a whole number

is_num_fac() is_num_chr()

Check whether a factor has numeric levels only

Utility Functions

`%nin%`

Value matching

big_mark() prcn()

Format numbers

rec_pattern()

Create recode pattern for 'rec' function

reshape_longer()

Reshape data into long format

seq_col() seq_row()

Sequence generation for column or row counts of data frames

spread_coef()

Spread model coefficients of list-variables into columns

tidy_values() clean_values()

Clean values of character vectors

to_long()

Convert wide data to long format

trim()

Trim leading and trailing whitespace from strings

typical_value()

Return the typical value of a vector

var_type()

Determine variable type

word_wrap()

Insert line breaks in long labels

Sample Data

efc

Sample dataset from the EUROFAMCARE project