Data preparation is a common task in research, which usually takes the most amount of time in the analytical process. Packages for data preparation have been released recently as part of the tidyverse, focussing on the transformation of data sets. Packages with special focus on transformation of variables, which fit into the workflow and design-philosophy of the tidyverse, are missing.
sjmisc tries to fill this gap. Basically, this package complements the dplyr package in that sjmisc takes over data transformation tasks on variables, like recoding, dichotomizing or grouping variables, setting and replacing missing values, etc. A distinctive feature of sjmisc is the support for labelled data, which is especially useful for users who often work with data sets from other statistical software packages like SPSS or Stata.
The functions of sjmisc are designed to work together seamlessly with other packes from the tidyverse, like dplyr. For instance, you can use the functions from sjmisc both within a pipe-workflow to manipulate data frames, or to create new variables with
vignette("design_philosophy", "sjmisc") for more details.
Please follow this guide if you like to contribute to this package.
To install the latest development snapshot (see latest changes below), type following commands into the R console:
Please note the package dependencies when installing from GitHub. The GitHub version of this package may depend on latest GitHub versions of my other packages, so you may need to install those first, if you encounter any problems. Here’s the order for installing packages from GitHub:
For more examples, see package vignettes (
Please visit https://strengejacke.github.io/sjmisc/ for documentation and vignettes.
In case you want / have to cite my package, please cite as (see also
Lüdecke D (2018). sjmisc: Data and Variable Transformation Functions. Journal of Open Source Software, 3(26), 754. doi: 10.21105/joss.00754