textrecipes

Extra recipes for Text Processing

R

textrecipes is an R package that provides preprocessing steps for text data within the recipes framework from the tidymodels ecosystem. It extends the recipes package with specialized functions for handling character variables in machine learning workflows.

The package offers a consistent, pipeable interface for common text preprocessing tasks such as tokenization, stop word removal, token filtering, and TF-IDF weighting. Because each task is a recipe step, text columns can be preprocessed alongside other variable types in a single recipe and passed to the rest of the tidymodels suite. Steps are modular and can be chained in sequence, which makes it straightforward to build reproducible text preprocessing pipelines for modeling.
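As an illustration of how such steps chain together, here is a minimal sketch that tokenizes a text column, removes stop words, keeps the most frequent tokens, and computes TF-IDF features. The data frame `reviews` and its columns `text` and `label` are hypothetical and exist only for this example. The sketch assumes R 4.1 or later (for the native pipe) and that the stopwords package is installed, since `step_stopwords()` relies on it.

```r
library(recipes)
library(textrecipes)

# Hypothetical toy data; `text` and `label` are illustrative names,
# not part of the package.
reviews <- data.frame(
  text = c(
    "The plot was slow but the acting was great",
    "A great film with a great cast",
    "The pacing was slow and the ending was weak"
  ),
  label = factor(c("pos", "pos", "neg")),
  stringsAsFactors = FALSE
)

text_rec <- recipe(label ~ text, data = reviews) |>
  step_tokenize(text) |>                    # split each document into word tokens
  step_stopwords(text) |>                   # drop common English stop words
  step_tokenfilter(text, max_tokens = 5) |> # keep at most the 5 most frequent tokens
  step_tfidf(text)                          # one TF-IDF column per retained token

# Estimate the steps on the training data, then return the processed data.
prepped  <- prep(text_rec)
features <- bake(prepped, new_data = NULL)

names(features)
```

The order matters here: the text column must be tokenized before the stop word, filtering, and TF-IDF steps can operate on it. The `max_tokens = 5` value is arbitrary and chosen only to keep the toy output small.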
