

dplyr: A grammar of data manipulation
dplyr is an R package that provides a grammar of data manipulation: a small, consistent set of verbs for common data tasks. filter() keeps rows, select() keeps columns, mutate() creates new variables, arrange() sorts rows, and summarise() computes summaries. Each verb combines with group_by(), so the same operation can be computed separately for each category.
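As a rough illustration of how these verbs chain together, the sketch below runs them on mtcars, a dataset that ships with base R and is used here only as an example. The column names come from that dataset, and the hp > 100 threshold and the derived hp_per_wt variable are arbitrary choices for the illustration. It assumes R 4.1 or later for the native pipe.

```r
library(dplyr)

# mtcars is a built-in example dataset; the threshold and the
# derived column below are illustrative assumptions only.
result <- mtcars |>
  filter(hp > 100) |>                 # keep rows with more than 100 horsepower
  select(mpg, cyl, hp, wt) |>         # keep only the columns we need
  mutate(hp_per_wt = hp / wt) |>      # create a new variable
  group_by(cyl) |>                    # compute what follows per cylinder count
  summarise(
    n = n(),                          # number of cars in each group
    mean_mpg = mean(mpg),
    mean_hp_per_wt = mean(hp_per_wt),
    .groups = "drop"
  ) |>
  arrange(desc(mean_mpg))             # sort groups by fuel economy, best first

print(result)
```

The result has one row per cylinder count, which is the "calculation by category" pattern in its simplest form.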
dplyr's verbs are not limited to standard data frames. Companion backend packages translate the same code for other engines: dbplyr generates SQL for relational databases, dtplyr and duckplyr target large in-memory datasets through data.table and DuckDB, arrow handles larger-than-memory datasets, including files on cloud storage, through Apache Arrow, and sparklyr works with data stored in Apache Spark. This backend flexibility lets you use the same dplyr syntax whether your data fits in memory or requires specialized storage systems. The package integrates with other tidyverse tools for end-to-end data analysis workflows.
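To make the database case concrete, here is a minimal sketch that assumes the dbplyr, DBI, and RSQLite packages are installed. It copies mtcars into a temporary in-memory SQLite database (an assumption chosen so the example is self-contained, not a recommended setup) and runs ordinary dplyr verbs against the remote table.

```r
library(dplyr)
library(dbplyr)

# Assumption: an in-memory SQLite database stands in for a real database.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
cars_db <- copy_to(con, mtcars, "mtcars")   # upload the example data

# The same verbs as for a local data frame; nothing runs in the database yet.
query <- cars_db |>
  filter(hp > 100) |>
  group_by(cyl) |>
  summarise(mean_mpg = mean(mpg, na.rm = TRUE))

show_query(query)             # print the SQL generated from the dplyr code
db_result <- collect(query)   # execute the query and return a local tibble

DBI::dbDisconnect(con)
```

The query is built lazily: the work happens in the database, and only collect() brings the summarised rows back into R.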
Contributors

Hadley Wickham

Lionel Henry

Davis Vaughan

Mine Çetinkaya-Rundel

Jenny Bryan

Christophe Dervieux

Gábor Csárdi

Simon Couch

Neal Richardson

Tomasz Kalinowski

Charlotte Wickham

Carson Sievert

Barret Schloerke

Jeroen Janssens


