tidyverse-friendly functions for counting things in R. Formerly part of the janitor package.
Installation
You can install the development version of tabyl from GitHub with:
# install.packages("pak")
pak::pak("sfirke/tabyl")
Exploring
Tabulating tools
A variable (or combinations of two or three variables) can be tabulated with tabyl(). The resulting data.frame can be tweaked and formatted with the suite of adorn_ functions for quick analysis and printing of pretty results in a report. adorn_ functions can be helpful with non-tabyls, too.
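As a rough illustration of that last point, the sketch below applies adorn_totals() to an ordinary data.frame that did not come from tabyl(). The sales data and its column names are made up for this example, and the code assumes the package attaches as tabyl and exports the adorn_ functions described in this README.

```r
library(tabyl)  # assumption: the package attaches under this name
library(dplyr)  # provides the %>% pipe

# Hypothetical data: a plain data.frame, not the result of tabyl()
sales <- data.frame(
  region = c("East", "West", "North"),
  q1 = c(10, 20, 30),
  q2 = c(15, 25, 35)
)

# Append a row that sums each numeric column
with_totals <- sales %>%
  adorn_totals("row")

with_totals
```

The first column is treated as the row label, so it should be a character or factor column rather than a number.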
tabyl()
Like table(), but pipe-able, data.frame-based, and fully featured.
tabyl() can be called two ways:
- On a vector, when tabulating a single variable: tabyl(roster$subject)
- On a data.frame, specifying 1, 2, or 3 variable names to tabulate: roster %>% tabyl(subject, employee_status)
  - Here the data.frame is passed in with the %>% pipe; this allows tabyl to be used in an analysis pipeline.
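The two calling styles can be tried on a small stand-in data set. The roster below is hypothetical (this README does not define it), and the sketch assumes the package attaches as tabyl and that dplyr is installed for the pipe.

```r
library(tabyl)  # assumption: the package attaches under this name
library(dplyr)  # provides the %>% pipe and filter()

# Hypothetical stand-in for the roster data used in this README
roster <- data.frame(
  subject = c("Math", "Science", "Math", "English", NA),
  employee_status = c("Teacher", "Teacher", "Coach", "Teacher", "Administration"),
  full_time = c("Yes", "Yes", "No", "Yes", "Yes"),
  hire_date = as.Date(c("2001-08-15", "1995-09-01", "2010-01-10",
                        "1988-08-20", "2015-07-01"))
)

# Style 1: pass a vector directly
by_vector <- tabyl(roster$subject)

# Style 2: pass a data.frame and name the column(s) to tabulate
by_df <- roster %>% tabyl(subject)
two_way <- roster %>% tabyl(subject, employee_status)

by_vector
two_way
```

Both one-variable calls count the same values; the data.frame form is the one that slots into a longer pipeline.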
One variable:
roster %>%
  tabyl(subject)
Two variables:
roster %>%
  filter(hire_date > as.Date("1950-01-01")) %>%
  tabyl(employee_status, full_time)
Three variables:
roster %>%
  tabyl(full_time, subject, employee_status, show_missing_levels = FALSE)
Adorning tabyls
The adorn_ functions dress up the results of these tabulation calls for fast, basic reporting. Here are some of the functions that augment a summary table for reporting:
roster %>%
  tabyl(employee_status, full_time) %>%
  adorn_totals("row") %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting() %>%
  adorn_ns() %>%
  adorn_title("combined")
Pipe that right into knitr::kable() in your RMarkdown report.
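A minimal sketch of that last step, using made-up staff data in place of the roster. It assumes the package attaches as tabyl and that knitr and dplyr are installed.

```r
library(tabyl)  # assumption: the package attaches under this name
library(dplyr)  # provides the %>% pipe

# Made-up data standing in for the roster used in this README
staff <- data.frame(
  employee_status = c("Teacher", "Teacher", "Coach", "Administration"),
  full_time = c("Yes", "No", "No", "Yes")
)

# Build a two-way table, add a totals row, and render it as a table for a report
report_table <- staff %>%
  tabyl(employee_status, full_time) %>%
  adorn_totals("row") %>%
  knitr::kable()

report_table
```

Inside an RMarkdown chunk, the kable() result prints as a formatted table in the knitted output.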
These modular adornments can be layered to reduce R’s deficit against Excel and SPSS when it comes to quick, informative counts. Learn more about tabyl() and the adorn_ functions from the tabyls vignette.
Count factor levels in groups of high, medium, and low with top_levels()
Originally designed for use with Likert survey data stored as factors. Returns a tbl_df frequency table with appropriately-named rows, grouped into head/middle/tail groups.
- Takes a user-specified size for the head/tail groups
- Automatically calculates a percent column
- Supports sorting
- Can show or hide NA values.
f <- factor(
  c("strongly agree", "agree", "neutral", "neutral", "disagree", "strongly agree"),
  levels = c("strongly agree", "agree", "neutral", "disagree", "strongly disagree")
)
top_levels(f)
top_levels(f, n = 1)