
Cleaning data

Cleaning variable names

clean_names()
Cleans names of an object (usually a data.frame).
make_clean_names()
Cleans a vector of text, typically containing the names of an object.
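
As a minimal sketch of the two functions above, the following cleans the names of a small data.frame and then a plain character vector. The data frame `df` and its column names are made up for illustration.

```r
library(janitor)

# Hypothetical data frame with awkward column names
df <- data.frame(
  `First Name` = c("Ada", "Grace"),
  `Total Cost` = c(10.5, 20),
  check.names = FALSE
)

# clean_names() works on the object and returns it with tidied names
cleaned <- clean_names(df)
names(cleaned)
# "first_name" "total_cost"

# make_clean_names() works directly on a character vector
new_names <- make_clean_names(c("First Name", "Total Cost"))
new_names
# "first_name" "total_cost"
```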

Exploring data

tabyls are an enhanced version of tables. See vignette("tabyls") for more details.

tabyl()
Generate a frequency table (1-, 2-, or 3-way).
adorn_ns()
Add underlying Ns to a tabyl displaying percentages.
adorn_pct_formatting()
Format a data.frame of decimals as percentages.
adorn_percentages()
Convert a data.frame of counts to percentages.
adorn_rounding()
Round the numeric columns in a data.frame.
adorn_title()
Add column name to the top of a two-way tabyl.
adorn_totals()
Append a totals row and/or column to a data.frame.
as_tabyl()
Add tabyl attributes to a data.frame.
untabyl()
Remove tabyl attributes from a data.frame.
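
The functions in this group are typically chained. The sketch below is one possible ordering, using the `mtcars` dataset that ships with R. Intermediate variables are used instead of a pipe so each step is visible; the argument values are illustrative choices, not required settings.

```r
library(janitor)

# Two-way frequency table of cylinders by gears
tab <- tabyl(mtcars, cyl, gear)

# Add a totals row, then convert counts to row-wise percentages
with_totals <- adorn_totals(tab, where = "row")
as_pct <- adorn_percentages(with_totals, denominator = "row")

# Format as percentage strings and append the underlying counts
formatted <- adorn_pct_formatting(as_pct, digits = 1)
with_ns <- adorn_ns(formatted)

# Put the column variable name into the header
final <- adorn_title(with_ns, placement = "combined")
final

# Strip the tabyl attributes to get back a plain data.frame
plain <- untabyl(tab)
```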

Change order

row_to_names()
Elevate a row to be the column names of a data.frame.
find_header()
Find the header row in a data.frame.
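
A small sketch of `row_to_names()`, assuming a hypothetical data.frame `raw` whose real header ended up in the first data row (as often happens with spreadsheet imports):

```r
library(janitor)

# Hypothetical import where the header landed in row 1
raw <- data.frame(
  X1 = c("id", "1", "2"),
  X2 = c("name", "a", "b")
)

# Promote row 1 to column names; by default that row is then dropped
fixed <- row_to_names(raw, row_number = 1)
names(fixed)
# "id" "name"
```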

Comparison

Compare data frame columns

compare_df_cols()
Compare data frame columns before merging.
compare_df_cols_same()
Do the data.frames have the same columns & types?
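
A brief sketch of how the two comparison functions might be used before binding rows. The data frames `a` and `b` are invented for illustration; `id` is deliberately an integer in one and a character in the other.

```r
library(janitor)

# Two hypothetical data frames with a type mismatch in `id`
a <- data.frame(id = 1:2, value = c(1.5, 2.5))
b <- data.frame(id = c("1", "2"), value = c(3, 4))

# One row per column name, one column per input showing its class
comparison <- compare_df_cols(a, b)
comparison

# Single TRUE/FALSE answer: could these be bound together cleanly?
same <- compare_df_cols_same(a, b)
same
# FALSE
```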

Removing unnecessary columns / rows

remove_constant()
Remove constant columns from a data.frame or matrix.
remove_empty()
Remove empty rows and/or columns from a data.frame or matrix.
get_dupes()
Get rows of a data.frame with identical values for the specified variables.
get_one_to_one()
Find the list of columns that have a 1:1 mapping to each other.
top_levels()
Generate a frequency table of a factor grouped into top-n, bottom-n, and all other levels.
single_value()
Ensure that a vector has only a single value throughout.
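
To illustrate a few of the functions above, here is a minimal sketch on a made-up data.frame `df` containing an all-`NA` row, an all-`NA` column, a constant column, and a duplicated `id`:

```r
library(janitor)

# Hypothetical messy data
df <- data.frame(
  id    = c(1, 2, 2, NA),
  group = c("a", "b", "b", NA),
  site  = c("x", "x", "x", NA),
  empty = c(NA, NA, NA, NA)
)

# Drop rows and columns that are entirely NA
trimmed <- remove_empty(df, which = c("rows", "cols"))

# Drop columns holding a single constant value (here, `site`)
varying <- remove_constant(trimmed)
names(varying)
# "id" "group"

# Return the rows that share the same `id`, with a count of duplicates
dupes <- get_dupes(trimmed, id)
dupes
```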

Rounding / dates helpers

These helpers mimic some behaviour from Excel or SAS. They are meant to be used on vectors.

round_half_up()
Round a numeric vector; halves will be rounded up, à la Microsoft Excel.
signif_half_up()
Round a numeric vector to the specified number of significant digits; halves will be rounded up.
round_to_fraction()
Round to the nearest fraction of a specified denominator.
excel_numeric_to_date()
Convert dates encoded as serial numbers to Date class.
sas_numeric_to_date()
Convert a SAS date, time or date/time to an R object.
excel_time_to_numeric()
Convert a time that may be inconsistently or inconveniently formatted from Microsoft Excel to a numeric number of seconds between 0 and 86400.
convert_to_date() convert_to_datetime()
Parse dates from many formats.
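
The sketch below contrasts base R's `round()`, which rounds halves to the nearest even number, with `round_half_up()`, and shows an Excel serial date conversion. The input values are arbitrary examples, and the date conversion assumes the default (modern) Excel date system.

```r
library(janitor)

# Base R rounds halves to even; round_half_up() rounds them up
round(2.5)
# 2
round_half_up(2.5)
# 3
round_half_up(c(0.5, 1.5, 2.5))
# 1 2 3

# Round to the nearest quarter
round_to_fraction(0.7, denominator = 4)
# 0.75

# Excel stores dates as serial numbers
excel_numeric_to_date(40000)
# "2009-07-06"
```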

Misc / helpers

These functions can help perform less frequent operations.

describe_class()
Describe the class(es) of an object.
paste_skip_na()
Like paste(), but missing values are omitted.
chisq.test()
Apply stats::chisq.test() to a two-way tabyl.
fisher.test()
Apply stats::fisher.test() to a two-way tabyl.
mu_to_u
Constant to help map from mu to u.