
Removes all rows and/or columns from a data.frame or matrix that are composed entirely of NA values.


Usage

remove_empty(dat, which = c("rows", "cols"), cutoff = 1, quiet = TRUE)



Arguments

dat
the input data.frame or matrix.


which
one of "rows", "cols", or c("rows", "cols"). Where no value of which is provided, defaults to removing both empty rows and empty columns, declaring the behavior with a printed message.


cutoff
Under what fraction (>0 to <=1) of non-empty rows or columns should which be removed? Lower values keep more rows/columns, higher values drop more.


quiet
Should messages be suppressed (TRUE) or printed (FALSE) indicating the summary of empty columns or rows removed?


Value

Returns the object without its empty (all-NA) rows or columns.
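
To make "composed entirely of NA values" concrete, here is a minimal base-R sketch of the default behavior. The helper drop_all_na() and the data frame example_df are hypothetical names introduced only for illustration; this is not janitor's implementation, and it ignores the cutoff and quiet arguments.

```r
# Hypothetical helper (not part of janitor): drop rows and/or columns
# in which every value is NA.
drop_all_na <- function(dat, which = c("rows", "cols")) {
  if ("rows" %in% which) {
    # keep a row unless its NA count equals the number of columns
    keep_rows <- rowSums(is.na(dat)) != ncol(dat)
    dat <- dat[keep_rows, , drop = FALSE]
  }
  if ("cols" %in% which) {
    # keep a column unless its NA count equals the number of rows
    keep_cols <- colSums(is.na(dat)) != nrow(dat)
    dat <- dat[, keep_cols, drop = FALSE]
  }
  dat
}

# Illustrative data: row 2 and column b contain only NA
example_df <- data.frame(
  a = c(1, NA, 3),
  b = c(NA, NA, NA),
  c = c("x", NA, "z")
)

drop_all_na(example_df, "rows") # drops row 2, keeps all three columns
drop_all_na(example_df)         # drops row 2 and column b
```

Note that an empty string "" is not NA, so a row holding only blanks and NAs would be kept by this sketch.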

See also

remove_constant() for removing constant columns.

Other remove functions: remove_constant()


Examples

library(janitor)
library(dplyr) # tibble(), mutate(), across(), na_if(), %>%

# not run:
# dat %>% remove_empty("rows")
# addressing a common untidy-data scenario where we have a mixture of
# blank values in some (character) columns and NAs in others:
dd <- tibble(
  x = c(LETTERS[1:5], NA, rep("", 2)),
  y = c(1:5, rep(NA, 3))
)
# remove_empty() drops row 6 (all NA) but not 7 and 8 (blanks + NAs)
dd %>% remove_empty("rows")
#> # A tibble: 7 × 2
#>   x         y
#>   <chr> <int>
#> 1 "A"       1
#> 2 "B"       2
#> 3 "C"       3
#> 4 "D"       4
#> 5 "E"       5
#> 6 ""       NA
#> 7 ""       NA
# solution: preprocess to convert whitespace/empty strings to NA,
# _then_ remove empty (all-NA) rows
dd %>%
  mutate(across(where(is.character), ~ na_if(trimws(.), ""))) %>%
  remove_empty("rows")
#> # A tibble: 5 × 2
#>   x         y
#>   <chr> <int>
#> 1 A         1
#> 2 B         2
#> 3 C         3
#> 4 D         4
#> 5 E         5