This is a method for the dplyr::filter() generic.
See the "Fallbacks" section for differences in implementation.
The filter() function is used to subset a data frame,
retaining all rows that satisfy your conditions.
To be retained, the row must produce a value of TRUE for all conditions.
Note that when a condition evaluates to NA the row will be dropped,
unlike base subsetting with [.
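The NA behaviour described above can be illustrated with a minimal sketch. It assumes a plain data frame and the dplyr generic rather than a duckplyr data frame; the object names `df`, `kept` and `base_kept` are made up for illustration.

```r
library(dplyr)

# A plain data frame with a missing value in x
df <- data.frame(x = c(1, NA, 3))

# filter(): the condition is NA for the second row, so that row is dropped
kept <- filter(df, x > 1)
nrow(kept)
#> [1] 1

# Base subsetting with [: the NA condition yields a row of NAs instead
base_kept <- df[df$x > 1, , drop = FALSE]
nrow(base_kept)
#> [1] 2
```

To keep rows with missing values, the condition has to say so explicitly, for example `filter(df, x > 1 | is.na(x))`.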
Usage
# S3 method for class 'duckplyr_df'
filter(.data, ..., .by = NULL, .preserve = FALSE)
Arguments
- .data
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
- ...
<data-masking> Expressions that return a logical value, and are defined in terms of the variables in .data. If multiple expressions are included, they are combined with the & operator. Only rows for which all conditions evaluate to TRUE are kept.
- .by
<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.
- .preserve
Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.
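As a small sketch of how multiple expressions in `...` are combined, the following compares two comma-separated conditions with a single condition joined by `&`. It uses a plain data frame with the dplyr generic; the names `df`, `separate` and `combined` are illustrative only.

```r
library(dplyr)

df <- data.frame(x = 1:3, y = 3:1)

# Two separate conditions: a row is kept only if both are TRUE
separate <- filter(df, x >= 2, y >= 2)

# The same filter written as one condition using &
combined <- filter(df, x >= 2 & y >= 2)

# Both forms keep the same single row (x = 2, y = 2)
identical(separate, combined)
#> [1] TRUE
```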
Fallbacks
There is no DuckDB translation in filter.duckplyr_df()
with no filter conditions,
nor for a grouped operation (if .by is set).
These features fall back to dplyr::filter(); see vignette("fallback") for details.
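The two fallback cases listed above might look like this in practice. This is a sketch only: it assumes duckplyr is installed with a dplyr version that supports `.by`, and the grouping column `g` and the result names are made up for illustration. The result should match what dplyr::filter() returns for the same input.

```r
library(duckplyr)

df <- duckdb_tibble(g = c("a", "a", "b"), x = 1:3)

# .by is set: per this section, handled by dplyr::filter(), not by DuckDB
by_group <- filter(df, x == max(x), .by = g)

# No filter conditions: also handled by the fallback; every row is kept
no_conditions <- filter(df)
```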
Examples
df <- duckdb_tibble(x = 1:3, y = 3:1)
filter(df, x >= 2)
#> # A duckplyr data frame: 2 variables
#>       x     y
#>   <int> <int>
#> 1     2     2
#> 2     3     1
