This is a method for the dplyr::filter() generic.
See the "Fallbacks" section for differences in implementation.
The filter() function is used to subset a data frame, retaining all rows that satisfy your conditions.
To be retained, the row must produce a value of TRUE for all conditions.
Note that when a condition evaluates to NA the row will be dropped, unlike base subsetting with [.
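The NA behavior described above can be illustrated with a minimal sketch. It assumes dplyr is attached and uses a plain data frame (the name df and its contents are made up for illustration); the duckplyr_df method is expected to drop such rows in the same way.

```r
library(dplyr)

# Hypothetical data: the second row has a missing value
df <- data.frame(x = c(1, NA, 3))

# filter() drops the row where the condition evaluates to NA
kept_by_filter <- filter(df, x > 1)

# Base subsetting with [ keeps that row, filled with NA
kept_by_base <- df[df$x > 1, , drop = FALSE]

nrow(kept_by_filter)
nrow(kept_by_base)
```

If rows with a missing condition should be kept, the condition can state that explicitly, for example x > 1 | is.na(x).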
Usage
# S3 method for class 'duckplyr_df'
filter(.data, ..., .by = NULL, .preserve = FALSE)
Arguments
- .data
A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). See Methods, below, for more details.
- ...
<data-masking> Expressions that return a logical value, and are defined in terms of the variables in .data. If multiple expressions are included, they are combined with the & operator. Only rows for which all conditions evaluate to TRUE are kept.
- .by
<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.
- .preserve
Relevant when the .data input is grouped. If .preserve = FALSE (the default), the grouping structure is recalculated based on the resulting data, otherwise the grouping is kept as is.
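As a small sketch of how several expressions passed through ... combine, the two calls below should select the same rows. It assumes duckplyr is installed and attached, and reuses the data from the Examples section; the variable names are only for illustration.

```r
library(duckplyr)

df <- duckdb_tibble(x = 1:3, y = 3:1)

# Two separate conditions: a row is kept only if both are TRUE
both_args <- filter(df, x >= 2, y >= 2)

# The same selection written as one condition with &
one_arg <- filter(df, x >= 2 & y >= 2)

both_args
one_arg
```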
Fallbacks
There is no DuckDB translation in filter.duckplyr_df() with no filter conditions, nor for a grouped operation (if .by is set).
These features fall back to dplyr::filter(), see vignette("fallback") for details.
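The difference can be sketched as follows, assuming duckplyr is installed and attached. The column names x and g are made up for illustration. Both calls return a result; according to the description above, only the first is translated to DuckDB, while the second is handled by dplyr::filter().

```r
library(duckplyr)

df <- duckdb_tibble(x = 1:4, g = c("a", "a", "b", "b"))

# Plain condition: eligible for DuckDB translation
plain <- filter(df, x >= 2)

# .by is set: per the section above, this falls back to dplyr::filter()
grouped <- filter(df, x == max(x), .by = g)

plain
grouped
```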
Examples
df <- duckdb_tibble(x = 1:3, y = 3:1)
filter(df, x >= 2)
#> # A duckplyr data frame: 2 variables
#>       x     y
#>   <int> <int>
#> 1     2     2
#> 2     3     1