duckplyr 1.1.2 (2025-09-17)
Features
Fully support dd::...() syntax (#795).
Threshold for prudence = "thrifty" is reduced to 1000 cells when the data comes from a remote data source.
Support named arguments for dd::...() functions.
duckplyr 1.1.0 (2025-05-08)
CRAN release: 2025-05-08
This release improves compatibility with dbplyr and DuckDB. See vignette("duckdb") for details.
Features
Pass functions prefixed with dd$ directly to DuckDB, e.g., dd$ROW() will be translated as DuckDB’s ROW() function (#658).
New as_tbl() to convert to a dbplyr tbl object (#634, #685).
Register Ark methods for Positron’s “Variables” pane (@DavisVaughan, #661, #678). DuckDB tibbles are no longer displayed as data frames in the “Variables” pane due to a limitation in Positron. Use collect() to convert them to data frames if you rely on the viewer functionality.
Translate n_distinct() as macro with support for na.rm = TRUE (@joakimlinde, #572, #655).
Translate coalesce().
compute() does not have a fallback, failures are reported to the client (#637).
Implement slice_head() (#640).
Bug fixes
Set functions like union() no longer trigger materialization (#654, #692).
Joins no longer materialize the input data when the package is used with methods_overwrite() or library(duckplyr) (#641).
Correct formatting for controlled fallbacks with Sys.setenv(DUCKPLYR_FALLBACK_INFO = TRUE).
Chore
Bump duckdb and pillar dependencies.
Use roxyglobals from CRAN rather than GitHub (@andreranza, #659).
Bring tools and patch up to date (@joakimlinde, #647).
Internal
rel_to_df() needs prudence argument (#644).
Fix sync scripts and add reproducible code (#639).
Check loadability of extensions in test (#636).
Documentation
Document slice_head() as supported.
Add Posit’s ROR ID (#592).
Add vignette("duckdb") (#690).
Add experimental badge.
Verbose conflict_prefer() (#667, #684).
Typos + clarification edits to “large” vignette (@mine-cetinkaya-rundel, #665).
duckplyr 1.0.1 (2025-02-21)
CRAN release: 2025-02-27
Documentation
Separate ?compute_parquet and ?compute_csv (#610, #622).
Italicize book title in README (@wibeasley, #607).
Fix typo in filter(.by = ...) error message (@maelle, #611).
duckplyr 1.0.0 (2025-02-02)
CRAN release: 2025-02-07
Features
Large data
Improved support for handling large data from files and S3: ingestion with read_parquet_duckdb() and others, and materialization with as_duckdb_tibble(), compute.duckplyr_df() and compute_file(). See vignette("large") for details.
Control automatic materialization of duckplyr frames with the new prudence argument to as_duckdb_tibble(), duckdb_tibble(), compute.duckplyr_df() and compute_file(). See vignette("prudence") for details.
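As a rough illustration of the two items above, the following sketch writes a small Parquet file, reads it back lazily, and materializes the result explicitly. It assumes that read_parquet_duckdb() accepts a prudence argument with the value "stingy" and that compute_parquet() takes the frame and a target path; the column names and the temporary file are made up for the example. vignette("large") and vignette("prudence") remain the authoritative descriptions.

```r
library(duckplyr)

# Write a tiny Parquet file so the sketch is self-contained.
# The columns id and value are hypothetical example data.
path <- tempfile(fileext = ".parquet")
small <- duckdb_tibble(id = 1:5, value = c(10, 20, 30, 40, 50))
written <- compute_parquet(small, path)

# Ingest the file lazily. With prudence = "stingy" (assumed here), the
# frame is not materialized automatically and needs an explicit collect().
big <- read_parquet_duckdb(path, prudence = "stingy")

# The filter is handed to DuckDB; collect() brings the rows into R.
filtered <- dplyr::filter(big, value > 25)
result <- dplyr::collect(filtered)

print(result)
```

The intermediate assignments avoid the base pipe, in line with the R 4.0.0 compatibility note further down in this release.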
New functions
read_csv_duckdb() and others, deprecating duckplyr_df_from_csv() and df_from_csv() (#210, #396, #459).
read_sql_duckdb() (experimental) to run SQL queries against the default DuckDB connection and return the result as a duckplyr frame (duckdb/duckdb-r#32, #397).
db_exec() to execute configuration queries against the default duckdb connection (#39, #165, #227, #404, #459).
duckdb_tibble() (#382, #457).
as_duckdb_tibble(), replaces as_duckplyr_tibble() and as_duckplyr_df() (#383, #457) and supports dbplyr connections to a duckdb database (#86, #211, #226).
compute_parquet() and compute_csv(), implement compute.duckplyr_df() (#409, #430).
fallback_config() to create a configuration file for the settings that do not affect behavior (#216, #426).
is_duckdb_tibble(), deprecates is_duckplyr_df() (#391, #392).
last_rel() to retrieve the last relation object used in materialization (#209, #375).
Add "prudent_duckplyr_df" class that stops automatic materialization and requires collect() (#381, #390).
Translations
Partial support for across() in mutate() and summarise() (#296, #306, #318, @lionel-, @DavisVaughan).
Implement na.rm handling for sum(), min(), max(), any() and all(), with fallback for window functions (#205, #566).
Handle dplyr::desc() (#550).
Avoid forwarding is.na() to is.nan() to support non-numeric data, avoid checking roundtrip for timestamp data (#482).
Correctly handle missing values in if_else().
Limit number of items that can be handled with %in% (#319).
duckdb_tibble() checks if columns can be represented in DuckDB (#537).
Fall back to dplyr when passing multiple with joins (#323).
Behavior
Depend on dplyr instead of reexporting all generics (#405). Nothing changes for users in scripts. When using duckplyr in a package, you now also need to import dplyr.
Fallback logging is now on by default, can be disabled with configuration (#422).
The default DuckDB connection is now based on a file; the location defaults to a subdirectory of tempdir() and can be controlled with the DUCKPLYR_TEMP_DIR environment variable (#439, #448, #561).
Documentation
New articles: vignette("large"), vignette("prudence"), vignette("fallback"), vignette("limits"), vignette("developers"), vignette("telemetry") (#207, #504).
New flights_df() used instead of palmerpenguins::penguins (#408).
Move to the tidyverse GitHub organization, new repository URL https://github.com/tidyverse/duckplyr/ (#225).
Avoid base pipe in examples for compatibility with R 4.0.0 (#463, #466).
duckplyr 0.4.1 (2024-07-11)
CRAN release: 2024-07-12
Features
- df_from_file() and related functions support multiple files (#194, #195), show a clear error message for non-string path arguments (#182), and create a tibble by default (#177).
- New as_duckplyr_tibble() to convert a data frame to a duckplyr tibble (#177).
- Support descending sort for character and other non-numeric data (@toppyy, #92, #175).
- Avoid setting memory limit (#193).
- Check compatibility of join columns (#168, #185).
- Explicitly list supported functions, add contributing guide, add analysis scripts for GitHub activity data (#179).
Documentation
- Add contributing guide (#179).
- Show a startup message at package load if telemetry is not configured (#188, #198).
- ?df_from_file shows how to read multiple files (#181, #186) and how to specify CSV column types (#140, #189), and is shown correctly in reference index (#173, #190).
- Discuss dbplyr in README (#145, #191).
- Add analysis scripts for GitHub activity data (#179).
duckplyr 0.4.0 (2024-05-21)
CRAN release: 2024-05-21
Features
- Use built-in rfuns extension to implement equality and inequality operators, improve translation for as.integer(), NA and %in% (#83, #154, #148, #155, #159, #160).
- Reexport non-deprecated dplyr functions (#144, #163).
- library(duckplyr) calls methods_overwrite() (#164).
- Only allow constant patterns in grepl().
- Explicitly reject calls with named arguments for now.
- Reduce default memory limit to 1 GB.
Bug fixes
- Stricter type checks in the set operations intersect(), setdiff(), symdiff(), union(), and union_all() (#169).
- Distinguish between constant NA and those used in an expression (#157).
- head(-1) forwards to the default implementation (#131, #156).
- Fix cli syntax for internal error message (#151).
- More careful detection of row names in data frame.
- Always check roundtrip for timestamp columns.
- left_join() and other join functions call auto_copy().
- Only reset expression depth if it has been set before.
- Require fallback if the result contains duplicate column names when ignoring case.
- row_number() returns integer.
- is.na(NaN) is TRUE.
- summarise(count = n(), count = n()) creates only one column named count.
- Correct wording in instructions for enabling fallback logging (@TimTaylor, #141).
Documentation
- Mention wildcards to read multiple files in ?df_from_file (@andreranza, #133, #134).
duckplyr 0.3.2 (2024-03-17)
CRAN release: 2024-03-17
Bug fixes
- Run autoupload in function so that it will be checked by static analysis (#122).
Features
- New df_to_parquet() to write to Parquet, new convenience functions df_from_csv(), duckdb_df_from_csv(), df_from_parquet() and duckdb_df_from_parquet() (#87, #89, #96, #128).
duckplyr 0.3.1 (2024-03-08)
CRAN release: 2024-03-10
Bug fixes
- Forbid reuse of new columns created in summarise() (#72, #106).
- summarise() no longer restores subclass.
- Disambiguate computation of log10() and log().
- Fix division by zero for positive and negative numbers.
Features
- New fallback_sitrep() and related functionality for collecting telemetry data (#102, #107, #110, #111, #115). No data is collected by default, only a message is displayed once per session and then every eight hours. Opt in or opt out by setting environment variables.
- Implement group_by() and other methods to collect fallback information (#94, #104, #105).
- Set memory limit and temporary directory for duckdb.
- Implement suppressWarnings() as the identity function.
- Prefer cli::cli_abort() over stop() or rlang::abort() (#114).
- Translate .data$a and .env$a.
- Strict checks for column class, only supporting integer, numeric, logical, Date, POSIXct, and difftime for now.
- If the environment variable DUCKPLYR_METHODS_OVERWRITE is set to TRUE, loading duckplyr automatically calls methods_overwrite().
Documentation
- methods_overwrite() and methods_restore() show a message.
duckplyr 0.3.0 (2023-12-10)
CRAN release: 2023-12-11
Bug fixes
- grepl(x = NA) gives correct results.
- Fix auto_copy() for non-data-frame input.
- Add output order preservation for filters.
- distinct() now preserves order in corner cases (#77, #78).
- Consistent computation of log(0) and log(-1) (#75, #76).
Documentation
- Separate and explain the new relational examples (@wibeasley, #84).
Chore
- Sync with dplyr 1.1.4 (#82).
- Remove dplyr_reconstruct() method (#48).
- Render README.
- Fix code generated by meta_replay().
- Bump constructive dependency.
- Fix output order for arrange() in case of ties.
- Update duckdb tests.
- Only implement newer slice_sample(), not sample_n() or sample_frac() (#74).
- Sync generated files (#71).
duckplyr 0.2.3 (2023-11-08)
CRAN release: 2023-11-08
Performance
- Join using IS NOT DISTINCT FROM for faster execution (duckdb/duckdb-r#41, #68).
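For context, IS NOT DISTINCT FROM is the SQL comparison that treats two NULL values as equal, which lines up with dplyr's default of matching NA to NA in joins. The sketch below only demonstrates the operator's semantics through DBI and the duckdb package on an in-memory connection; it does not show how duckplyr builds its join conditions, and the column aliases are made up for the example.

```r
library(DBI)

# In-memory DuckDB connection, used only for this illustration.
con <- dbConnect(duckdb::duckdb())

# Compare two NULL integers with "=" and with IS NOT DISTINCT FROM.
res <- dbGetQuery(con, "
  SELECT
    (CAST(NULL AS INTEGER) = CAST(NULL AS INTEGER)) AS plain_equals,
    (CAST(NULL AS INTEGER) IS NOT DISTINCT FROM CAST(NULL AS INTEGER)) AS not_distinct
")

dbDisconnect(con, shutdown = TRUE)

# plain_equals comes back as NA (SQL NULL), not_distinct as TRUE.
print(res)
```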
duckplyr 0.2.2 (2023-10-16)
CRAN release: 2023-10-16
Bug fixes
summarise() keeps "duckplyr_df" class (#63, #64).
Fix compatibility with duckdb >= 0.9.1.
Chore
Skip tests that give different output on dev tidyselect.
Import utils::globalVariables().
duckplyr 0.2.1 (2023-09-16)
CRAN release: 2023-09-17
Improve documentation.
Work around problem with dplyr_reconstruct() in R 4.3.
Rename duckdb_from_file() to df_from_file().
Unexport private duckdb_rel_from_df(), rel_from_df(), wrap_df() and wrap_integer().
Reexport %>% and tibble().
duckplyr 0.1.0 (2023-07-03)
CRAN release: 2023-07-07
Chore
- Add CRAN install instructions.
- Satisfy R CMD check.
- Document argument.
- Error on NOTE.
- Remove relexpr_window() for now.
Uncategorized
Initial version, exporting:
- new_relational() to construct objects of class "relational"
- Generics rel_aggregate(), rel_distinct(), rel_filter(), rel_join(), rel_limit(), rel_names(), rel_order(), rel_project(), rel_set_diff(), rel_set_intersect(), rel_set_symdiff(), rel_to_df(), rel_union_all()
- new_relexpr() to construct objects of class "relational_relexpr"
- Expression builders relexpr_constant(), relexpr_function(), relexpr_reference(), relexpr_set_alias(), relexpr_window()
