The behavior of duckplyr can be fine-tuned with several environment variables and one option.
Options
duckdb.materialize_message
: Set to FALSE to turn off diagnostic output from duckdb on data frame materialization. Currently set to TRUE when duckplyr is loaded.
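For code that should not depend on rlang or withr, the option can also be toggled with base R alone. The following is a minimal sketch that relies only on `options()` and `getOption()`; the variable names `old` and `during` are illustrative and not part of duckplyr.

```r
# options() returns the previous values invisibly, so they can be restored later.
old <- options(duckdb.materialize_message = FALSE)

# Read the option back to confirm the value that is now in effect.
during <- getOption("duckdb.materialize_message")

# ... run duckplyr pipelines here ...

# Restore whatever was set before, including the "unset" state.
options(old)
```

Restoring through the list returned by `options()` also covers the case where the option was never set, because the saved entry is then `NULL`.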
Environment variables
DUCKPLYR_OUTPUT_ORDER
: If TRUE, row output order is preserved. By default, duckplyr may change the row order in cases where dplyr would keep it stable.
DUCKPLYR_FORCE
: If TRUE, fail if duckdb cannot handle a request, instead of falling back to dplyr.
DUCKPLYR_FALLBACK_INFO
: If TRUE, print a message when a fallback to dplyr occurs because duckdb cannot handle a request.
DUCKPLYR_CHECK_ROUNDTRIP
: If TRUE, check if all columns are roundtripped perfectly when creating a relational object from a data frame. This is slow, and mostly useful for debugging. The default is to check roundtrip of attributes.
DUCKPLYR_EXPERIMENTAL
: If TRUE, pass experimental = TRUE to certain duckdb functions. Currently unused.
DUCKPLYR_METHODS_OVERWRITE
: If TRUE, call methods_overwrite() when the package is loaded.
See fallback for more options related to logging and uploading of fallback events.
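To see which of these variables are set in the current session, base R is enough. The sketch below lists the variables named on this page and shows that `Sys.setenv()` stores a logical `TRUE` as the string `"TRUE"`. The variable names `duckplyr_vars`, `current`, `old`, and `stored` are illustrative; how duckplyr itself reads the values is not shown here.

```r
# The environment variables documented on this page.
duckplyr_vars <- c(
  "DUCKPLYR_OUTPUT_ORDER",
  "DUCKPLYR_FORCE",
  "DUCKPLYR_FALLBACK_INFO",
  "DUCKPLYR_CHECK_ROUNDTRIP",
  "DUCKPLYR_EXPERIMENTAL",
  "DUCKPLYR_METHODS_OVERWRITE"
)

# Named character vector; NA marks variables that are not set.
current <- Sys.getenv(duckplyr_vars, unset = NA)
print(current)

# Environment variables are always strings: TRUE is coerced to "TRUE".
old <- Sys.getenv("DUCKPLYR_FALLBACK_INFO", unset = NA)
Sys.setenv(DUCKPLYR_FALLBACK_INFO = TRUE)
stored <- Sys.getenv("DUCKPLYR_FALLBACK_INFO")

# Put the variable back into its previous state.
if (is.na(old)) {
  Sys.unsetenv("DUCKPLYR_FALLBACK_INFO")
} else {
  Sys.setenv(DUCKPLYR_FALLBACK_INFO = old)
}
```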
Examples
# options(duckdb.materialize_message = FALSE)
data.frame(a = 3:1) %>%
  as_duckplyr_df() %>%
  inner_join(data.frame(a = 1:4), by = "a")
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [___coalesce(lhs.a_x, rhs.a_y) as a]
#> Join REGULAR INNER ___eq_na_matches_na(lhs.a_x, rhs.a_y)
#> Projection [a as a_x]
#> r_dataframe_scan(0x564a970a70e0)
#> Projection [a as a_y]
#> r_dataframe_scan(0x564a970b7d78)
#>
#> ---------------------
#> -- Result Columns --
#> ---------------------
#> - a (INTEGER)
#>
#> a
#> 1 1
#> 2 2
#> 3 3
rlang::with_options(duckdb.materialize_message = FALSE, {
  data.frame(a = 3:1) %>%
    as_duckplyr_df() %>%
    inner_join(data.frame(a = 1:4), by = "a") %>%
    print()
})
#> a
#> 1 1
#> 2 2
#> 3 3
# Sys.setenv(DUCKPLYR_OUTPUT_ORDER = TRUE)
data.frame(a = 3:1) %>%
  as_duckplyr_df() %>%
  inner_join(data.frame(a = 1:4), by = "a")
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [___coalesce(lhs.a_x, rhs.a_y) as a]
#> Join REGULAR INNER ___eq_na_matches_na(lhs.a_x, rhs.a_y)
#> Projection [a as a_x]
#> r_dataframe_scan(0x564a9752ed30)
#> Projection [a as a_y]
#> r_dataframe_scan(0x564a97540818)
#>
#> ---------------------
#> -- Result Columns --
#> ---------------------
#> - a (INTEGER)
#>
#> a
#> 1 1
#> 2 2
#> 3 3
withr::with_envvar(c(DUCKPLYR_OUTPUT_ORDER = "TRUE"), {
  data.frame(a = 3:1) %>%
    as_duckplyr_df() %>%
    inner_join(data.frame(a = 1:4), by = "a")
})
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [___coalesce(lhs.a_x, rhs.a_y) as a]
#> Order [lhs.___row_number_x ASC, rhs.___row_number_y ASC]
#> Join REGULAR INNER ___eq_na_matches_na(lhs.a_x, rhs.a_y)
#> Projection [a_x as a_x, row_number() OVER () as ___row_number_x]
#> Projection [a as a_x]
#> r_dataframe_scan(0x564a97715b38)
#> Projection [a_y as a_y, row_number() OVER () as ___row_number_y]
#> Projection [a as a_y]
#> r_dataframe_scan(0x564a977147c0)
#>
#> ---------------------
#> -- Result Columns --
#> ---------------------
#> - a (INTEGER)
#>
#> a
#> 1 3
#> 2 2
#> 3 1
# Sys.setenv(DUCKPLYR_FORCE = TRUE)
add_one <- function(x) {
  x + 1
}
data.frame(a = 3:1) %>%
  as_duckplyr_df() %>%
  mutate(b = add_one(a))
#> The duckplyr package is configured to fall back to dplyr when it
#> encounters an incompatibility. Fallback events can be collected and
#> uploaded for analysis to guide future development. By default, no data
#> will be collected or uploaded.
#> ℹ A fallback situation just occurred. The following information would
#> have been recorded:
#> {"version":"0.4.1","message":"No translation for function
#> `add_one`.","name":"mutate","x":{"...1":"integer"},"args":{"dots":{"...2":"add_one(...1)"},".by":"NULL",".keep":["all","used","unused","none"]}}
#> → Run `duckplyr::fallback_sitrep()` to review the current settings.
#> → Run `Sys.setenv(DUCKPLYR_FALLBACK_COLLECT = 1)` to enable fallback
#> logging, and `Sys.setenv(DUCKPLYR_FALLBACK_VERBOSE = TRUE)` in addition
#> to enable printing of fallback situations to the console.
#> → Run `duckplyr::fallback_review()` to review the available reports, and
#> `duckplyr::fallback_upload()` to upload them.
#> ℹ See `?duckplyr::fallback()` for details.
#> ℹ This message will be displayed once every eight hours.
#> a b
#> 1 3 4
#> 2 2 3
#> 3 1 2
try(withr::with_envvar(c(DUCKPLYR_FORCE = "TRUE"), {
  data.frame(a = 3:1) %>%
    as_duckplyr_df() %>%
    mutate(b = add_one(a))
}))
#> Error in rel_find_call(expr[[1]], env) :
#> No translation for function `add_one`.
# Sys.setenv(DUCKPLYR_FALLBACK_INFO = TRUE)
withr::with_envvar(c(DUCKPLYR_FALLBACK_INFO = "TRUE"), {
  data.frame(a = 3:1) %>%
    as_duckplyr_df() %>%
    mutate(b = add_one(a))
})
#> Error processing with relational.
#> Caused by error in `rel_find_call()` at duckplyr/R/translate.R:131:3:
#> ! No translation for function `add_one`.
#> a b
#> 1 3 4
#> 2 2 3
#> 3 1 2
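The examples above scope each environment variable with `withr::with_envvar()`. Where withr is not available, a similar effect can be sketched in base R. `with_env_flag()` below is a hypothetical helper, not part of duckplyr or withr; it sets one variable, evaluates the code, and restores the previous state on exit.

```r
# Hypothetical helper: evaluate `code` with one environment variable set.
with_env_flag <- function(name, value, code) {
  # Remember the previous value; NA means the variable was not set.
  old <- Sys.getenv(name, unset = NA)
  on.exit({
    if (is.na(old)) {
      Sys.unsetenv(name)
    } else {
      do.call(Sys.setenv, stats::setNames(list(old), name))
    }
  }, add = TRUE)

  # Set the variable, then evaluate the lazily passed code.
  do.call(Sys.setenv, stats::setNames(list(as.character(value)), name))
  force(code)
}

before <- Sys.getenv("DUCKPLYR_FORCE", unset = NA)

# Inside the call the variable is visible as the string "TRUE".
inside <- with_env_flag("DUCKPLYR_FORCE", TRUE, Sys.getenv("DUCKPLYR_FORCE"))

# Afterwards the previous state is back in place.
after <- Sys.getenv("DUCKPLYR_FORCE", unset = NA)
```

Because `code` is a lazily evaluated argument, it only runs at `force(code)`, after the variable has been set. A duckplyr pipeline passed as `code` would therefore see the temporary value.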