The behavior of duckplyr can be fine-tuned with one option and several environment variables.

Options

duckdb.materialize_message: Set to FALSE to turn off diagnostic output from duckdb on data frame materialization. Currently set to TRUE when duckplyr is loaded.
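
As a minimal sketch, the option can be changed and later restored using only base R. The variable names `old` and `during` are illustrative and not part of duckplyr.

```r
# Turn off the materialization message, remembering the previous value.
old <- options(duckdb.materialize_message = FALSE)

# The option now reads FALSE for the rest of the session.
during <- getOption("duckdb.materialize_message")

# Restore whatever value was in place before (for example, at the end of a script).
options(old)
```

For a temporary change around a single expression, a scoped helper such as the `rlang::with_options()` call used in the examples below avoids the manual restore step.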

Environment variables

DUCKPLYR_OUTPUT_ORDER: If TRUE, row output order is preserved. By default, duckplyr may change the row order in cases where dplyr would keep it stable.

DUCKPLYR_FORCE: If TRUE, fail with an error if duckdb cannot handle a request, instead of falling back to dplyr.

DUCKPLYR_FALLBACK_INFO: If TRUE, print a message when a fallback to dplyr occurs because duckdb cannot handle a request.

DUCKPLYR_CHECK_ROUNDTRIP: If TRUE, check if all columns are roundtripped perfectly when creating a relational object from a data frame. This is slow, and mostly useful for debugging. The default is to check roundtrip of attributes.

DUCKPLYR_EXPERIMENTAL: If TRUE, pass experimental = TRUE to certain duckdb functions. Currently unused.

DUCKPLYR_METHODS_OVERWRITE: If TRUE, call methods_overwrite() when the package is loaded.

See fallback for more options related to logging and uploading of fallback events.
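
A short sketch of setting and clearing one of these variables for the current session with base R follows. Environment variables are stored as strings, so `Sys.setenv()` converts the logical `TRUE` to `"TRUE"`. The variable names `fallback_info` and `after` are illustrative only.

```r
# Enable fallback messages for the current R session.
# Sys.setenv() coerces its values to character, so TRUE is stored as "TRUE".
Sys.setenv(DUCKPLYR_FALLBACK_INFO = TRUE)
fallback_info <- Sys.getenv("DUCKPLYR_FALLBACK_INFO")

# Remove the variable again to return to the default behavior.
Sys.unsetenv("DUCKPLYR_FALLBACK_INFO")
after <- Sys.getenv("DUCKPLYR_FALLBACK_INFO", unset = NA)
```

Since DUCKPLYR_METHODS_OVERWRITE takes effect when the package is loaded, it presumably needs to be set before `library(duckplyr)` is called, for instance in a startup file such as `~/.Renviron`.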

Examples

# options(duckdb.materialize_message = FALSE)
data.frame(a = 3:1) %>%
  as_duckplyr_df() %>%
  inner_join(data.frame(a = 1:4), by = "a")
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [___coalesce(lhs.a_x, rhs.a_y) as a]
#>   Join REGULAR INNER ___eq_na_matches_na(lhs.a_x, rhs.a_y)
#>     Projection [a as a_x]
#>       r_dataframe_scan(0x564a970a70e0)
#>     Projection [a as a_y]
#>       r_dataframe_scan(0x564a970b7d78)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (INTEGER)
#> 
#>   a
#> 1 1
#> 2 2
#> 3 3

rlang::with_options(duckdb.materialize_message = FALSE, {
  data.frame(a = 3:1) %>%
    as_duckplyr_df() %>%
    inner_join(data.frame(a = 1:4), by = "a") %>%
    print()
})
#>   a
#> 1 1
#> 2 2
#> 3 3

# Sys.setenv(DUCKPLYR_OUTPUT_ORDER = TRUE)
data.frame(a = 3:1) %>%
  as_duckplyr_df() %>%
  inner_join(data.frame(a = 1:4), by = "a")
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [___coalesce(lhs.a_x, rhs.a_y) as a]
#>   Join REGULAR INNER ___eq_na_matches_na(lhs.a_x, rhs.a_y)
#>     Projection [a as a_x]
#>       r_dataframe_scan(0x564a9752ed30)
#>     Projection [a as a_y]
#>       r_dataframe_scan(0x564a97540818)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (INTEGER)
#> 
#>   a
#> 1 1
#> 2 2
#> 3 3

withr::with_envvar(c(DUCKPLYR_OUTPUT_ORDER = "TRUE"), {
  data.frame(a = 3:1) %>%
    as_duckplyr_df() %>%
    inner_join(data.frame(a = 1:4), by = "a")
})
#> materializing:
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [___coalesce(lhs.a_x, rhs.a_y) as a]
#>   Order [lhs.___row_number_x ASC, rhs.___row_number_y ASC]
#>     Join REGULAR INNER ___eq_na_matches_na(lhs.a_x, rhs.a_y)
#>       Projection [a_x as a_x, row_number() OVER () as ___row_number_x]
#>         Projection [a as a_x]
#>           r_dataframe_scan(0x564a97715b38)
#>       Projection [a_y as a_y, row_number() OVER () as ___row_number_y]
#>         Projection [a as a_y]
#>           r_dataframe_scan(0x564a977147c0)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (INTEGER)
#> 
#>   a
#> 1 3
#> 2 2
#> 3 1

# Sys.setenv(DUCKPLYR_FORCE = TRUE)
add_one <- function(x) {
  x + 1
}

data.frame(a = 3:1) %>%
  as_duckplyr_df() %>%
  mutate(b = add_one(a))
#> The duckplyr package is configured to fall back to dplyr when it
#> encounters an incompatibility. Fallback events can be collected and
#> uploaded for analysis to guide future development. By default, no data
#> will be collected or uploaded.
#>  A fallback situation just occurred. The following information would
#>   have been recorded:
#>   {"version":"0.4.1","message":"No translation for function
#>   `add_one`.","name":"mutate","x":{"...1":"integer"},"args":{"dots":{"...2":"add_one(...1)"},".by":"NULL",".keep":["all","used","unused","none"]}}
#> → Run `duckplyr::fallback_sitrep()` to review the current settings.
#> → Run `Sys.setenv(DUCKPLYR_FALLBACK_COLLECT = 1)` to enable fallback
#>   logging, and `Sys.setenv(DUCKPLYR_FALLBACK_VERBOSE = TRUE)` in addition
#>   to enable printing of fallback situations to the console.
#> → Run `duckplyr::fallback_review()` to review the available reports, and
#>   `duckplyr::fallback_upload()` to upload them.
#>  See `?duckplyr::fallback()` for details.
#>  This message will be displayed once every eight hours.
#>   a b
#> 1 3 4
#> 2 2 3
#> 3 1 2

try(withr::with_envvar(c(DUCKPLYR_FORCE = "TRUE"), {
  data.frame(a = 3:1) %>%
    as_duckplyr_df() %>%
    mutate(b = add_one(a))
}))
#> Error in rel_find_call(expr[[1]], env) : 
#>   No translation for function `add_one`.

# Sys.setenv(DUCKPLYR_FALLBACK_INFO = TRUE)
withr::with_envvar(c(DUCKPLYR_FALLBACK_INFO = "TRUE"), {
  data.frame(a = 3:1) %>%
    as_duckplyr_df() %>%
    mutate(b = add_one(a))
})
#> Error processing with relational.
#> Caused by error in `rel_find_call()` at duckplyr/R/translate.R:131:3:
#> ! No translation for function `add_one`.
#>   a b
#> 1 3 4
#> 2 2 3
#> 3 1 2