This is a method for the dplyr::inner_join() generic.
See the "Fallbacks" section for differences in implementation.
An inner_join() only keeps observations from x
that have a matching key in y.
Usage
# S3 method for class 'duckplyr_df'
inner_join(
  x,
  y,
  by = NULL,
  copy = FALSE,
  suffix = c(".x", ".y"),
  ...,
  keep = NULL,
  na_matches = c("na", "never"),
  multiple = "all",
  unmatched = "drop",
  relationship = NULL
)
Arguments
- x, y
- A pair of data frames, data frame extensions (e.g. a tibble), or lazy data frames (e.g. from dbplyr or dtplyr). See Methods, below, for more details. 
- by
- A join specification created with join_by(), or a character vector of variables to join by.
  If NULL, the default, *_join() will perform a natural join, using all variables in common across x and y. A message lists the variables so that you can check they're correct; suppress the message by supplying by explicitly.
  To join on different variables between x and y, use a join_by() specification. For example, join_by(a == b) will match x$a to y$b.
  To join by multiple variables, use a join_by() specification with multiple expressions. For example, join_by(a == b, c == d) will match x$a to y$b and x$c to y$d. If the column names are the same between x and y, you can shorten this by listing only the variable names, like join_by(a, c).
  join_by() can also be used to perform inequality, rolling, and overlap joins. See the documentation at ?join_by for details on these types of joins.
  For simple equality joins, you can alternatively specify a character vector of variable names to join by. For example, by = c("a", "b") joins x$a to y$a and x$b to y$b. If variable names differ between x and y, use a named character vector like by = c("x_a" = "y_a", "x_b" = "y_b").
  To perform a cross-join, generating all combinations of x and y, see cross_join().
- copy
- If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.
- suffix
- If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.
- ...
- Other parameters passed onto methods. 
- keep
- Should the join keys from both x and y be preserved in the output?
- If NULL, the default, joins on equality retain only the keys from x, while joins on inequality retain the keys from both inputs.
- If TRUE, all keys from both inputs are retained.
- If FALSE, only keys from x are retained. For right and full joins, the data in key columns corresponding to rows that only exist in y are merged into the key columns from x. Can't be used when joining on inequality conditions.
 
- na_matches
- Should two NA or two NaN values match?
- multiple
- Handling of rows in x with multiple matches in y. For each row of x:
- "all", the default, returns every match detected in y. This is the same behavior as SQL.
- "any" returns one match detected in y, with no guarantees on which match will be returned. It is often faster than "first" and "last" if you just need to detect if there is at least one match.
- "first" returns the first match detected in y.
- "last" returns the last match detected in y.
 
- unmatched
- How should unmatched keys that would result in dropped rows be handled?
- "drop" drops unmatched keys from the result.
- "error" throws an error if unmatched keys are detected.
  unmatched is intended to protect you from accidentally dropping rows during a join. It only checks for unmatched keys in the input that could potentially drop rows.
- For left joins, it checks y.
- For right joins, it checks x.
- For inner joins, it checks both x and y. In this case, unmatched is also allowed to be a character vector of length 2 to specify the behavior for x and y independently.
 
- relationship
- Handling of the expected relationship between the keys of x and y. If the expectations chosen from the list below are invalidated, an error is thrown.
- NULL, the default, doesn't expect there to be any relationship between x and y. However, for equality joins it will check for a many-to-many relationship (which is typically unexpected) and will warn if one occurs, encouraging you to either take a closer look at your inputs or make this relationship explicit by specifying "many-to-many".
  See the Many-to-many relationships section for more details.
- "one-to-one" expects:
  - Each row in x matches at most 1 row in y.
  - Each row in y matches at most 1 row in x.
- "one-to-many" expects:
  - Each row in y matches at most 1 row in x.
- "many-to-one" expects:
  - Each row in x matches at most 1 row in y.
- "many-to-many" doesn't perform any relationship checks, but is provided to allow you to be explicit about this relationship if you know it exists.
  relationship doesn't handle cases where there are zero matches. For that, see unmatched.
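The by, suffix, keep, and relationship arguments described above can be sketched on two small tables. This is a minimal illustration, not part of the package documentation: the tables orders and customers and their columns are hypothetical, and plain dplyr is attached here because these arguments belong to the generic.

```r
library(dplyr)

# Hypothetical toy tables, invented for illustration only.
orders <- tibble(
  order_id = 1:4,
  cust = c("a", "b", "b", "z"),
  amount = c(10, 20, 30, 40)
)
customers <- tibble(
  id = c("a", "b", "c"),
  amount = c(100, 200, 300) # deliberately clashes with orders$amount
)

# Differently named keys: match orders$cust to customers$id.
# Only orders with a matching customer are kept; the clashing
# non-key column gets the default ".x" / ".y" suffixes.
res <- inner_join(orders, customers, by = join_by(cust == id))

# Same join with custom suffixes, keeping the key columns of both inputs.
res_keep <- inner_join(
  orders, customers,
  by = join_by(cust == id),
  suffix = c("_order", "_credit"),
  keep = TRUE
)

# Declaring the expected relationship: several orders may share a customer,
# so "many-to-one" holds, while "one-to-one" is violated and raises an error.
ok <- inner_join(
  orders, customers,
  by = join_by(cust == id),
  relationship = "many-to-one"
)
bad <- tryCatch(
  inner_join(
    orders, customers,
    by = join_by(cust == id),
    relationship = "one-to-one"
  ),
  error = function(e) e
)
```

Here the order with cust "z" and the customer with id "c" have no partner, so neither appears in res.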
Fallbacks
There is no DuckDB translation in inner_join.duckplyr_df()
- for an implicit cross join,
- for a value of the multiple argument that isn't the default "all",
- for a value of the unmatched argument that isn't the default "drop".
These features fall back to dplyr::inner_join(); see vignette("fallback") for details.
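As a rough sketch of what this means in practice, the calls below reuse the band_members and band_instruments tables from the Examples section. Which engine actually runs each call is inferred only from the list above, not verified here; the returned rows should follow dplyr semantics in either case.

```r
library(duckplyr)

# Default arguments with an explicit key: none of the listed
# fallback conditions applies to this call.
inner_join(band_members, band_instruments, by = "name")

# Non-default `multiple`: listed above as having no DuckDB translation,
# so this call is expected to be handled by dplyr::inner_join().
first_match <- inner_join(
  band_members, band_instruments,
  by = "name",
  multiple = "first"
)

# Non-default `unmatched`: also listed as a fallback case.
# Mick (in x) and Keith (in y) have no partner, so an error is raised;
# it is captured here instead of stopping the script.
err <- tryCatch(
  inner_join(
    band_members, band_instruments,
    by = "name",
    unmatched = "error"
  ),
  error = function(e) e
)
inherits(err, "error")
```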
Examples
library(duckplyr)
inner_join(band_members, band_instruments)
#> Joining with `by = join_by(name)`
#> # A tibble: 2 × 3
#>   name  band    plays 
#>   <chr> <chr>   <chr> 
#> 1 John  Beatles guitar
#> 2 Paul  Beatles bass  
