Skip to content

Optimizer should bias to prefer RightSemi over LeftSemi #22931

@neilconway

Description

@neilconway

Is your feature request related to a problem or challenge?

To evaluate a semi join, we support two orientations: LeftSemi or RightSemi (analogously for anti and mark joins; I'll just refer to semijoins here to simplify the discussion). Under RightSemi, we build the non-preserved ("filter") input and stream the preserved input; these are swapped for LeftSemi. While it might seem like these two orientations are symmetrical, there are actually significant differences in evaluation behavior between them:

  • The build-side hash table has to be resident in memory; all else being equal, building the smaller join input is a good general rule, and that's the main rule we follow today.
  • RightSemi only needs to store the join keys for the build side; LeftSemi needs to store wider rows. By definition, the consumer of a semijoin can't be interested in any values from the filter side of the join. So even if the filter side has more rows than the preserved side, building the hash table on the filter side might still require less memory.
  • RightSemi preserves the partitioning of the preserved input, whereas LeftSemi + CollectLeft emits with UnknownPartitioning.
  • RightSemi works better with dynamic filter pushdown: I don't know the dynamic filter code super well, but I'd imagine that since RightSemi builds the filter side before streaming the preserved side, that gives us more information we can use to push down filters into the preserved-side scan.

Two additional factors that might change:

  • RightSemi only needs to build on distinct values from the non-preserved side. In the future, we can optimize RightSemi to discard duplicate build-side rows. We don't do that today but we might in the future (Consider optimizing RightSemi to eliminate dups from build side #22930)
  • RightSemi allows emitting join results incrementally: as we see each probe row, we can immediately determine if it should be output or not. Whereas LeftSemi consumes the entire non-preserved side, marking which of the preserved-side rows matched, and only at the end of the non-preserved input stream can we do a pass over the matched bitmap to determine which preserved-side rows to emit. This is not fundamental though; probably worth fixing LeftSemi to emit incrementally (Consider optimizing LeftSemi to emit output incrementally #22929)

The current optimizer rules don't reflect this:

  • LeftSemi and RightSemi are considered symmetrically; whichever semijoin input is predicted to be smaller is placed on the build side
  • If there are absent stats, LeftSemi is chosen

I think revising these rules as follows would make more sense:

  • Prefer RightSemi over LeftSemi, unless the non-preserved input is k times larger than the preserved input. Choosing k is a bit arbitrary, but a value in the range of 2-4 seems reasonable.
  • If there are absent stats, prefer RightSemi

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request
No fields configured for Feature.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions