3.19.2015

SemiJoin & AntiJoin

Semi-Join

A semijoin is a join between two data sets that returns a row from the first set when a matching row exists in the sub-query data set.

The database stops processing the second data set at the first match.

Thus, optimization does not duplicate rows from the first data set when multiple rows in the second data set satisfy the sub-query criteria.

A semi-join avoids returning a huge number of rows when a query only needs to determine whether a match exists.

With large data sets, this optimization can result in significant time savings over a nested loops join that must loop through every record returned by the inner query for every row in the outer query.

The optimizer can apply the semijoin optimization to nested loops joins, hash joins, and sort merge joins.

The optimizer may choose a semijoin in the following circumstances:
  • The statement uses either an IN or EXISTS clause.
  • The statement contains a subquery in the IN or EXISTS clause.
  • The IN or EXISTS clause is not contained inside an OR branch.


Anti-Join

An antijoin is a join between two data sets that returns a row from the first set when a matching row does not exist in the subquery data set.

Like a semijoin, an antijoin stops processing the subquery data set when the first match is found.

Unlike a semijoin, the antijoin only returns a row when no match is found.

An antijoin avoids unnecessary processing when a query only needs to return a row when a match does not exist.

With large data sets, this optimization can result in significant time savings over a nested loops join that must loop through every record returned by the inner query for every row in the outer query.

The optimizer can apply the antijoin optimization to nested loops joins, hash joins, and sort merge joins.

The optimizer may choose an anti-join in the following circumstances:
  • The statement uses either the NOT IN or NOT EXISTS clause.
  • The statement has a sub-query in the NOT IN or NOT EXISTS clause.
  • The NOT IN or NOT EXISTS clause is not contained inside an OR branch.