Nested loop join

A nested loop join is a naive algorithm that joins two sets by using two nested loops. Join operations are important to database management.

Algorithm

Two relations R and S are joined as follows:

  For each tuple r in R do
     For each tuple s in S do
        If r and s satisfy the join condition
           Then output the tuple <r,s>

This algorithm will involve nr*bs+ br block transfers and nr+br seeks, where br and bs are number of blocks in relations R and S respectively, and nr is the number of tuples in relation R.

The algorithm runs in O(|R||S|) I/Os, where |R| and |S| is the number of tuples contained in R and S respectively. Can easily be generalized to join any number of relations.

The block nested loop join algorithm is a generalization of the simple nested loops algorithm that takes advantage of additional memory to reduce the number of times that the S relation is scanned.

Improved version

The algorithm can be improved without requesting additional memory blocks to involve only br*bs+ br block transfers. For each read block from R, the relation S can be read only once.

  For each block block_r in R do
     For each tuple s in S do
        For each tuple r in block_r do
           If r and s satisfy the join condition
              Then output the tuple <r,s>

Variable block_r is stored in memory, thus it is not needed to read it from disk for each tuple s.