Let be a set of B-trees, with the largest tree height and the shortest tree height , and be a set of separator keys, where . The join operation can be performed in I/O span and I/O work.
Leads to a span-inefficient solution
Union with above primitive leads to I/O Span
B-way Join Improved
Grouping
Fuse within Groups
Divide Large Nodes
Divide Smaller Nodes
Concatenate and Rebalance
Fuse within Groups
Do not recurse into subgroups of tall trees
Fuse keys and nodes at their heights directly
Maintain list of pointers along the left and right spines
Finding the node a root merges with takes I/O span
Dividing Large Nodes
Divide node if contains keys
Any node contains at most keys
After, each node contains at most keys
Each subsequent dividing of nodes will push up at most 1 key
Dividing Smaller Nodes
Multi-way Join I/O Span and Work
Theorem 2
We can join B-trees and keys together in parallel with I/O span and I/O work, where is in the Fork-Join I/O model.
Multi-way Split I/O Span and Work
Theorem 3
We can split a B-tree by keys in parallel with I/O span and I/O work, where is in the Fork-Join I/O Model.
Union Operation
Divide each tree into sections
Use the Multi-way Split operation
Perform a union on each pair of subtrees
Use Multi-way Join to combine all elements in the tree
Minor modifications may be made to Multi-way Join to get Intersection and Difference operations
Union Operation
Theorem 4
Given two B-trees with sizes and , there exists a parallel algorithm that returns a new B-tree containing the union, intersection, and set difference of the two input trees in and I/O work, I/O span, where is the block size.