Previous slide Next slide Toggle fullscreen Open presenter view
Theoretical Parallel Algorithms and Data Structures
Ryuto Kitagawa
Committee: Michael Goodrich (Chair), Michael Dillencourt, Michael Shindler
University of California, Irvine
Ryuto Kitagawa UCI November December 1, 2025
Goal : Increase efficiency with more threads
Binary Fork-Join Model
Assume each processor shares memory
Threads may perform a fork operation
Later performs a join operation for sync
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Represent algorithm's computation as a DAG
Each node represents tasks, each edge is a dependency
Span : Longest path within the graph
Work : Total number of nodes in the graph
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Parallel Peeling IBLT in One/Two Rounds
IBLT
Peeling Analysis
Dynamic Accountable Storage
Accountable Storage Protocol
IBLT Tree
Joinable B-Tree
Fork-Join I/O Model
Multi-way Join
Ryuto Kitagawa UCI November December 1, 2025
Parallel Peeling IBLT in One/Two Rounds
Ryuto Kitagawa UCI November December 1, 2025
Invertible Bloom Lookup Table (IBLT)
Ryuto Kitagawa UCI November December 1, 2025
Invertible Bloom Lookup Table (IBLT)
Applications:
Set Reconciliation of Distributed Computing
Block Chain Reconciliation
Network Reconciliation
Ryuto Kitagawa UCI November December 1, 2025
Invertible Bloom Lookup Table (IBLT)
Will be focusing on insertion and peeling
Find difference operation simulated by pure insertions
Peeling : Listing all elements from the IBLT destructively
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Analysis Jiang, Mitzenmacher, and Thaler
rounds with hash functions
Probability and round analysis relies on a constant
Ryuto Kitagawa UCI November December 1, 2025
Constant Round Peeling Setup
hash functions
size of IBLT
Partition IBLT into subtables
Prevents same element collision
Ryuto Kitagawa UCI November December 1, 2025
Constant Round Peeling Setup
Ryuto Kitagawa UCI November December 1, 2025
Probability of 1 Round Peeling
Theorem 1
For a table of cells using hash functions, where is a constant, an IBLT peels in one round with probability at least where
An element is unpeelable if all locations have collisions
Collisions occur in a subtable with probability at most
Across all subtables:
Union bound across all elements gives
Ryuto Kitagawa UCI November December 1, 2025
Probability of 2 Round Peeling
Theorem 2
For a table of cells using hash functions, where is a constant, an IBLT peels in two rounds with probability at least
Prior proof analyzed the probability of a cell being single
Ryuto Kitagawa UCI November December 1, 2025
Probability of 2 Round Peeling
with probability
Probability of not being peeled in round =
Probability of not peeled in round from subtable =
Dependencies on other subtables complicates the analysis
Ryuto Kitagawa UCI November December 1, 2025
Dependencies on Subtables
Round: 1, 2
Round: 3, 4
Round: 5
Ryuto Kitagawa UCI November December 1, 2025
Probability of Round Peeling
Theorem 3
For a table of cells using hash functions, where is a positive integer constant and is a constant, an IBLT peels in rounds with probability at least
Ryuto Kitagawa UCI November December 1, 2025
Dynamic Accountable Storage
Ryuto Kitagawa UCI November December 1, 2025
Entrust our data to cloud storage providers
Amazon S3 advertises losing of one out of 100 billion objects per year
S3 currently stores hundreds of trillions of data
When data is lost or corrupted, we want a method of recovery
Ryuto Kitagawa UCI November December 1, 2025
Prior Work: Static Accountable Storage
Ateniese et al. developed a static protocol
Recover data with IBLT, charge for lost data
Homomorphic Tags
Ties blocks of data to tags
Users may verify server has all data they claim to have
Segment Tree
Quickly generate IBLT from server's stored data
Ryuto Kitagawa UCI November December 1, 2025
Accountable Storage Framework: Setup
Ryuto Kitagawa UCI November December 1, 2025
Accountable Storage Framework: Accountability Challenge
Ryuto Kitagawa UCI November December 1, 2025
Issues with Static Accountable Storage
Client data is often dynamic in the real world
Model assumes a malicious server
Unrealistic since users are already entrusting the service with their data
Simple, unavoidable method of thwarting the protocol
Ryuto Kitagawa UCI November December 1, 2025
Thwarting Static Accountable Storage
Static Accountable Storage charges server based on recovered data
If data unrecoverable, assume the worst and charge for everything lost
Malicious server avoids significant amounts of responsibility
Generate IBLT at the start
Server recovers any lost data
Server would only be punished when data is unrecoverable
Ryuto Kitagawa UCI November December 1, 2025
Work
Cryptographic Model
Client Space
Server Overhead
Supported Operations
Ateniese et al. (2017)
Malicious
Init, ClientAudit
Our Work
Honest But Curious
Init, Put, Delete, ClientAudit, ServerAudit
Ryuto Kitagawa UCI November December 1, 2025
IBLT Tree: BST Dynamic "Segment Tree"
Bucket leaves, saves significant amount of space
Let be the size of the bucket per leaf node
Leaf nodes contain an IBLT of elements
Internal nodes contain the union of their children
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Space:
Time cost depends on BST implementation
Protected Cartesian Tree as implementation
Initialize Tree:
Insertion/Deletion:
Construct IBLT:
Ryuto Kitagawa UCI November December 1, 2025
Dynamic Accountable Storage Framework
Ryuto Kitagawa UCI November December 1, 2025
Dynamic Accountable Storage Framework
Ryuto Kitagawa UCI November December 1, 2025
Dynamic Accountable Storage Framework
Ryuto Kitagawa UCI November December 1, 2025
Dynamic vs Static Accountable Storage Cost
Work
Client Space
Server Overhead
Init Time
Put Server Time
Delete Server Time
Ateniese et al. (2017)
N/A
N/A
Our Work
Our Work
Ryuto Kitagawa UCI November December 1, 2025
Dynamic vs Static Accountable Storage Cost
Work
Audit Client Time
Audit Server Time
Audit Proof Size
Ateniese et al. (2017)
Our Work
Our Work
Ryuto Kitagawa UCI November December 1, 2025
Parallel Joinable B-Tree in Fork-Join I/O Model
Ryuto Kitagawa UCI November December 1, 2025
Parallel Computational Model for Cached Memory
Modern computers are typically bottlenecked by memory accesses
Captured sequentially by the External Memory Model
PRAM has the parallel equivalent, PEM
Binary Fork Join model does not
Ryuto Kitagawa UCI November December 1, 2025
Each fork and join spawns and syncs at most threads
Each thread may access a block of data in a single memory access
Ryuto Kitagawa UCI November December 1, 2025
Join-based framework
Supports efficient set operations
Core methods are Join and Split
Ryuto Kitagawa UCI November December 1, 2025
Multi-way cache efficient self-balancing tree
Each node in the tree contains keys and children
Focusing on B-Way Join
Extended to the Multi-Way Join
Ryuto Kitagawa UCI November December 1, 2025
General steps for the algorithm
Grouping
Recurse and Subtree Replace
Concatenate and Rebalance
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Theorem 4
Let be a set of B-trees, with the largest tree height and the shortest tree height , and be a set of separator keys, where . The join operation can be performed in I/O span and I/O work.
Leads to a span-inefficient solution
Union with above primitive leads to I/O Span
Ryuto Kitagawa UCI November December 1, 2025
Grouping
Fuse within Groups
Divide Large Nodes
Divide Smaller Nodes
Concatenate and Rebalance
Ryuto Kitagawa UCI November December 1, 2025
Recall the grouping of tall trees
Do not recurse into subgroups
Fuse keys and nodes at their heights directly
Ryuto Kitagawa UCI November December 1, 2025
Maintain list of pointers along the left and right spines
Finding the node a root merges with takes I/O span
Ryuto Kitagawa UCI November December 1, 2025
Suppose a node contains keys
When create nodes
Push keys to parent node
After, each node contains at most keys
Each subsequent dividing of nodes will push up at most 1 key
Ryuto Kitagawa UCI November December 1, 2025
Ryuto Kitagawa UCI November December 1, 2025
Multi-way Join I/O Span and Work
Theorem 5
We can join B-trees and keys together in parallel with I/O span and I/O work, where is in the Fork-Join I/O model.
Ryuto Kitagawa UCI November December 1, 2025
Multi-way Split I/O Span and Work
Theorem 6
We can split a B-tree by keys in parallel with I/O span and I/O work, where is in the Fork-Join I/O Model.
Ryuto Kitagawa UCI November December 1, 2025
Divide each tree into sections
Use the Multi-Split operation
Perform a union on each pair of subtrees
Use Multi-Join to combine all elements in the tree
Minor modifications may be made to Multi-Join to get Intersection and Difference operations
Ryuto Kitagawa UCI November December 1, 2025
Theorem 7
Given two B-trees with sizes and , there exists a parallel algorithm that returns a new B-tree containing the union, intersection, and set difference of the two input trees in and I/O work, I/O span, where is the block size.
Ryuto Kitagawa UCI November December 1, 2025
Turning to
Removing factor entirely
Apply algorithm to adaptive sorting settings
Use Fork-Join I/O model on other algorithms
Ryuto Kitagawa UCI November December 1, 2025
Thank you!
Ryuto Kitagawa UCI November December 1, 2025
---
<!-- header: 'Introduction'
- Parallel Algorithms: 1950s
<img src="Attachments/examples-of-parallel.svg" alt="center" style="height: 512px;">
---
<!-- header: 'Background: Concurrent Memory Access'
<img src="Attachments/memory-conflicts.svg" alt="center" style="height: 32em;">
---
<!-- header: 'Background: Exclusive Read or Write'
<img src="Attachments/exclusive-memory.svg" alt="center" style="height: 32em;">
---
<!-- header: 'Background: Concurrent Read'
<img src="Attachments/concurrent-read.svg" alt="center" style="height: 32em;">
<!-- header: 'Example: Summation'
<img src="Attachments/prefix-sum-0.svg" alt="center" style="height: 32em;">
---
<!-- paginate: true
<!-- header: 'Example: Prefix Sum'
<img src="Attachments/prefix-sum-2.svg" alt="center" style="height:512px;">
---
<!-- paginate: hold
<img src="Attachments/prefix-sum-3.svg" alt="center" style="height:512px;">
---
<img src="Attachments/prefix-sum-4.svg" alt="center" style="height:512px;">
---
<img src="Attachments/prefix-sum-5.svg" alt="center" style="height:512px;">
---
<img src="Attachments/prefix-sum-6.svg" alt="center" style="height:512px;">
---
<!-- header: 'Parallel Model'
<!-- paginate: true
* Analyzing the round complexity of the operation
* Identify all the single elements in the IBLT
* Remove all single elements
---
<!-- paginate: hold
<img src="Attachments/iblt-explanation-08.svg" alt="center" style="max-width: 100%; width: 1024px;">