Introduction
An initial Java implementation of the Adaptive Radix Tree (ART) data structure as discussed in the paper “The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases” by Viktor Leis.
In this post, we’ll cover:
- the key break down for navigating the tree.
- the different node type structures.
The source can be found on GitHub
Implementation
Keys
The long key breaks down into 1-byte / 8-bit chunks, with each byte identifying the path at each level.
Let’s consider the long value: 100000000779682635.
This is equivalent to 8 x 1-byte chunks: 1,99,69,120,140,3,3,75.
Isolating this in Java can be done simply with a bit shift, and a mask (255L) to remove the bits to the left that are unwanted:
100000000779682635 >>> 8 & 255L
The above shift removes the first byte and would return the value 3 (which is the 3 left of 75).
Now this approach can be applied to navigating the radix tree, where each node has a level/depth that drives bit shift. The bit shift extracts the key (0-255) for that level.
Let us work through the first three levels:
-
The root node has a depth-level 7 and so needs a 56 (7*8) bit shift extract the key for depth-level 7:
100000000779682635 >>> 56 & 255L = 1
So the pointer to the next node (depth-level 6) is placed in index position 1.
-
The depth-level 6 node needs a 48 (6*8) bit shift to extract the key for depth-level 6:
100000000779682635 >>> 48 & 255L = 99
So the pointer to the next node (depth-level 5) is placed in index position 99.
-
The depth-level 5 node needs a 40 (5*8) bit shift to extract the key for depth-level 5:
100000000779682635 >>> 40 & 255L = 69
So the pointer to the next node (depth-level 4) is placed in index position 69. And so on…
Node Types
The internal structure for the four node types.
ArtNode4
short[] keys = new short[4]
Object[] nodes = new Object[4]
The keys array holds the key value for the pointer, the nodes array holds the reference to the next node. The key and node align by their index position in their respective arrays.
ArtNode16
short[] keys = new short[16];
Object[] nodes = new Object[16];
The keys array holds the key value for the pointer, the nodes array holds the reference to the child node. The key and node align by their index position in their respective arrays.
Two implementations of ArtNode16 were built:
- one where the keys (and nodes) were unordered, and
- another where the keys (and nodes) were ordered.
As expected, each had different performance characteristics, the unordered (*Linear) implementation performed better on get(), put() and remove(). While the ordered (*BinarySearch) implementation performed better on getCeiling(), getFloor(), getFirst() and getLast().
ArtNode48
short[] keys = new short[256];
Object[] nodes = new Object[48];
The keys array holds the pointer to the nodes, where the index position represents the key value. The nodes array holds the reference to the child node.
ArtNode256
Object[] nodes = new Object[256];
The nodes array holds the reference the child node, where the index positions represents the key value.
Next
In the next post, let’s run through the ordered and unordered ArtNode16 implementations.