˲ Occupancy: How many pointers to
child items the node is currently holding, out of the maximum available. For
example, if the tree-branching factor
is N, and the node is currently holding
N/2 pointers, occupancy is 50%.
˲ Height: The number of B-tree levels,
signifying how many pointers have to
be followed during lookup.
Every non leaf node in the tree holds
up to N- 1 keys (index entries), separating the tree into N subtrees that can
be located by following a corresponding pointer. Pointer i from an entry Ki
points to a subtree in which all index
entries are such that Ki- 1 <= Ksearched < Ki
(where K is a set of keys). The first and
last pointers are special cases, pointing to subtrees in which all the entries
are less than or equal to K0 in the case
of the leftmost child, or greater than
KN in the case of the rightmost child. A
leaf node may also hold a pointer to the
previous and next nodes on the same
level, forming a doubly linked list of
sibling nodes. Keys in all the nodes are
Lookups. When performing lookups, the search starts at the root node
and follows internal nodes recursively
down to the leaf level. On each level, the
search space is reduced to the child subtree (the range of this subtree includes
the searched value) by following the
child pointer. Figure 3 shows a lookup
in a B-tree making a single root-to-leaf
pass, following the pointers “between”
the two keys, one of which is greater
than (or equal to) the searched term,
and the other of which is less than the
searched term. When a point query is
performed, the search is complete after locating the leaf node. During the
range scan, the keys and values of the
found leaf, and then the sibling leaf’s
nodes, are traversed, until the end of
the range is reached.
In terms of complexity, B-trees guar-
antee log(n) lookup, because finding a
key within the node is performed using
binary search, shown in Figure 4. Bi-
nary search is easily explained in terms
of searching for words beginning with
a certain letter in the dictionary, where
all words are sorted alphabetically.
First you open the dictionary exactly
in the middle. If the searched letter
is alphabetically “less than” (appears
earlier than) the one opened, you con-
tinue your search in the left half of the
deletion: When a B-tree node is full, it
is split in two, and when the occupan-
cy of the neighboring nodes falls be-
low a certain threshold, the nodes are
merged. This also means that leaves
are equally distant from the root and
can be located using the same amount
of steps during lookup.
˲ Guarantee of logarithmic lookup
time. This makes B-trees a good choice
for database indexes, where lookup
times are important.
˲ Mutable. Inserts, updates, and deletes (also, subsequent splits and merges) are performed on disk in place.
To make in-place updates possible, a
certain amount of space overhead is
required. A B-tree can be organized as
a clustered index, where actual data is
stored on the leaf nodes or as a heap
file with an unclustered B-tree index.
This article discusses the B+tree, 5
a modern variant of the B-tree often
used for database storage. The B+tree
is different from the original B-tree3
in that it has an additional level of
linked leaf nodes holding the values,
and these values cannot be stored on
Anatomy of the B-tree. Let’s first
take a closer look at the B-tree building blocks, illustrated in Figure 2. B-trees have several node types: root,
internal, and leaf. Root (top) is the
node that has no parents (that is, it is
not a child of any other node). Internal
nodes (middle) have both a parent and
children; they connect a root node with
leaf nodes. Leaf nodes (bottom) carry
the data and have no children. Figure 2
depicts a B-tree with a branching factor
of four (four pointers, three keys in internal nodes, and four key/value pairs
B-trees are characterized by the following:
˲ Branching factor: The number (N)
of pointers to the child nodes. Along
with the pointers, root and internal
nodes hold up to N- 1 keys.
Figure 1. Example of rotation used for
balancing in binary tree.
Figure 2. Example of a B-tree.
Figure 3. Single root-to-leaf pass.
Figure 4. Binary search of a B-tree.
1 3 4 6 7 8 10 13 14 18 19 21 24 37 40 45 71