Implementation of B+ Tree in C

A B+ tree is a self-balancing tree data structure that maintains sorted data and allows searching, inserting, and deletion of data in logarithmic time. In B+ trees, the actual data is stored inside the leaf nodes and the internal nodes act as pointers to the leaf nodes. In this article, we will learn the implementation of a B+ tree in C.

B+ Tree in C

B+ Tree is an extension of the B-tree with some specific characteristics that make it particularly useful for database and filesystem implementations.

Nodes in B+ Tree

In B+ tree, all the data is pointed to by the leaf nodes. Internal nodes are only used for the pointing purposes. All the leaf nodes are in the same level.

  • Internal Nodes: These contain only keys to guide the search but not actual data values. These keys are used to search for the elements inside the B+ trees. They have pointers to child nodes.
  • Leaf Nodes: These contain keys and actual data values. They are linked together to form a linked list, which allows for efficient range queries and ordered traversal.

The number of children a node can have is dependent on the order of the B+ tree.

Order of B+ Tree

The order m of a B+ tree defines the maximum number of children that an internal node can have. An internal node can have at most m children and at least m/2 childrens except the root, which can have fewer.

The root node of the B+ tree must have a minimum of two children and must contain at least one key.

Representation of B+ Tree in C

To represent a node of a B+ tree in C, we will define a structure with the following data members:

typedef struct Node {
    int* keys;
    int t; 
    struct Node** children;
    int n; 
    bool leaf; 
    struct Node* next; 
} Node;

Implementation of B+ Tree in C

We will implement the B+ tree with its basic operations. Following are some of the basic operations that are required to work with the B+ Tree data structure:

Operation

Description

Time Complexity

Space Complexity

Insert

Adds a new element into the B+ tree and maintains the tree property.

O(log n)

O(1)

Delete Key

Removes an element from the B+ tree.

O(log n)

O(1)

Search

Searches for an element in the B+ tree using the key.

O(log n)

O(1)

Split Child

Splits a full child node during the insertion.

O(m)

O(1)

Merge Node

Merges two nodes during deletion if necessary.

O(m)

O(1)

Display

Traverses the whole B+ tree and prints all the elements.

O(n)

O(1)

where:

  • n: denotes the number of nodes in the B+ Tree.
  • m: denotes the minium order of the tree.

Algorithm for B+ Tree Insert Implementation

Below is the algorithm for the insert function of the B+ tree:

  1. Check if the root node consists of at least two childrens.
  2. Continue traversing the tree until reaching a leaf node.
  3. If the leaf node is not full , insert the element into the key in increasing order.
  4. If the leaf node is full split the leaf node.
  5. After splitting the leaf node, promote the middle key to the parent node.
    1. If the parent node has space, insert the promoted key in the correct position.
    2. If the parent node is full, recursively split the parent node and promote the middle key to it’s parent.
    3. Keep iterating through this procedure until you reach the root node.
  6. If the root node is full, recursively repeat step 2- 5.

Example

The following diagram explains the workflow of inserting a new node in B+ tree:

STEP 1 – Let us take an example of seven elements :2, 4, 7, 10, 17, 21, 28 to insert into the B+ tree.

STEP 2 – Assume that the order is three, then each node must contain at least three nodes and two keys.

STEP 3 – Since we don’t have space to insert 10, which is an overflow condition in the node with three elements, we will split the middle element and then insert it.

STEP 4 – In the same way, we will insert 17 after 10 in the third node.

STEP 4 – Since we don’t have space after 17 to insert 21 in the above third node, we will split 10 from the last leaf node to insert 21 by placing 10 after 7 in the root node.

STEP 5 – To insert 28, we need to send 17 to the above node but, there is no space to insert 17 so, we need to split 7 to place 17 and finally insert 28.

Final B+ Tree after the insertion of all given elements

Algorithm for B+ Tree DeleteKey Implementation

The following is the algorithm to for the deletekey function of the B+ tree:

  1. Find the leaf node that contains the key value.
  2. If the key exists , remove it from the leaf node.
  3. If the leaf node has fewer than n/2 keys after deletion , check it’s sibling to determine if redistribution or merging is needed.
  4. Redistribute or merge with sibling:
    1. If the left sibling has more than ⌈n/2⌉ keys, move the largest key from the left sibling to the current leaf node.
    2. If the right sibling has more than ⌈n/2⌉ keys, move the smallest key from the right sibling to the current leaf node.
  5. Adjust the parent node:
    1. When merging, remove the key in the parent node that pointed to the merged node.
    2. If the parent node now has fewer than ⌈n/2⌉ keys, recursively perform steps 3 and 4 on the parent node.
    3. If merging involves the root node and the root becomes empty, decrease the height of the tree.

Example

The following diagram explains the workflow of inserting a new node in B+ tree:

Let’s assume we want to delete the key 70 from the tree shown in the image.

STEP 1 – First, we need to find the key 70 in the tree. In the given tree, 70 is located in the leaf node [70, 80].

STEP 2 – Remove the key 70 from the leaf node. After removal, the leaf node will contain [80].

STEP 3 – The leaf node [80] after deletion has 1 key, which is the minimum required, so no underflow occurs.

Algorithm for B+ Tree Search Implementation

The following is the algorithm to search an element into the B+ tree:

  1. Start from the root node.
  2. If the current node is a leaf:
    1. Check each key in the node. If a key matches the search key, return true.
    2. If no key matches, return false.
  3. If the current node is not a leaf node:
    1. Iterate through the keys in the node.
    2. If a key in the node is greater than the search key, recursively search the child node corresponding to the current position.
    3. If no such key is found, recursively search the rightmost child node.

Algorithm for B+ Tree Split Implementation

Below is the algorithm for the Split Child function of the B+ tree:

  1. Create a new node newChild.
  2. Identify the middle key in the child node (child) to split the node around it.
  3. Move the keys and children after the middle key from the child node to the newChild node.
  4. Place the middle key into the parent node at the suitable position.
  5. Update the parent node to point to the newChild.
  6. Retain the keys and children up to the middle key in the original child node.


Algorithm for B+ Tree MergeNode Implementation

Below is the algorithm for the Merge Node function of the B+ tree:

  1. Add the separator key from the parent node (which separates the leftNode and rightNode) to the leftNode.
  2. Move all keys and children from the rightNode to the leftNode.
  3. Remove the separator key from the parent node.
  4. Remove the pointer to the rightNode from the parent node.


Algorithm for B+ Tree Traversal

The following is the algorithm for the Display function of the B+ tree:

  1. Start from the root node.
  2. If the present node is a leaf node:
    1. Print all keys in the node.
  3. If the present node is not a leaf node:
    1. For each key in the node, recursively traverse the corresponding child node.
    2. After traversing a child node, print the current key.
    3. Traverse to the rightmost child node after the last key.

C Program to Implement B+ Tree

The following program illustrates how we can implement the B+ Tree in C:

C
// C Program to Implement B+ Tree
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define MIN_DEGREE                                         \
    3 // Minimum degree (defines the range for number of
      // keys)

typedef struct Node {
    // Array of keys
    int* keys;
    // Minimum degree (defines the range for number of keys)
    int t;
    // Array of child pointers
    struct Node** children;
    // Current number of keys
    int n;
    // To determine whether the node is leaf or not
    bool leaf;
    // Pointer to next leaf node
    struct Node* next;
} Node;

typedef struct BTree {
    // Pointer to root node
    Node* root;
    // Minimum degree
    int t;
} BTree;

// Function to create a new B+ tree node
Node* createNode(int t, bool leaf)
{
    Node* newNode = (Node*)malloc(sizeof(Node));
    newNode->t = t;
    newNode->leaf = leaf;
    newNode->keys = (int*)malloc((2 * t - 1) * sizeof(int));
    newNode->children
        = (Node**)malloc((2 * t) * sizeof(Node*));
    newNode->n = 0;
    newNode->next = NULL;
    return newNode;
}

// Function to create a new B+ tree
BTree* createBTree(int t)
{
    BTree* btree = (BTree*)malloc(sizeof(BTree));
    btree->t = t;
    btree->root = createNode(t, true);
    return btree;
}

// Function to display the B+ tree and print its keys
void display(Node* node)
{
    if (node == NULL)
        return;
    int i;
    for (i = 0; i < node->n; i++) {
        if (!node->leaf) {
            display(node->children[i]);
        }
        printf("%d ", node->keys[i]);
    }
    if (!node->leaf) {
        display(node->children[i]);
    }
}

// Function to search a key in the B+ tree
bool search(Node* node, int key)
{
    int i = 0;
    while (i < node->n && key > node->keys[i]) {
        i++;
    }
    if (i < node->n && key == node->keys[i]) {
        return true;
    }
    if (node->leaf) {
        return false;
    }
    return search(node->children[i], key);
}

// Function to split the child of a node during insertion
void splitChild(Node* parent, int i, Node* child)
{
    int t = child->t;
    Node* newChild = createNode(t, child->leaf);
    newChild->n = t - 1;

    for (int j = 0; j < t - 1; j++) {
        newChild->keys[j] = child->keys[j + t];
    }

    if (!child->leaf) {
        for (int j = 0; j < t; j++) {
            newChild->children[j] = child->children[j + t];
        }
    }

    child->n = t - 1;

    for (int j = parent->n; j >= i + 1; j--) {
        parent->children[j + 1] = parent->children[j];
    }
    parent->children[i + 1] = newChild;

    for (int j = parent->n - 1; j >= i; j--) {
        parent->keys[j + 1] = parent->keys[j];
    }
    parent->keys[i] = child->keys[t - 1];
    parent->n += 1;
}

// Function to insert a non-full node
void insertNonFull(Node* node, int key)
{
    int i = node->n - 1;

    if (node->leaf) {
        while (i >= 0 && node->keys[i] > key) {
            node->keys[i + 1] = node->keys[i];
            i--;
        }
        node->keys[i + 1] = key;
        node->n += 1;
    }
    else {
        while (i >= 0 && node->keys[i] > key) {
            i--;
        }
        i++;
        if (node->children[i]->n == 2 * node->t - 1) {
            splitChild(node, i, node->children[i]);
            if (node->keys[i] < key) {
                i++;
            }
        }
        insertNonFull(node->children[i], key);
    }
}

// Function to insert a key into the B+ tree
void insert(BTree* btree, int key)
{
    Node* root = btree->root;
    if (root->n == 2 * btree->t - 1) {
        Node* newRoot = createNode(btree->t, false);
        newRoot->children[0] = root;
        splitChild(newRoot, 0, root);
        insertNonFull(newRoot, key);
        btree->root = newRoot;
    }
    else {
        insertNonFull(root, key);
    }
}

// Function prototypes for helper functions used in
// deleteKey
void deleteKeyHelper(Node* node, int key);
int findKey(Node* node, int key);
void removeFromLeaf(Node* node, int idx);
int getPredecessor(Node* node, int idx);
void fill(Node* node, int idx);
void borrowFromPrev(Node* node, int idx);
void borrowFromNext(Node* node, int idx);
void merge(Node* node, int idx);

// Function for deleting a key from the B+ tree
void deleteKey(BTree* btree, int key)
{
    Node* root = btree->root;

    // Call a helper function to delete the key recursively
    deleteKeyHelper(root, key);

    // If root has no keys left and it has a child, make its
    // first child the new root
    if (root->n == 0 && !root->leaf) {
        btree->root = root->children[0];
        free(root);
    }
}

// Helper function to recursively delete a key from the B+
// tree
void deleteKeyHelper(Node* node, int key)
{
    int idx = findKey(
        node, key); // Find the index of the key in the node

    // If key is present in this node
    if (idx < node->n && node->keys[idx] == key) {
        if (node->leaf) {
            // If the node is a leaf, simply remove the key
            removeFromLeaf(node, idx);
        }
        else {
            // If the node is not a leaf, replace the key
            // with its predecessor/successor
            int predecessor = getPredecessor(node, idx);
            node->keys[idx] = predecessor;
            // Recursively delete the predecessor
            deleteKeyHelper(node->children[idx],
                            predecessor);
        }
    }
    else {
        // If the key is not present in this node, go down
        // the appropriate child
        if (node->leaf) {
            // Key not found in the tree
            printf("Key %d not found in the B+ tree.\n",
                   key);
            return;
        }

        bool isLastChild = (idx == node->n);

        // If the child where the key is supposed to be lies
        // has less than t keys, fill that child
        if (node->children[idx]->n < node->t) {
            fill(node, idx);
        }

        // If the last child has been merged, it must have
        // merged with the previous child

        // So, we need to recursively delete the key from
        // the previous child
        if (isLastChild && idx > node->n) {
            deleteKeyHelper(node->children[idx - 1], key);
        }
        else {
            deleteKeyHelper(node->children[idx], key);
        }
    }
}
// Function to find the index of a key in a node
int findKey(Node* node, int key)
{
    int idx = 0;
    while (idx < node->n && key > node->keys[idx]) {
        idx++;
    }
    return idx;
}

// Function to remove a key from a leaf node
void removeFromLeaf(Node* node, int idx)
{
    for (int i = idx + 1; i < node->n; ++i) {
        node->keys[i - 1] = node->keys[i];
    }
    node->n--;
}

// Function to get the predecessor of a key in a non-leaf
// node
int getPredecessor(Node* node, int idx)
{
    Node* curr = node->children[idx];
    while (!curr->leaf) {
        curr = curr->children[curr->n];
    }
    return curr->keys[curr->n - 1];
}

// Function to fill up the child node present at the idx-th
// position in the node node
void fill(Node* node, int idx)
{
    if (idx != 0 && node->children[idx - 1]->n >= node->t) {
        borrowFromPrev(node, idx);
    }
    else if (idx != node->n
             && node->children[idx + 1]->n >= node->t) {
        borrowFromNext(node, idx);
    }
    else {
        if (idx != node->n) {
            merge(node, idx);
        }
        else {
            merge(node, idx - 1);
        }
    }
}

// Function to borrow a key from the previous child and move
// it to the idx-th child
void borrowFromPrev(Node* node, int idx)
{
    Node* child = node->children[idx];
    Node* sibling = node->children[idx - 1];

    // Move all keys in child one step ahead
    for (int i = child->n - 1; i >= 0; --i) {
        child->keys[i + 1] = child->keys[i];
    }

    // If child is not a leaf, move its child pointers one
    // step ahead
    if (!child->leaf) {
        for (int i = child->n; i >= 0; --i) {
            child->children[i + 1] = child->children[i];
        }
    }

    // Setting child's first key equal to node's key[idx -
    // 1]
    child->keys[0] = node->keys[idx - 1];

    // Moving sibling's last child as child's first child
    if (!child->leaf) {
        child->children[0] = sibling->children[sibling->n];
    }

    // Moving the key from the sibling to the parent
    node->keys[idx - 1] = sibling->keys[sibling->n - 1];

    // Incrementing and decrementing the key counts of child
    // and sibling respectively
    child->n += 1;
    sibling->n -= 1;
}

// Function to borrow a key from the next child and move it
// to the idx-th child
void borrowFromNext(Node* node, int idx)
{
    Node* child = node->children[idx];
    Node* sibling = node->children[idx + 1];

    // Setting child's (t - 1)th key equal to node's
    // key[idx]
    child->keys[(child->n)] = node->keys[idx];

    // If child is not a leaf, move its child pointers one
    // step ahead
    if (!child->leaf) {
        child->children[(child->n) + 1]
            = sibling->children[0];
    }

    // Setting node's idx-th key equal to sibling's first
    // key
    node->keys[idx] = sibling->keys[0];

    // Moving all keys in sibling one step behind
    for (int i = 1; i < sibling->n; ++i) {
        sibling->keys[i - 1] = sibling->keys[i];
    }

    // If sibling is not a leaf, move its child pointers one
    // step behind
    if (!sibling->leaf) {
        for (int i = 1; i <= sibling->n; ++i) {
            sibling->children[i - 1] = sibling->children[i];
        }
    }

    // Incrementing and decrementing the key counts of child
    // and sibling respectively
    child->n += 1;
    sibling->n -= 1;
}

// Function to merge idx-th child of node with (idx + 1)-th
// child of node
void merge(Node* node, int idx)
{
    Node* child = node->children[idx];
    Node* sibling = node->children[idx + 1];

    // Pulling a key from the current node and inserting it
    // into (t-1)th position of child
    child->keys[child->n] = node->keys[idx];

    // If child is not a leaf, move its child pointers one
    // step ahead
    if (!child->leaf) {
        child->children[child->n + 1]
            = sibling->children[0];
    }

    // Copying the keys from sibling to child
    for (int i = 0; i < sibling->n; ++i) {
        child->keys[i + child->n + 1] = sibling->keys[i];
    }

    // If child is not a leaf, copy the children pointers as
    // well
    if (!child->leaf) {
        for (int i = 0; i <= sibling->n; ++i) {
            child->children[i + child->n + 1]
                = sibling->children[i];
        }
    }

    // Move all keys after idx in the current node one step
    // before, so as to fill the gap created by moving
    // keys[idx] to child
    for (int i = idx + 1; i < node->n; ++i) {
        node->keys[i - 1] = node->keys[i];
    }

    // Move the child pointers after (idx + 1) in the
    // current node one step before
    for (int i = idx + 2; i <= node->n; ++i) {
        node->children[i - 1] = node->children[i];
    }

    // Update the key count of child and current node
    child->n += sibling->n + 1;
    node->n--;

    // Free the memory occupied by sibling
    free(sibling);
}

int main()
{
    BTree* btree = createBTree(MIN_DEGREE);

    // Insert elements into the B+ tree
    insert(btree, 2);
    insert(btree, 4);
    insert(btree, 7);
    insert(btree, 10);
    insert(btree, 17);
    insert(btree, 21);
    insert(btree, 28);

    // Print the B+ tree
    printf("B+ Tree after insertion: ");
    display(btree->root);
    printf("\n");

    // Search for a key
    int key_to_search = 17;
    bool found = search(btree->root, key_to_search);

    if (found) {
        printf("Key %d found in the B+ tree.\n",
               key_to_search);
    }
    else {
        printf("Key %d not found in the B+ tree.\n",
               key_to_search);
    }

    // Delete element from the B+ tree
    deleteKey(btree, 17);

    // Print the B+ tree after deletion
    printf("B+ Tree after deletion: ");
    display(btree->root);
    printf("\n");

    found = search(btree->root, key_to_search);

    if (found) {
        printf("Key %d found in the B+ tree.\n",
               key_to_search);
    }
    else {
        printf("Key %d not found in the B+ tree.\n",
               key_to_search);
    }

    return 0;
}

Output
B+ Tree after insertion: 2 4 7 10 17 21 28 
Key 17 found in the B+ tree.
B+ Tree after deletion: 2 4 7 10 21 28 
Key 17 not found in the B+ tree.

Related Articles

You can also go through these articles to improve your knowledge about the B+ Tree data structure: