Implementation of B-Tree in C

The B-tree is a self-balancing ordered structured data that stores data in a set of pages and also allows efficient searching, insertion, and deletion operations. It has its origin in a generic form of binary search trees developed by Rudolf Bayer and Edward M. McCreight in 1972. It is capable of processing large datasets efficiently. Among database systems and file systems, B-trees are quite often employed. This is so because they have a well-balanced structure, which allows for high speed performance, and they can handle a great number of elements.

Representation of B-Tree in C

A B-tree is organized as a balanced tree with the following properties-

  • Each node of the data structure can contain several keys with pointers to their child nodes.
  • Balancing of nodes is done by applying limits key constraints on the maximum and minimum number of keys a node can carry.
  • Each leaf is comparable to the next, thereby eliminating any queue or wait time.
struct BTreeNode {
    int num_keys; // Number of keys currently in the node
    int keys[M-1]; // Array of keys
    struct BTreeNode *children[M]; // Array of child pointers
    bool is_leaf; // True if node is a leaf
};

To implement a B-tree in C, we define a node structure representing the elements of the tree. Here’s an example:

The ‘BTreeNode’ structure represents a node in the B-tree. It contains an array of keys and pointers to child nodes. The is_leaf flag indicates whether the node is a leaf or an internal node.

Basic Operations on B-Tree in C

Basic Operations:

  • Search: Move down the tree repeatedly in a recursive way until a key is found.
  • Insertion: Put the key into a bunch of leaf nodes and split the nodes if it is necessary.
  • Deletion: Take away the leaves from the tree, join or split the nodes, if needed.

Time and Space Complexity:

  • Search: O(logN)
  • Insertion: O(logN)
  • Deletion: O(logN)

C Program to Implement B-Tree in C

C
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

#define M 4 // Maximum degree of the B-tree

struct BTreeNode {
    int num_keys; // Number of keys currently in the node
    int keys[M-1]; // Array of keys
    struct BTreeNode *children[M]; // Array of child pointers
    bool is_leaf; // True if node is a leaf
};

// Function to create a new node
struct BTreeNode *createNode(bool is_leaf) {
    struct BTreeNode *newNode = (struct BTreeNode *)malloc(sizeof(struct BTreeNode));
    if (newNode == NULL) {
        perror("Memory allocation failed");
        exit(EXIT_FAILURE);
    }
    newNode->num_keys = 0;
    newNode->is_leaf = is_leaf;
    for (int i = 0; i < M; i++) {
        newNode->children[i] = NULL;
    }
    return newNode;
}

// Function to split a full child node
void splitChild(struct BTreeNode *parent, int index) {
    struct BTreeNode *child = parent->children[index];
    struct BTreeNode *newNode = createNode(child->is_leaf);
    
    newNode->num_keys = M/2 - 1;
    
    // Move keys and children to the new node
    for (int i = 0; i < M/2 - 1; i++) {
        newNode->keys[i] = child->keys[i + M/2];
    }
    
    if (!child->is_leaf) {
        for (int i = 0; i < M/2; i++) {
            newNode->children[i] = child->children[i + M/2];
        }
    }
    
    child->num_keys = M/2 - 1;
    
    // Shift parent's children to make space for the new node
    for (int i = parent->num_keys; i > index; i--) {
        parent->children[i + 1] = parent->children[i];
    }
    
    parent->children[index + 1] = newNode;
    
    // Shift parent's keys to insert the middle key from the child
    for (int i = parent->num_keys - 1; i >= index; i--) {
        parent->keys[i + 1] = parent->keys[i];
    }
    
    parent->keys[index] = child->keys[M/2 - 1];
    parent->num_keys++;
}

// Function to insert a key into a non-full node
void insertNonFull(struct BTreeNode *node, int key) {
    int i = node->num_keys - 1;
    
    if (node->is_leaf) {
        // Insert key into the sorted order
        while (i >= 0 && node->keys[i] > key) {
            node->keys[i + 1] = node->keys[i];
            i--;
        }
        node->keys[i + 1] = key;
        node->num_keys++;
    } else {
        // Find the child to insert the key
        while (i >= 0 && node->keys[i] > key) {
            i--;
        }
        i++;
        
        if (node->children[i]->num_keys == M - 1) {
            // Split child if it's full
            splitChild(node, i);
            
            // Determine which of the two children is the new one
            if (node->keys[i] < key) {
                i++;
            }
        }
        insertNonFull(node->children[i], key);
    }
}

// Function to insert a key into the B-tree
void insert(struct BTreeNode **root, int key) {
    struct BTreeNode *node = *root;

    if (node == NULL) {
        // Create a new root node
        *root = createNode(true);
        (*root)->keys[0] = key;
        (*root)->num_keys = 1;
    } else {
        if (node->num_keys == M - 1) {
            // Split the root if it's full
            struct BTreeNode *new_root = createNode(false);
            new_root->children[0] = node;
            splitChild(new_root, 0);
            *root = new_root;
        }
        insertNonFull(*root, key);
    }
}

// Function to traverse and print the B-tree in-order
void traverse(struct BTreeNode *root) {
    if (root != NULL) {
        int i;
        for (i = 0; i < root->num_keys; i++) {
            traverse(root->children[i]);
            printf("%d ", root->keys[i]);
        }
        traverse(root->children[i]);
    }
}

// Main function to test B-tree implementation
int main() {
    struct BTreeNode *root = NULL;

    insert(&root, 10);
    insert(&root, 20);
    insert(&root, 5);
    insert(&root, 6);
    insert(&root, 12);
    insert(&root, 30);

    printf("In-order traversal of the B-tree: ");
    traverse(root);
    printf("\n");

    return 0;
}

Output
In-order traversal of the B-tree: 5 6 10 12 20 30 

Applications of B-Tree

Here are some applications of B-Tree-

  1. Database indexing.
  2. Filesystem management.
  3. Metadata storage.
  4. Caching mechanisms.
  5. Multilevel indexing.

Conclusion

B-trees being the fundamental data structure frequently used in databases, file systems, key-value stores, and cache systems all boil down to the balanced nature and the efficient operations it provides. They provide the three basic functions of stores the data, processes it and thus are crucial in the designing of large scale and high performing systems in the world of applications. The knowledge about B-tree applications plays a critical part in the software development of this technology for the real world use.