Published on

Trees Implementation and Algorithms in C Intermediate C Concepts Part 4

Authors

Having explored linear structures like Linked Lists and the concepts of Stacks and Queues, let's branch out into the fascinating realm of non-linear data structures: Trees. Trees are incredibly versatile, used everywhere from file systems and databases to AI and graphics. This post will introduce fundamental tree concepts, focusing heavily on Binary Trees and the powerful Binary Search Tree (BST), showing you how to implement them in C using pointers, structs, dynamic allocation, and often, recursion.

Table of Contents

What is a Tree? Basic Terminology

Unlike linear structures (arrays, linked lists) where elements follow sequentially, a Tree is a hierarchical structure consisting of Nodes connected by Edges. Think of an organizational chart or a family tree.

Key terms you'll encounter:

  • Node: The fundamental part of a tree containing data and pointers (links) to other nodes.
  • Edge: The link connecting two nodes.
  • Root: The topmost node in the tree (the only node with no parent).
  • Parent: A node that has one or more nodes connected below it.
  • Child: A node directly connected to another node when moving away from the Root.
  • Siblings: Nodes that share the same parent.
  • Leaf: A node with no children (at the bottom of a branch).
  • Internal Node: A node that has at least one child (i.e., not a leaf).
  • Path: A sequence of nodes and edges connecting a node with a descendant.
  • Height (of a Node): The number of edges on the longest path from the node down to a leaf. The height of a leaf is 0.
  • Height (of a Tree): The height of the root node.
  • Depth (of a Node): The number of edges from the root node down to that node. The depth of the root is 0.
  • Level (of a Node): Generally, Level = Depth. Level 0 is the root, Level 1 are its children, etc.

[Root (Level 0)] /
/ \


[Child A]   [Child B]  (Level 1, Siblings)
/           /     
/           /       
[Leaf C]  [Internal D] [Leaf E] (Level 2)
|
|
[Leaf F] (Level 3)

Example: Height of Node B is 2, Height of Tree is 3. Depth of Node D is 2.

Why Use Trees?

  • Representing Hierarchy: Ideal for data with inherent parent-child relationships (file systems, XML/HTML DOM, organization charts).
  • Efficient Searching/Sorting: Binary Search Trees allow quick searching, insertion, and deletion (often logarithmic time).
  • Decision Processes: Decision trees model choices and outcomes.
  • Specialized Applications: Heaps (priority queues), Tries (dictionary lookups), B-Trees (databases).

While there are many tree types, we'll focus on the most common: Binary Trees.

Binary Trees

A Binary Tree is a specific type of tree where each node can have at most two children: a left child and a right child.

Binary Tree Node Structure in C

We use a self-referential structure, similar to linked lists, but with two pointers.

#include <stdio.h>
#include <stdlib.h> // For malloc, free
#include <stdbool.h> // For bool type

// Node structure for a Binary Tree
typedef struct TreeNode {
    int data;                // Data stored in the node
    struct TreeNode *left;   // Pointer to the left child
    struct TreeNode *right;  // Pointer to the right child
} TreeNode;

// Helper function to create a new tree node
TreeNode* createNode(int data) {
    TreeNode* newNode = (TreeNode*)malloc(sizeof(TreeNode));
    if (!newNode) {
        perror("Failed to allocate tree node");
        return NULL;
    }
    newNode->data = data;
    newNode->left = NULL;
    newNode->right = NULL;
    return newNode;
}

Binary Tree Traversal Algorithms

Traversal means visiting every node in the tree exactly once. There are two main approaches: Depth-First Search (DFS) and Breadth-First Search (BFS).

1. Depth-First Search (DFS)

DFS explores as far down one branch as possible before backtracking. Recursion is a natural fit for implementing DFS.

  • Pre-order Traversal (Root, Left, Right): Visit the current node first, then recursively visit the left subtree, then the right subtree. Useful for copying trees or getting prefix expressions.

    // Pre-order traversal (Root -> Left -> Right)
    void preOrder(TreeNode* root) {
        if (root == NULL) {
            return; // Base case for recursion
        }
        printf("%d ", root->data); // Visit root
        preOrder(root->left);      // Visit left subtree
        preOrder(root->right);     // Visit right subtree
    }
    
  • In-order Traversal (Left, Root, Right): Recursively visit the left subtree, then visit the current node, then recursively visit the right subtree. Crucially important for BSTs as it visits nodes in ascending order.

    // In-order traversal (Left -> Root -> Right)
    void inOrder(TreeNode* root) {
        if (root == NULL) {
            return;
        }
        inOrder(root->left);       // Visit left subtree
        printf("%d ", root->data); // Visit root
        inOrder(root->right);      // Visit right subtree
    }
    
  • Post-order Traversal (Left, Right, Root): Recursively visit the left subtree, then the right subtree, and finally visit the current node. Useful for deleting nodes (process children before parent) or getting postfix expressions.

    // Post-order traversal (Left -> Right -> Root)
    void postOrder(TreeNode* root) {
         if (root == NULL) {
            return;
        }
        postOrder(root->left);     // Visit left subtree
        postOrder(root->right);    // Visit right subtree
        printf("%d ", root->data); // Visit root
    }
    

2. Breadth-First Search (BFS) / Level-Order Traversal

BFS explores the tree level by level, visiting all nodes at the current depth before moving to the next depth. This requires a Queue (as implemented in the previous post).

// --- Minimal Queue Implementation for Level Order ---
// (Assuming TreeNode* is the data type stored in the queue)
// You'd typically include your queue implementation from the previous post.
// For brevity, here's a placeholder interface:
typedef struct QueueNode {
    TreeNode* treeNode;
    struct QueueNode* next;
} QueueNode;

typedef struct {
    QueueNode *front, *rear;
} Queue;

Queue* createQueue() { /* Implementation needed */ }
void enqueue(Queue* q, TreeNode* tn) { /* Implementation needed */ }
TreeNode* dequeue(Queue* q) { /* Implementation needed */ }
bool isQueueEmpty(Queue* q) { /* Implementation needed */ }
void freeQueue(Queue* q) { /* Implementation needed */ }
// --- End Minimal Queue ---


// Placeholder implementations for Queue (replace with your actual Queue code)
Queue* createQueue() {
    Queue* q = (Queue*)malloc(sizeof(Queue));
    if (!q) return NULL;
    q->front = q->rear = NULL;
    return q;
}

bool isQueueEmpty(Queue* q) {
    return q->front == NULL;
}

void enqueue(Queue* q, TreeNode* tn) {
    QueueNode* temp = (QueueNode*)malloc(sizeof(QueueNode));
    if (!temp) return;
    temp->treeNode = tn;
    temp->next = NULL;
    if (q->rear == NULL) {
        q->front = q->rear = temp;
        return;
    }
    q->rear->next = temp;
    q->rear = temp;
}

TreeNode* dequeue(Queue* q) {
    if (isQueueEmpty(q)) return NULL;
    QueueNode* temp = q->front;
    TreeNode* dequeuedNode = temp->treeNode;
    q->front = q->front->next;
    if (q->front == NULL) q->rear = NULL; // Queue became empty
    free(temp);
    return dequeuedNode;
}

void freeQueue(Queue* q) {
    while (!isQueueEmpty(q)) {
        dequeue(q); // Dequeue implicitly frees QueueNodes
    }
    free(q); // Free the queue structure itself
}


// Level-order traversal (BFS)
void levelOrder(TreeNode* root) {
    if (root == NULL) {
        printf("Tree is empty.\n");
        return;
    }

    Queue* q = createQueue();
    if (!q) {
        perror("Failed to create queue for level order");
        return;
    }

    printf("Level Order: ");
    enqueue(q, root); // Start with the root

    while (!isQueueEmpty(q)) {
        TreeNode* current = dequeue(q);
        if (current == NULL) continue; // Should not happen with proper enqueue logic

        printf("%d ", current->data); // Visit the dequeued node

        // Enqueue left child if it exists
        if (current->left != NULL) {
            enqueue(q, current->left);
        }
        // Enqueue right child if it exists
        if (current->right != NULL) {
            enqueue(q, current->right);
        }
    }
    printf("\n");
    freeQueue(q); // Clean up the queue
}

Binary Search Trees (BSTs)

A Binary Search Tree (BST) is a special type of binary tree that imposes an ordering constraint, making searching very efficient.

The BST Property: For any given node in the tree:

  • All values in its left subtree are less than the node's value.
  • All values in its right subtree are greater than the node's value.
  • Both the left and right subtrees must also be binary search trees.

(Note: Variations exist regarding handling duplicate values. We'll assume no duplicates or place them typically in the right subtree for simplicity.)

The TreeNode structure is the same as for a regular binary tree.

BST Operations

  • Insertion: Add a new value while maintaining the BST property.

    // Insert a node into the BST (Recursive)
    TreeNode* insertNode(TreeNode* root, int data) {
        // 1. Base Case: If the tree/subtree is empty, create the new node here
        if (root == NULL) {
            return createNode(data);
        }
    
        // 2. Recursive Step: Compare data with root's data
        if (data < root->data) {
            // If data is smaller, go left
            root->left = insertNode(root->left, data);
        } else if (data > root->data) {
            // If data is larger, go right
            root->right = insertNode(root->right, data);
        }
        // (If data == root->data, do nothing - assuming no duplicates or handle as needed)
    
        // Return the (possibly updated) root pointer of this subtree
        return root;
    }
    
  • Searching: Efficiently find if a value exists.

    // Search for a node in the BST (Recursive)
    TreeNode* searchNode(TreeNode* root, int data) {
        // 1. Base Cases:
        //   a) Root is NULL or data is found at root
        if (root == NULL || root->data == data) {
            return root;
        }
    
        // 2. Recursive Step:
        if (data < root->data) {
            // Data is smaller, search left subtree
            return searchNode(root->left, data);
        } else {
            // Data is larger, search right subtree
            return searchNode(root->right, data);
        }
    }
    
  • Finding Minimum and Maximum: Easy due to the BST property.

    // Find the node with the minimum value in a BST (leftmost node)
    TreeNode* findMin(TreeNode* root) {
        if (root == NULL) {
            return NULL; // Empty tree/subtree
        }
        // Keep going left until we can't anymore
        TreeNode* current = root;
        while (current->left != NULL) {
            current = current->left;
        }
        return current;
    }
    
    // Find the node with the maximum value in a BST (rightmost node)
     TreeNode* findMax(TreeNode* root) {
        if (root == NULL) {
            return NULL;
        }
        TreeNode* current = root;
        while (current->right != NULL) {
            current = current->right;
        }
        return current;
    }
    
  • Deletion: The most complex operation. Three main cases:

    1. Node is a Leaf: Simply remove it (set parent's corresponding child pointer to NULL) and free the memory.
    2. Node has One Child: Replace the node with its child (update parent's child pointer to point to the node's child) and free the memory.
    3. Node has Two Children:
      • Find the node's in-order successor (the smallest value in its right subtree - use findMin on the right child).
      • Copy the successor's data into the node to be deleted.
      • Recursively delete the successor node (which will be easier as it has at most one right child).
      • (Alternatively, use the in-order predecessor: largest value in the left subtree).
    // Delete a node from the BST (Recursive)
    TreeNode* deleteNode(TreeNode* root, int data) {
        // 1. Base Case: If the tree is empty
        if (root == NULL) {
            return root; // Or return NULL
        }
    
        // 2. Find the node to delete
        if (data < root->data) {
            root->left = deleteNode(root->left, data); // Recurse left
        } else if (data > root->data) {
            root->right = deleteNode(root->right, data); // Recurse right
        } else {
            // Node with data found! Now handle deletion cases:
    
            // Case 1: Node is a leaf (no children) or has one child
            if (root->left == NULL) {
                TreeNode* temp = root->right; // Store right child (could be NULL)
                printf("Deleting node %d (0 or 1 child)\n", root->data);
                free(root);
                return temp; // Return the right child to link parent
            } else if (root->right == NULL) {
                TreeNode* temp = root->left; // Store left child
                printf("Deleting node %d (1 child)\n", root->data);
                free(root);
                return temp; // Return the left child to link parent
            }
    
            // Case 3: Node has two children
            // Find the in-order successor (smallest node in the right subtree)
            TreeNode* successor = findMin(root->right);
    
            // Copy the successor's data to this node
            printf("Deleting node %d (2 children) - replacing with %d\n", root->data, successor->data);
            root->data = successor->data;
    
            // Recursively delete the successor node from the right subtree
            root->right = deleteNode(root->right, successor->data);
        }
        return root; // Return the potentially modified root pointer
    }
    

BST Advantages and Disadvantages

  • Advantages:
    • Efficient average-case time complexity for search, insert, delete: O(log n), where n is the number of nodes.
    • In-order traversal yields data in sorted order automatically.
  • Disadvantages:
    • Performance degrades significantly for unbalanced trees (approaching O(n) in the worst case, e.g., inserting sorted data makes it like a linked list).
    • Implementation (especially deletion) can be complex.

Freeing Tree Memory

Just like with linked lists, dynamically allocated tree nodes must be freed to prevent memory leaks. Post-order traversal is ideal for this, as it ensures children are processed (and potentially freed) before their parent.

// Function to free all nodes in the tree (using post-order logic)
void freeTree(TreeNode* root) {
    if (root == NULL) {
        return;
    }
    // Recursively free left and right subtrees first
    freeTree(root->left);
    freeTree(root->right);

    // Then free the current node
    // printf("Freeing node %d\n", root->data); // Optional debug print
    free(root);
}

Beyond BSTs: Balanced Trees

The O(n) worst-case performance of BSTs is a problem. Self-Balancing Binary Search Trees like AVL Trees and Red-Black Trees perform rotations and follow stricter rules during insertion/deletion to automatically maintain a balanced structure, guaranteeing O(log n) performance even in the worst case. Their implementation is significantly more complex and beyond the scope of this introduction but represents the next step in mastering tree structures.

Example BST Usage (main)

// (Include TreeNode structure, createNode, insertNode, searchNode, deleteNode,
//  findMin, findMax, inOrder, preOrder, postOrder, levelOrder, freeTree,
//  and necessary Queue functions above)

int main() {
    TreeNode* root = NULL; // Start with an empty tree

    printf("Inserting nodes into BST...\n");
    root = insertNode(root, 50);
    root = insertNode(root, 30);
    root = insertNode(root, 70);
    root = insertNode(root, 20);
    root = insertNode(root, 40);
    root = insertNode(root, 60);
    root = insertNode(root, 80);
    root = insertNode(root, 35); // Adding a node with one child later
    root = insertNode(root, 75); // Adding another leaf


    printf("\nIn-order Traversal (Sorted): ");
    inOrder(root);
    printf("\n");

    printf("Pre-order Traversal: ");
    preOrder(root);
    printf("\n");

    printf("Post-order Traversal: ");
    postOrder(root);
    printf("\n");

    levelOrder(root); // Requires Queue implementation

    printf("\nSearching for nodes...\n");
    TreeNode* found = searchNode(root, 40);
    if (found) printf("Found node with data: %d\n", found->data);
    else printf("Node with data 40 not found.\n");

    found = searchNode(root, 99);
    if (found) printf("Found node with data: %d\n", found->data);
    else printf("Node with data 99 not found.\n");


    printf("\nFinding Min/Max...\n");
    TreeNode* minNode = findMin(root);
    if(minNode) printf("Minimum value: %d\n", minNode->data); // Should be 20
    TreeNode* maxNode = findMax(root);
     if(maxNode) printf("Maximum value: %d\n", maxNode->data); // Should be 80


    printf("\nDeleting nodes...\n");
    root = deleteNode(root, 20); // Delete Leaf
    printf("In-order after deleting 20: "); inOrder(root); printf("\n");

    root = deleteNode(root, 30); // Delete Node with two children (replace with 35)
    printf("In-order after deleting 30: "); inOrder(root); printf("\n");

    root = deleteNode(root, 70); // Delete Node with two children (replace with 75)
     printf("In-order after deleting 70: "); inOrder(root); printf("\n");

    root = deleteNode(root, 50); // Delete root (replace with 60)
     printf("In-order after deleting 50: "); inOrder(root); printf("\n");


    printf("\nFreeing the tree...\n");
    freeTree(root);
    root = NULL; // Important: Set root to NULL after freeing
    printf("Tree freed.\n");

    // Test empty tree operations (should handle gracefully)
    printf("In-order on empty tree: "); inOrder(root); printf("\n");
    levelOrder(root);


    return 0;
}

Conclusion

Trees open up a powerful way to organize and access data hierarchically and efficiently in C. We've covered the essential terminology, explored Binary Trees and their traversal methods (DFS via recursion, BFS via queues), and delved into the workings of Binary Search Trees, including insertion, searching, and the complexities of deletion. While BSTs are powerful, understanding their potential imbalance leads towards advanced self-balancing structures like AVL and Red-Black trees. Remember that careful dynamic memory management (malloc/free) and often recursion are key to working with trees effectively.

Stay tuned for the next part of our Intermediate C series, where we might explore Bitwise Operations, Function Pointers, or advanced Error Handling!