- Published on
Trees Implementation and Algorithms in C Intermediate C Concepts Part 4
- Authors
- Name
- Md Nasim Sheikh
- @nasimStg
Having explored linear structures like Linked Lists and the concepts of Stacks and Queues, let's branch out into the fascinating realm of non-linear data structures: Trees. Trees are incredibly versatile, used everywhere from file systems and databases to AI and graphics. This post will introduce fundamental tree concepts, focusing heavily on Binary Trees and the powerful Binary Search Tree (BST), showing you how to implement them in C using pointers, structs, dynamic allocation, and often, recursion.
Table of Contents
- What is a Tree? Basic Terminology
- Why Use Trees?
- Binary Trees
- Binary Tree Node Structure in C
- Binary Tree Traversal Algorithms
- Binary Search Trees (BSTs)
- BST Operations
- BST Advantages and Disadvantages
- Freeing Tree Memory
- Beyond BSTs: Balanced Trees
- Example BST Usage (main)
- Conclusion
- Further Reading & Related Posts
What is a Tree? Basic Terminology
Unlike linear structures (arrays, linked lists) where elements follow sequentially, a Tree is a hierarchical structure consisting of Nodes connected by Edges. Think of an organizational chart or a family tree.
Key terms you'll encounter:
- Node: The fundamental part of a tree containing data and pointers (links) to other nodes.
- Edge: The link connecting two nodes.
- Root: The topmost node in the tree (the only node with no parent).
- Parent: A node that has one or more nodes connected below it.
- Child: A node directly connected to another node when moving away from the Root.
- Siblings: Nodes that share the same parent.
- Leaf: A node with no children (at the bottom of a branch).
- Internal Node: A node that has at least one child (i.e., not a leaf).
- Path: A sequence of nodes and edges connecting a node with a descendant.
- Height (of a Node): The number of edges on the longest path from the node down to a leaf. The height of a leaf is 0.
- Height (of a Tree): The height of the root node.
- Depth (of a Node): The number of edges from the root node down to that node. The depth of the root is 0.
- Level (of a Node): Generally, Level = Depth. Level 0 is the root, Level 1 are its children, etc.
[Root (Level 0)] /
/ \
[Child A] [Child B] (Level 1, Siblings)
/ /
/ /
[Leaf C] [Internal D] [Leaf E] (Level 2)
|
|
[Leaf F] (Level 3)
Example: Height of Node B is 2, Height of Tree is 3. Depth of Node D is 2.
Why Use Trees?
- Representing Hierarchy: Ideal for data with inherent parent-child relationships (file systems, XML/HTML DOM, organization charts).
- Efficient Searching/Sorting: Binary Search Trees allow quick searching, insertion, and deletion (often logarithmic time).
- Decision Processes: Decision trees model choices and outcomes.
- Specialized Applications: Heaps (priority queues), Tries (dictionary lookups), B-Trees (databases).
While there are many tree types, we'll focus on the most common: Binary Trees.
Binary Trees
A Binary Tree is a specific type of tree where each node can have at most two children: a left child and a right child.
Binary Tree Node Structure in C
We use a self-referential structure, similar to linked lists, but with two pointers.
#include <stdio.h>
#include <stdlib.h> // For malloc, free
#include <stdbool.h> // For bool type
// Node structure for a Binary Tree
typedef struct TreeNode {
int data; // Data stored in the node
struct TreeNode *left; // Pointer to the left child
struct TreeNode *right; // Pointer to the right child
} TreeNode;
// Helper function to create a new tree node
TreeNode* createNode(int data) {
TreeNode* newNode = (TreeNode*)malloc(sizeof(TreeNode));
if (!newNode) {
perror("Failed to allocate tree node");
return NULL;
}
newNode->data = data;
newNode->left = NULL;
newNode->right = NULL;
return newNode;
}
Binary Tree Traversal Algorithms
Traversal means visiting every node in the tree exactly once. There are two main approaches: Depth-First Search (DFS) and Breadth-First Search (BFS).
1. Depth-First Search (DFS)
DFS explores as far down one branch as possible before backtracking. Recursion is a natural fit for implementing DFS.
Pre-order Traversal (Root, Left, Right): Visit the current node first, then recursively visit the left subtree, then the right subtree. Useful for copying trees or getting prefix expressions.
// Pre-order traversal (Root -> Left -> Right) void preOrder(TreeNode* root) { if (root == NULL) { return; // Base case for recursion } printf("%d ", root->data); // Visit root preOrder(root->left); // Visit left subtree preOrder(root->right); // Visit right subtree }
In-order Traversal (Left, Root, Right): Recursively visit the left subtree, then visit the current node, then recursively visit the right subtree. Crucially important for BSTs as it visits nodes in ascending order.
// In-order traversal (Left -> Root -> Right) void inOrder(TreeNode* root) { if (root == NULL) { return; } inOrder(root->left); // Visit left subtree printf("%d ", root->data); // Visit root inOrder(root->right); // Visit right subtree }
Post-order Traversal (Left, Right, Root): Recursively visit the left subtree, then the right subtree, and finally visit the current node. Useful for deleting nodes (process children before parent) or getting postfix expressions.
// Post-order traversal (Left -> Right -> Root) void postOrder(TreeNode* root) { if (root == NULL) { return; } postOrder(root->left); // Visit left subtree postOrder(root->right); // Visit right subtree printf("%d ", root->data); // Visit root }
2. Breadth-First Search (BFS) / Level-Order Traversal
BFS explores the tree level by level, visiting all nodes at the current depth before moving to the next depth. This requires a Queue (as implemented in the previous post).
// --- Minimal Queue Implementation for Level Order ---
// (Assuming TreeNode* is the data type stored in the queue)
// You'd typically include your queue implementation from the previous post.
// For brevity, here's a placeholder interface:
typedef struct QueueNode {
TreeNode* treeNode;
struct QueueNode* next;
} QueueNode;
typedef struct {
QueueNode *front, *rear;
} Queue;
Queue* createQueue() { /* Implementation needed */ }
void enqueue(Queue* q, TreeNode* tn) { /* Implementation needed */ }
TreeNode* dequeue(Queue* q) { /* Implementation needed */ }
bool isQueueEmpty(Queue* q) { /* Implementation needed */ }
void freeQueue(Queue* q) { /* Implementation needed */ }
// --- End Minimal Queue ---
// Placeholder implementations for Queue (replace with your actual Queue code)
Queue* createQueue() {
Queue* q = (Queue*)malloc(sizeof(Queue));
if (!q) return NULL;
q->front = q->rear = NULL;
return q;
}
bool isQueueEmpty(Queue* q) {
return q->front == NULL;
}
void enqueue(Queue* q, TreeNode* tn) {
QueueNode* temp = (QueueNode*)malloc(sizeof(QueueNode));
if (!temp) return;
temp->treeNode = tn;
temp->next = NULL;
if (q->rear == NULL) {
q->front = q->rear = temp;
return;
}
q->rear->next = temp;
q->rear = temp;
}
TreeNode* dequeue(Queue* q) {
if (isQueueEmpty(q)) return NULL;
QueueNode* temp = q->front;
TreeNode* dequeuedNode = temp->treeNode;
q->front = q->front->next;
if (q->front == NULL) q->rear = NULL; // Queue became empty
free(temp);
return dequeuedNode;
}
void freeQueue(Queue* q) {
while (!isQueueEmpty(q)) {
dequeue(q); // Dequeue implicitly frees QueueNodes
}
free(q); // Free the queue structure itself
}
// Level-order traversal (BFS)
void levelOrder(TreeNode* root) {
if (root == NULL) {
printf("Tree is empty.\n");
return;
}
Queue* q = createQueue();
if (!q) {
perror("Failed to create queue for level order");
return;
}
printf("Level Order: ");
enqueue(q, root); // Start with the root
while (!isQueueEmpty(q)) {
TreeNode* current = dequeue(q);
if (current == NULL) continue; // Should not happen with proper enqueue logic
printf("%d ", current->data); // Visit the dequeued node
// Enqueue left child if it exists
if (current->left != NULL) {
enqueue(q, current->left);
}
// Enqueue right child if it exists
if (current->right != NULL) {
enqueue(q, current->right);
}
}
printf("\n");
freeQueue(q); // Clean up the queue
}
Binary Search Trees (BSTs)
A Binary Search Tree (BST) is a special type of binary tree that imposes an ordering constraint, making searching very efficient.
The BST Property: For any given node in the tree:
- All values in its left subtree are less than the node's value.
- All values in its right subtree are greater than the node's value.
- Both the left and right subtrees must also be binary search trees.
(Note: Variations exist regarding handling duplicate values. We'll assume no duplicates or place them typically in the right subtree for simplicity.)
The TreeNode
structure is the same as for a regular binary tree.
BST Operations
Insertion: Add a new value while maintaining the BST property.
// Insert a node into the BST (Recursive) TreeNode* insertNode(TreeNode* root, int data) { // 1. Base Case: If the tree/subtree is empty, create the new node here if (root == NULL) { return createNode(data); } // 2. Recursive Step: Compare data with root's data if (data < root->data) { // If data is smaller, go left root->left = insertNode(root->left, data); } else if (data > root->data) { // If data is larger, go right root->right = insertNode(root->right, data); } // (If data == root->data, do nothing - assuming no duplicates or handle as needed) // Return the (possibly updated) root pointer of this subtree return root; }
Searching: Efficiently find if a value exists.
// Search for a node in the BST (Recursive) TreeNode* searchNode(TreeNode* root, int data) { // 1. Base Cases: // a) Root is NULL or data is found at root if (root == NULL || root->data == data) { return root; } // 2. Recursive Step: if (data < root->data) { // Data is smaller, search left subtree return searchNode(root->left, data); } else { // Data is larger, search right subtree return searchNode(root->right, data); } }
Finding Minimum and Maximum: Easy due to the BST property.
// Find the node with the minimum value in a BST (leftmost node) TreeNode* findMin(TreeNode* root) { if (root == NULL) { return NULL; // Empty tree/subtree } // Keep going left until we can't anymore TreeNode* current = root; while (current->left != NULL) { current = current->left; } return current; } // Find the node with the maximum value in a BST (rightmost node) TreeNode* findMax(TreeNode* root) { if (root == NULL) { return NULL; } TreeNode* current = root; while (current->right != NULL) { current = current->right; } return current; }
Deletion: The most complex operation. Three main cases:
- Node is a Leaf: Simply remove it (set parent's corresponding child pointer to
NULL
) and free the memory. - Node has One Child: Replace the node with its child (update parent's child pointer to point to the node's child) and free the memory.
- Node has Two Children:
- Find the node's in-order successor (the smallest value in its right subtree - use
findMin
on the right child). - Copy the successor's data into the node to be deleted.
- Recursively delete the successor node (which will be easier as it has at most one right child).
- (Alternatively, use the in-order predecessor: largest value in the left subtree).
- Find the node's in-order successor (the smallest value in its right subtree - use
// Delete a node from the BST (Recursive) TreeNode* deleteNode(TreeNode* root, int data) { // 1. Base Case: If the tree is empty if (root == NULL) { return root; // Or return NULL } // 2. Find the node to delete if (data < root->data) { root->left = deleteNode(root->left, data); // Recurse left } else if (data > root->data) { root->right = deleteNode(root->right, data); // Recurse right } else { // Node with data found! Now handle deletion cases: // Case 1: Node is a leaf (no children) or has one child if (root->left == NULL) { TreeNode* temp = root->right; // Store right child (could be NULL) printf("Deleting node %d (0 or 1 child)\n", root->data); free(root); return temp; // Return the right child to link parent } else if (root->right == NULL) { TreeNode* temp = root->left; // Store left child printf("Deleting node %d (1 child)\n", root->data); free(root); return temp; // Return the left child to link parent } // Case 3: Node has two children // Find the in-order successor (smallest node in the right subtree) TreeNode* successor = findMin(root->right); // Copy the successor's data to this node printf("Deleting node %d (2 children) - replacing with %d\n", root->data, successor->data); root->data = successor->data; // Recursively delete the successor node from the right subtree root->right = deleteNode(root->right, successor->data); } return root; // Return the potentially modified root pointer }
- Node is a Leaf: Simply remove it (set parent's corresponding child pointer to
BST Advantages and Disadvantages
- Advantages:
- Efficient average-case time complexity for search, insert, delete: O(log n), where n is the number of nodes.
- In-order traversal yields data in sorted order automatically.
- Disadvantages:
- Performance degrades significantly for unbalanced trees (approaching O(n) in the worst case, e.g., inserting sorted data makes it like a linked list).
- Implementation (especially deletion) can be complex.
Freeing Tree Memory
Just like with linked lists, dynamically allocated tree nodes must be freed to prevent memory leaks. Post-order traversal is ideal for this, as it ensures children are processed (and potentially freed) before their parent.
// Function to free all nodes in the tree (using post-order logic)
void freeTree(TreeNode* root) {
if (root == NULL) {
return;
}
// Recursively free left and right subtrees first
freeTree(root->left);
freeTree(root->right);
// Then free the current node
// printf("Freeing node %d\n", root->data); // Optional debug print
free(root);
}
Beyond BSTs: Balanced Trees
The O(n) worst-case performance of BSTs is a problem. Self-Balancing Binary Search Trees like AVL Trees and Red-Black Trees perform rotations and follow stricter rules during insertion/deletion to automatically maintain a balanced structure, guaranteeing O(log n) performance even in the worst case. Their implementation is significantly more complex and beyond the scope of this introduction but represents the next step in mastering tree structures.
main
)
Example BST Usage (// (Include TreeNode structure, createNode, insertNode, searchNode, deleteNode,
// findMin, findMax, inOrder, preOrder, postOrder, levelOrder, freeTree,
// and necessary Queue functions above)
int main() {
TreeNode* root = NULL; // Start with an empty tree
printf("Inserting nodes into BST...\n");
root = insertNode(root, 50);
root = insertNode(root, 30);
root = insertNode(root, 70);
root = insertNode(root, 20);
root = insertNode(root, 40);
root = insertNode(root, 60);
root = insertNode(root, 80);
root = insertNode(root, 35); // Adding a node with one child later
root = insertNode(root, 75); // Adding another leaf
printf("\nIn-order Traversal (Sorted): ");
inOrder(root);
printf("\n");
printf("Pre-order Traversal: ");
preOrder(root);
printf("\n");
printf("Post-order Traversal: ");
postOrder(root);
printf("\n");
levelOrder(root); // Requires Queue implementation
printf("\nSearching for nodes...\n");
TreeNode* found = searchNode(root, 40);
if (found) printf("Found node with data: %d\n", found->data);
else printf("Node with data 40 not found.\n");
found = searchNode(root, 99);
if (found) printf("Found node with data: %d\n", found->data);
else printf("Node with data 99 not found.\n");
printf("\nFinding Min/Max...\n");
TreeNode* minNode = findMin(root);
if(minNode) printf("Minimum value: %d\n", minNode->data); // Should be 20
TreeNode* maxNode = findMax(root);
if(maxNode) printf("Maximum value: %d\n", maxNode->data); // Should be 80
printf("\nDeleting nodes...\n");
root = deleteNode(root, 20); // Delete Leaf
printf("In-order after deleting 20: "); inOrder(root); printf("\n");
root = deleteNode(root, 30); // Delete Node with two children (replace with 35)
printf("In-order after deleting 30: "); inOrder(root); printf("\n");
root = deleteNode(root, 70); // Delete Node with two children (replace with 75)
printf("In-order after deleting 70: "); inOrder(root); printf("\n");
root = deleteNode(root, 50); // Delete root (replace with 60)
printf("In-order after deleting 50: "); inOrder(root); printf("\n");
printf("\nFreeing the tree...\n");
freeTree(root);
root = NULL; // Important: Set root to NULL after freeing
printf("Tree freed.\n");
// Test empty tree operations (should handle gracefully)
printf("In-order on empty tree: "); inOrder(root); printf("\n");
levelOrder(root);
return 0;
}
Conclusion
Trees open up a powerful way to organize and access data hierarchically and efficiently in C. We've covered the essential terminology, explored Binary Trees and their traversal methods (DFS via recursion, BFS via queues), and delved into the workings of Binary Search Trees, including insertion, searching, and the complexities of deletion. While BSTs are powerful, understanding their potential imbalance leads towards advanced self-balancing structures like AVL and Red-Black trees. Remember that careful dynamic memory management (malloc
/free
) and often recursion are key to working with trees effectively.
Stay tuned for the next part of our Intermediate C series, where we might explore Bitwise Operations, Function Pointers, or advanced Error Handling!
Further Reading & Related Posts
- Intermediate C Concepts Part 3: Implementing Stacks and Queues in C (Queues used for Level-Order Traversal)
- Intermediate C Concepts Part 2: Working with Linked Lists in C
- Intermediate C Concepts Part 1: Dynamic Memory Allocation in C
- Getting Started with C Part 6: Pointers in C
- Getting Started with C Part 7: Structures and Unions in C