- Published on
Graphs Implementation and Algorithms in C Intermediate C Concepts Part 5
- Authors
- Name
- Md Nasim Sheikh
- @nasimStg
We've journeyed through linear structures like linked lists, explored ADTs like stacks and queues, and navigated hierarchical trees. Now, we arrive at perhaps the most general and powerful data structure for modeling connections: Graphs. From social networks and road maps to computer networks and project dependencies, graphs are everywhere! This post introduces graph fundamentals, common representation methods in C, and essential traversal algorithms. Get ready to connect the dots!
Table of Contents
What is a Graph? Basic Terminology
A Graph G is formally defined as a pair G = (V, E), where:
- V is a set of Vertices (also called Nodes). These represent the entities or items.
- E is a set of Edges. These represent the connections or relationships between pairs of vertices.
Unlike trees, graphs don't inherently have a root or a strict parent-child hierarchy. They can also contain Cycles, where you can follow a path of edges and return to your starting vertex.
Key graph terminology:
- Vertex (Node): An element in the set V.
- Edge: A connection between two vertices in E.
- Undirected Graph: Edges have no direction. If there's an edge between A and B, you can traverse from A to B and from B to A. (Think Facebook friendships).
- Directed Graph (Digraph): Edges have a direction. An edge from A to B doesn't imply an edge from B to A. (Think Twitter follows or one-way streets).
- Weighted Graph: Each edge has an associated numerical value (weight or cost). (Think road maps where edges have distances, or network connections with latency).
- Unweighted Graph: Edges simply represent a connection, without an associated weight.
- Adjacent Vertices: Two vertices are adjacent if there is an edge connecting them.
- Degree (of a Vertex - Undirected): The number of edges connected to a vertex.
- In-degree (Directed): The number of edges pointing into a vertex.
- Out-degree (Directed): The number of edges pointing away from a vertex.
- Path: A sequence of vertices where each adjacent pair is connected by an edge.
- Cycle: A path that starts and ends at the same vertex and contains at least one edge. Trees are specifically acyclic graphs.
- Connected Graph (Undirected): There is a path between any two vertices in the graph.
- Disconnected Graph: A graph composed of multiple "subgraphs" (components) that are not connected to each other.
Representing Graphs in C
How do we store the vertices and edges in memory? There are two primary methods:
1. Adjacency Matrix
- Concept: A 2D array
adjMatrix[V][V]
, whereV
is the number of vertices (often indexed 0 to V-1).- For unweighted graphs:
adjMatrix[i][j] = 1
if there's an edge from vertexi
toj
, else0
. - For weighted graphs:
adjMatrix[i][j] = weight
if there's an edge, else0
,infinity
, or some special marker. - For undirected graphs, the matrix is symmetric (
adjMatrix[i][j] == adjMatrix[j][i]
).
- For unweighted graphs:
- Structure (Example for Unweighted):
#include <stdio.h> #include <stdlib.h> typedef struct { int numVertices; int** adjMatrix; // Pointer to a dynamically allocated 2D array } GraphMatrix; // Function to create a graph represented by adjacency matrix GraphMatrix* createGraphMatrix(int vertices) { GraphMatrix* graph = (GraphMatrix*)malloc(sizeof(GraphMatrix)); if (!graph) return NULL; graph->numVertices = vertices; // Allocate rows (array of pointers) graph->adjMatrix = (int**)malloc(vertices * sizeof(int*)); if (!graph->adjMatrix) { free(graph); return NULL; } // Allocate columns for each row and initialize to 0 for (int i = 0; i < vertices; i++) { graph->adjMatrix[i] = (int*)calloc(vertices, sizeof(int)); // calloc initializes to 0 if (!graph->adjMatrix[i]) { // Cleanup allocated memory on failure while (--i >= 0) free(graph->adjMatrix[i]); free(graph->adjMatrix); free(graph); return NULL; } } return graph; } // Function to add an edge (undirected, unweighted) void addEdgeMatrix(GraphMatrix* graph, int src, int dest) { if (src >= 0 && src < graph->numVertices && dest >= 0 && dest < graph->numVertices) { graph->adjMatrix[src][dest] = 1; graph->adjMatrix[dest][src] = 1; // For undirected graph } else { fprintf(stderr, "Invalid vertex index for edge (%d, %d)\n", src, dest); } } // Function to free graph memory void freeGraphMatrix(GraphMatrix* graph) { if (!graph) return; for (int i = 0; i < graph->numVertices; i++) { free(graph->adjMatrix[i]); // Free columns } free(graph->adjMatrix); // Free rows free(graph); // Free graph structure }
- Pros: Very fast O(1) lookup to check if an edge exists between two specific vertices. Simpler concept for dense graphs (where |E| is close to |V|^2).
- Cons: Requires O(V^2) space, regardless of the number of edges. Very inefficient for sparse graphs (most real-world graphs). Adding/removing vertices is costly. Iterating over all neighbors of a vertex takes O(V) time.
2. Adjacency List
- Concept: An array (or dynamic array/vector) of linked lists. The array index corresponds to a vertex (0 to V-1).
adjList[i]
points to the head of a linked list containing all vertices adjacent to vertexi
. - Structure (Example for Unweighted):
#include <stdio.h> #include <stdlib.h> // Node structure for the adjacency list typedef struct AdjListNode { int dest; // Destination vertex index // For weighted graphs, add: int weight; struct AdjListNode* next; // Pointer to the next adjacent vertex } AdjListNode; // Structure to represent the graph itself typedef struct { int numVertices; AdjListNode** adjLists; // An array of pointers to AdjListNode (heads of lists) // Add bool is_directed; if needed } GraphList; // Function to create a new adjacency list node AdjListNode* newAdjListNode(int dest) { AdjListNode* newNode = (AdjListNode*)malloc(sizeof(AdjListNode)); if (!newNode) { perror("Node alloc failed"); return NULL; } newNode->dest = dest; newNode->next = NULL; // Initialize weight if using weighted graph return newNode; } // Function to create a graph with V vertices GraphList* createGraphList(int vertices) { GraphList* graph = (GraphList*)malloc(sizeof(GraphList)); if (!graph) { perror("Graph alloc failed"); return NULL; } graph->numVertices = vertices; // Create an array of adjacency lists (pointers) graph->adjLists = (AdjListNode**)malloc(vertices * sizeof(AdjListNode*)); if (!graph->adjLists) { perror("AdjList array alloc failed"); free(graph); return NULL; } // Initialize each adjacency list as empty (NULL) for (int i = 0; i < vertices; ++i) { graph->adjLists[i] = NULL; } return graph; } // Function to add an edge to an undirected graph void addEdgeList(GraphList* graph, int src, int dest) { if (src < 0 || src >= graph->numVertices || dest < 0 || dest >= graph->numVertices) { fprintf(stderr, "Invalid vertex index for edge (%d, %d)\n", src, dest); return; } // Add an edge from src to dest. New node is added to the front of the list. AdjListNode* newNodeDest = newAdjListNode(dest); if (!newNodeDest) return; // Allocation check newNodeDest->next = graph->adjLists[src]; // Point new node to current head graph->adjLists[src] = newNodeDest; // Update head // Since graph is undirected, add an edge from dest to src as well AdjListNode* newNodeSrc = newAdjListNode(src); if (!newNodeSrc) return; // Allocation check (tricky if first alloc succeeded!) // Consider a more robust error handling strategy newNodeSrc->next = graph->adjLists[dest]; graph->adjLists[dest] = newNodeSrc; // For directed graph, only add the first part (src to dest) } // Function to free graph memory void freeGraphList(GraphList* graph) { if (!graph) return; for (int i = 0; i < graph->numVertices; ++i) { AdjListNode* current = graph->adjLists[i]; AdjListNode* nextNode; while (current != NULL) { nextNode = current->next; free(current); current = nextNode; } } free(graph->adjLists); // Free the array of pointers free(graph); // Free the graph structure } // Helper to print the graph (for debugging) void printGraphList(GraphList* graph) { printf("Graph Adjacency Lists:\n"); for (int v = 0; v < graph->numVertices; ++v) { AdjListNode* temp = graph->adjLists[v]; printf("Vertex %d: ", v); while (temp) { printf("-> %d ", temp->dest); temp = temp->next; } printf("-> NULL\n"); } }
- Pros: Space efficient for sparse graphs O(V + E). Adding edges is efficient. Iterating over neighbors of a vertex
i
is efficient (O(degree(i))). Generally preferred for most applications. - Cons: Checking for a specific edge between
u
andv
can take time proportional to the degree ofu
.
Which to choose? Generally, Adjacency List is preferred unless the graph is very dense or you need constant-time edge existence checks frequently. We'll use Adjacency Lists for the traversal algorithms below.
Graph Traversal Algorithms
Traversal means systematically visiting all reachable vertices from a starting vertex. Because graphs can have cycles, we must keep track of visited vertices to avoid infinite loops.
The visited
Array: We typically use a boolean array visited[V]
, initialized to false
. Before visiting a vertex, we check if visited[vertex]
is true
. If not, we mark it true
and proceed.
1. Depth-First Search (DFS)
Concept: Explore as far down a path as possible before backtracking. If you hit a dead end or an already visited node, you backtrack and try another path. Uses a stack implicitly (via recursion) or explicitly.
Implementation (Recursive, Adjacency List):
#include <stdbool.h> // For bool type // Recursive DFS helper function void DFSUtil(GraphList* graph, int vertex, bool visited[]) { // Mark the current node as visited and print it visited[vertex] = true; printf("%d ", vertex); // Recur for all adjacent vertices AdjListNode* temp = graph->adjLists[vertex]; while (temp != NULL) { int adjacentVertex = temp->dest; if (!visited[adjacentVertex]) { DFSUtil(graph, adjacentVertex, visited); // Recursive call } temp = temp->next; } } // Main DFS function - handles disconnected graphs void DFS(GraphList* graph, int startVertex) { if (!graph || startVertex < 0 || startVertex >= graph->numVertices) { fprintf(stderr, "Invalid input for DFS\n"); return; } printf("DFS starting from vertex %d: ", startVertex); // Allocate and initialize visited array bool* visited = (bool*)calloc(graph->numVertices, sizeof(bool)); // calloc initializes to false if (!visited) { perror("Visited array alloc failed"); return; } // Call the recursive helper function for the starting vertex DFSUtil(graph, startVertex, visited); // Optional: If the graph might be disconnected, iterate through all vertices // to ensure all components are visited. // printf("\nChecking other components (if any):\n"); // for (int i = 0; i < graph->numVertices; i++) { // if (!visited[i]) { // printf("Component starting at %d: ", i); // DFSUtil(graph, i, visited); // printf("\n"); // } // } printf("\n"); free(visited); // Clean up visited array }
Applications: Finding paths, cycle detection, topological sorting (for Directed Acyclic Graphs - DAGs), finding connected components.
2. Breadth-First Search (BFS)
Concept: Explore vertex neighbours level by level. Visit the starting node, then all its direct neighbors, then all their neighbors, and so on. Uses a Queue.
Implementation (Iterative using Queue, Adjacency List): Requires a Queue implementation (similar to the one used in the Trees post, storing integer vertex indices).
#include <stdbool.h> // --- Minimal Queue Implementation for BFS (stores int vertex indices) --- // Replace with your actual Queue code from previous posts if available typedef struct IntQueueNode { int data; struct IntQueueNode* next; } IntQueueNode; typedef struct { IntQueueNode *front, *rear; } IntQueue; IntQueue* createIntQueue() { /* Implementation needed */ } void enqueueInt(IntQueue* q, int item) { /* Implementation needed */ } int dequeueInt(IntQueue* q) { /* Implementation needed - return -1 or error on empty */ } bool isIntQueueEmpty(IntQueue* q) { /* Implementation needed */ } void freeIntQueue(IntQueue* q) { /* Implementation needed */ } // --- End Minimal Queue --- // --- Placeholder Queue Implementation --- IntQueue* createIntQueue() { IntQueue* q = (IntQueue*)malloc(sizeof(IntQueue)); if (!q) return NULL; q->front = q->rear = NULL; return q; } bool isIntQueueEmpty(IntQueue* q) { return q->front == NULL; } void enqueueInt(IntQueue* q, int item) { IntQueueNode* temp = (IntQueueNode*)malloc(sizeof(IntQueueNode)); if (!temp) return; temp->data = item; temp->next = NULL; if (q->rear == NULL) { q->front = q->rear = temp; return; } q->rear->next = temp; q->rear = temp; } int dequeueInt(IntQueue* q) { if (isIntQueueEmpty(q)) return -1; // Error indicator IntQueueNode* temp = q->front; int item = temp->data; q->front = q->front->next; if (q->front == NULL) q->rear = NULL; free(temp); return item; } void freeIntQueue(IntQueue* q) { while (!isIntQueueEmpty(q)) { dequeueInt(q); } free(q); } // --- End Placeholder Queue Implementation --- // Main BFS function void BFS(GraphList* graph, int startVertex) { if (!graph || startVertex < 0 || startVertex >= graph->numVertices) { fprintf(stderr, "Invalid input for BFS\n"); return; } printf("BFS starting from vertex %d: ", startVertex); // Allocate and initialize visited array bool* visited = (bool*)calloc(graph->numVertices, sizeof(bool)); if (!visited) { perror("Visited array alloc failed"); return; } // Create a queue for BFS IntQueue* queue = createIntQueue(); if (!queue) { free(visited); perror("Queue creation failed"); return; } // Mark the start node as visited and enqueue it visited[startVertex] = true; enqueueInt(queue, startVertex); while (!isIntQueueEmpty(queue)) { // Dequeue a vertex and print it int currentVertex = dequeueInt(queue); if (currentVertex == -1) break; // Should not happen if logic is correct printf("%d ", currentVertex); // Get all adjacent vertices of the dequeued vertex AdjListNode* temp = graph->adjLists[currentVertex]; while (temp != NULL) { int adjVertex = temp->dest; // If an adjacent has not been visited, mark it visited and enqueue it if (!visited[adjVertex]) { visited[adjVertex] = true; enqueueInt(queue, adjVertex); } temp = temp->next; } } printf("\n"); freeIntQueue(queue); // Clean up queue free(visited); // Clean up visited array // BFS inherently explores only the connected component of startVertex. // If you need to explore disconnected graphs, call BFS multiple times // starting from unvisited nodes. }
Applications: Finding the shortest path in unweighted graphs, finding connected components, network broadcasting algorithms (like finding reachable nodes), web crawlers.
Beyond Traversal: Other Graph Algorithms
DFS and BFS are the building blocks for many more advanced graph algorithms, including:
- Shortest Path Algorithms:
- Dijkstra's Algorithm: Finds the shortest path between nodes in a weighted graph (with non-negative weights).
- Bellman-Ford Algorithm: Handles graphs with negative edge weights (and detects negative cycles).
- Minimum Spanning Tree (MST) Algorithms: Finds a subgraph that connects all vertices with the minimum possible total edge weight.
- Prim's Algorithm
- Kruskal's Algorithm
- Topological Sort: Linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every directed edge from vertex
u
to vertexv
,u
comes beforev
in the ordering. Used for scheduling tasks with dependencies.
These are typically covered in more advanced algorithm courses but rely on the graph representations and traversal concepts discussed here.
main
)
Example Usage (// (Include graph structures, create/add/free functions, DFS, BFS,
// and necessary Queue functions above)
int main() {
int V = 6; // Number of vertices
GraphList* graph = createGraphList(V);
if (!graph) return 1;
printf("Creating an undirected graph with %d vertices.\n", V);
addEdgeList(graph, 0, 1);
addEdgeList(graph, 0, 2);
addEdgeList(graph, 1, 3);
addEdgeList(graph, 1, 4);
addEdgeList(graph, 2, 4);
addEdgeList(graph, 3, 4);
addEdgeList(graph, 3, 5);
addEdgeList(graph, 4, 5);
// Print the adjacency list representation
printGraphList(graph);
printf("\n");
// Perform DFS starting from vertex 0
DFS(graph, 0);
// Perform BFS starting from vertex 0
BFS(graph, 0);
// Perform DFS starting from vertex 3
DFS(graph, 3);
// Perform BFS starting from vertex 3
BFS(graph, 3);
printf("\nFreeing the graph...\n");
freeGraphList(graph);
graph = NULL;
printf("Graph freed.\n");
return 0;
}
Conclusion
Graphs provide a powerful and flexible way to model complex relationships and networks in C. We've seen how to represent them using Adjacency Matrices and the more commonly used Adjacency Lists. Mastering the fundamental traversal algorithms, Depth-First Search (DFS) and Breadth-First Search (BFS), along with the crucial use of a visited
marker, unlocks the ability to explore graphs and solve many problems. These traversals are the foundation upon which more complex algorithms for shortest paths, minimum spanning trees, and topological sorting are built.
With graphs added to your C toolkit, you're equipped to tackle a wide range of interconnected problems! What's next in our intermediate journey? Perhaps a closer look at low-level Bitwise Operations, the power of Function Pointers, or robust Error Handling techniques in C.
Further Reading & Related Posts
- Intermediate C Concepts Part 4: All About Trees in C
- Intermediate C Concepts Part 3: Implementing Stacks and Queues in C (Stacks for DFS, Queues for BFS)
- Intermediate C Concepts Part 1: Dynamic Memory Allocation in C
- Getting Started with C Part 6: Pointers in C
- Getting Started with C Part 7: Structures and Unions in C