Published on

Graphs Implementation and Algorithms in C Intermediate C Concepts Part 5

Authors

We've journeyed through linear structures like linked lists, explored ADTs like stacks and queues, and navigated hierarchical trees. Now, we arrive at perhaps the most general and powerful data structure for modeling connections: Graphs. From social networks and road maps to computer networks and project dependencies, graphs are everywhere! This post introduces graph fundamentals, common representation methods in C, and essential traversal algorithms. Get ready to connect the dots!

Table of Contents

What is a Graph? Basic Terminology

A Graph G is formally defined as a pair G = (V, E), where:

  • V is a set of Vertices (also called Nodes). These represent the entities or items.
  • E is a set of Edges. These represent the connections or relationships between pairs of vertices.

Unlike trees, graphs don't inherently have a root or a strict parent-child hierarchy. They can also contain Cycles, where you can follow a path of edges and return to your starting vertex.

Key graph terminology:

  • Vertex (Node): An element in the set V.
  • Edge: A connection between two vertices in E.
  • Undirected Graph: Edges have no direction. If there's an edge between A and B, you can traverse from A to B and from B to A. (Think Facebook friendships).
  • Directed Graph (Digraph): Edges have a direction. An edge from A to B doesn't imply an edge from B to A. (Think Twitter follows or one-way streets).
  • Weighted Graph: Each edge has an associated numerical value (weight or cost). (Think road maps where edges have distances, or network connections with latency).
  • Unweighted Graph: Edges simply represent a connection, without an associated weight.
  • Adjacent Vertices: Two vertices are adjacent if there is an edge connecting them.
  • Degree (of a Vertex - Undirected): The number of edges connected to a vertex.
  • In-degree (Directed): The number of edges pointing into a vertex.
  • Out-degree (Directed): The number of edges pointing away from a vertex.
  • Path: A sequence of vertices where each adjacent pair is connected by an edge.
  • Cycle: A path that starts and ends at the same vertex and contains at least one edge. Trees are specifically acyclic graphs.
  • Connected Graph (Undirected): There is a path between any two vertices in the graph.
  • Disconnected Graph: A graph composed of multiple "subgraphs" (components) that are not connected to each other.

Representing Graphs in C

How do we store the vertices and edges in memory? There are two primary methods:

1. Adjacency Matrix

  • Concept: A 2D array adjMatrix[V][V], where V is the number of vertices (often indexed 0 to V-1).
    • For unweighted graphs: adjMatrix[i][j] = 1 if there's an edge from vertex i to j, else 0.
    • For weighted graphs: adjMatrix[i][j] = weight if there's an edge, else 0, infinity, or some special marker.
    • For undirected graphs, the matrix is symmetric (adjMatrix[i][j] == adjMatrix[j][i]).
  • Structure (Example for Unweighted):
    #include <stdio.h>
    #include <stdlib.h>
    
    typedef struct {
        int numVertices;
        int** adjMatrix; // Pointer to a dynamically allocated 2D array
    } GraphMatrix;
    
    // Function to create a graph represented by adjacency matrix
    GraphMatrix* createGraphMatrix(int vertices) {
        GraphMatrix* graph = (GraphMatrix*)malloc(sizeof(GraphMatrix));
        if (!graph) return NULL;
        graph->numVertices = vertices;
    
        // Allocate rows (array of pointers)
        graph->adjMatrix = (int**)malloc(vertices * sizeof(int*));
        if (!graph->adjMatrix) { free(graph); return NULL; }
    
        // Allocate columns for each row and initialize to 0
        for (int i = 0; i < vertices; i++) {
            graph->adjMatrix[i] = (int*)calloc(vertices, sizeof(int)); // calloc initializes to 0
            if (!graph->adjMatrix[i]) {
                // Cleanup allocated memory on failure
                while (--i >= 0) free(graph->adjMatrix[i]);
                free(graph->adjMatrix);
                free(graph);
                return NULL;
            }
        }
        return graph;
    }
    
    // Function to add an edge (undirected, unweighted)
    void addEdgeMatrix(GraphMatrix* graph, int src, int dest) {
        if (src >= 0 && src < graph->numVertices && dest >= 0 && dest < graph->numVertices) {
            graph->adjMatrix[src][dest] = 1;
            graph->adjMatrix[dest][src] = 1; // For undirected graph
        } else {
            fprintf(stderr, "Invalid vertex index for edge (%d, %d)\n", src, dest);
        }
    }
    
     // Function to free graph memory
    void freeGraphMatrix(GraphMatrix* graph) {
        if (!graph) return;
        for (int i = 0; i < graph->numVertices; i++) {
            free(graph->adjMatrix[i]); // Free columns
        }
        free(graph->adjMatrix); // Free rows
        free(graph); // Free graph structure
    }
    
  • Pros: Very fast O(1) lookup to check if an edge exists between two specific vertices. Simpler concept for dense graphs (where |E| is close to |V|^2).
  • Cons: Requires O(V^2) space, regardless of the number of edges. Very inefficient for sparse graphs (most real-world graphs). Adding/removing vertices is costly. Iterating over all neighbors of a vertex takes O(V) time.

2. Adjacency List

  • Concept: An array (or dynamic array/vector) of linked lists. The array index corresponds to a vertex (0 to V-1). adjList[i] points to the head of a linked list containing all vertices adjacent to vertex i.
  • Structure (Example for Unweighted):
    #include <stdio.h>
    #include <stdlib.h>
    
    // Node structure for the adjacency list
    typedef struct AdjListNode {
        int dest;                // Destination vertex index
        // For weighted graphs, add: int weight;
        struct AdjListNode* next; // Pointer to the next adjacent vertex
    } AdjListNode;
    
    // Structure to represent the graph itself
    typedef struct {
        int numVertices;
        AdjListNode** adjLists; // An array of pointers to AdjListNode (heads of lists)
        // Add bool is_directed; if needed
    } GraphList;
    
    // Function to create a new adjacency list node
    AdjListNode* newAdjListNode(int dest) {
        AdjListNode* newNode = (AdjListNode*)malloc(sizeof(AdjListNode));
         if (!newNode) { perror("Node alloc failed"); return NULL; }
        newNode->dest = dest;
        newNode->next = NULL;
        // Initialize weight if using weighted graph
        return newNode;
    }
    
    // Function to create a graph with V vertices
    GraphList* createGraphList(int vertices) {
        GraphList* graph = (GraphList*)malloc(sizeof(GraphList));
        if (!graph) { perror("Graph alloc failed"); return NULL; }
        graph->numVertices = vertices;
    
        // Create an array of adjacency lists (pointers)
        graph->adjLists = (AdjListNode**)malloc(vertices * sizeof(AdjListNode*));
        if (!graph->adjLists) { perror("AdjList array alloc failed"); free(graph); return NULL; }
    
    
        // Initialize each adjacency list as empty (NULL)
        for (int i = 0; i < vertices; ++i) {
            graph->adjLists[i] = NULL;
        }
    
        return graph;
    }
    
    // Function to add an edge to an undirected graph
    void addEdgeList(GraphList* graph, int src, int dest) {
         if (src < 0 || src >= graph->numVertices || dest < 0 || dest >= graph->numVertices) {
             fprintf(stderr, "Invalid vertex index for edge (%d, %d)\n", src, dest);
             return;
         }
        // Add an edge from src to dest. New node is added to the front of the list.
        AdjListNode* newNodeDest = newAdjListNode(dest);
         if (!newNodeDest) return; // Allocation check
        newNodeDest->next = graph->adjLists[src]; // Point new node to current head
        graph->adjLists[src] = newNodeDest;      // Update head
    
        // Since graph is undirected, add an edge from dest to src as well
        AdjListNode* newNodeSrc = newAdjListNode(src);
         if (!newNodeSrc) return; // Allocation check (tricky if first alloc succeeded!)
                                 // Consider a more robust error handling strategy
        newNodeSrc->next = graph->adjLists[dest];
        graph->adjLists[dest] = newNodeSrc;
    
        // For directed graph, only add the first part (src to dest)
    }
    
    // Function to free graph memory
    void freeGraphList(GraphList* graph) {
        if (!graph) return;
        for (int i = 0; i < graph->numVertices; ++i) {
            AdjListNode* current = graph->adjLists[i];
            AdjListNode* nextNode;
            while (current != NULL) {
                nextNode = current->next;
                free(current);
                current = nextNode;
            }
        }
        free(graph->adjLists); // Free the array of pointers
        free(graph);          // Free the graph structure
    }
    
    // Helper to print the graph (for debugging)
    void printGraphList(GraphList* graph) {
        printf("Graph Adjacency Lists:\n");
        for (int v = 0; v < graph->numVertices; ++v) {
            AdjListNode* temp = graph->adjLists[v];
            printf("Vertex %d: ", v);
            while (temp) {
                printf("-> %d ", temp->dest);
                temp = temp->next;
            }
            printf("-> NULL\n");
        }
    }
    
    
  • Pros: Space efficient for sparse graphs O(V + E). Adding edges is efficient. Iterating over neighbors of a vertex i is efficient (O(degree(i))). Generally preferred for most applications.
  • Cons: Checking for a specific edge between u and v can take time proportional to the degree of u.

Which to choose? Generally, Adjacency List is preferred unless the graph is very dense or you need constant-time edge existence checks frequently. We'll use Adjacency Lists for the traversal algorithms below.

Graph Traversal Algorithms

Traversal means systematically visiting all reachable vertices from a starting vertex. Because graphs can have cycles, we must keep track of visited vertices to avoid infinite loops.

The visited Array: We typically use a boolean array visited[V], initialized to false. Before visiting a vertex, we check if visited[vertex] is true. If not, we mark it true and proceed.

1. Depth-First Search (DFS)

  • Concept: Explore as far down a path as possible before backtracking. If you hit a dead end or an already visited node, you backtrack and try another path. Uses a stack implicitly (via recursion) or explicitly.

  • Implementation (Recursive, Adjacency List):

    #include <stdbool.h> // For bool type
    
    // Recursive DFS helper function
    void DFSUtil(GraphList* graph, int vertex, bool visited[]) {
        // Mark the current node as visited and print it
        visited[vertex] = true;
        printf("%d ", vertex);
    
        // Recur for all adjacent vertices
        AdjListNode* temp = graph->adjLists[vertex];
        while (temp != NULL) {
            int adjacentVertex = temp->dest;
            if (!visited[adjacentVertex]) {
                DFSUtil(graph, adjacentVertex, visited); // Recursive call
            }
            temp = temp->next;
        }
    }
    
    // Main DFS function - handles disconnected graphs
    void DFS(GraphList* graph, int startVertex) {
        if (!graph || startVertex < 0 || startVertex >= graph->numVertices) {
             fprintf(stderr, "Invalid input for DFS\n");
             return;
         }
        printf("DFS starting from vertex %d: ", startVertex);
    
        // Allocate and initialize visited array
        bool* visited = (bool*)calloc(graph->numVertices, sizeof(bool)); // calloc initializes to false
        if (!visited) { perror("Visited array alloc failed"); return; }
    
    
        // Call the recursive helper function for the starting vertex
        DFSUtil(graph, startVertex, visited);
    
        // Optional: If the graph might be disconnected, iterate through all vertices
        // to ensure all components are visited.
        // printf("\nChecking other components (if any):\n");
        // for (int i = 0; i < graph->numVertices; i++) {
        //     if (!visited[i]) {
        //         printf("Component starting at %d: ", i);
        //         DFSUtil(graph, i, visited);
        //         printf("\n");
        //     }
        // }
    
        printf("\n");
        free(visited); // Clean up visited array
    }
    
  • Applications: Finding paths, cycle detection, topological sorting (for Directed Acyclic Graphs - DAGs), finding connected components.

2. Breadth-First Search (BFS)

  • Concept: Explore vertex neighbours level by level. Visit the starting node, then all its direct neighbors, then all their neighbors, and so on. Uses a Queue.

  • Implementation (Iterative using Queue, Adjacency List): Requires a Queue implementation (similar to the one used in the Trees post, storing integer vertex indices).

    #include <stdbool.h>
    
    // --- Minimal Queue Implementation for BFS (stores int vertex indices) ---
    // Replace with your actual Queue code from previous posts if available
    typedef struct IntQueueNode {
        int data;
        struct IntQueueNode* next;
    } IntQueueNode;
    
    typedef struct {
        IntQueueNode *front, *rear;
    } IntQueue;
    
    IntQueue* createIntQueue() { /* Implementation needed */ }
    void enqueueInt(IntQueue* q, int item) { /* Implementation needed */ }
    int dequeueInt(IntQueue* q) { /* Implementation needed - return -1 or error on empty */ }
    bool isIntQueueEmpty(IntQueue* q) { /* Implementation needed */ }
    void freeIntQueue(IntQueue* q) { /* Implementation needed */ }
    // --- End Minimal Queue ---
    
    // --- Placeholder Queue Implementation ---
    IntQueue* createIntQueue() {
        IntQueue* q = (IntQueue*)malloc(sizeof(IntQueue));
         if (!q) return NULL;
        q->front = q->rear = NULL;
        return q;
    }
     bool isIntQueueEmpty(IntQueue* q) {
        return q->front == NULL;
    }
     void enqueueInt(IntQueue* q, int item) {
        IntQueueNode* temp = (IntQueueNode*)malloc(sizeof(IntQueueNode));
        if (!temp) return;
        temp->data = item;
        temp->next = NULL;
        if (q->rear == NULL) { q->front = q->rear = temp; return; }
        q->rear->next = temp;
        q->rear = temp;
    }
     int dequeueInt(IntQueue* q) {
        if (isIntQueueEmpty(q)) return -1; // Error indicator
        IntQueueNode* temp = q->front;
        int item = temp->data;
        q->front = q->front->next;
        if (q->front == NULL) q->rear = NULL;
        free(temp);
        return item;
    }
    void freeIntQueue(IntQueue* q) {
        while (!isIntQueueEmpty(q)) { dequeueInt(q); }
        free(q);
    }
     // --- End Placeholder Queue Implementation ---
    
    
    // Main BFS function
    void BFS(GraphList* graph, int startVertex) {
         if (!graph || startVertex < 0 || startVertex >= graph->numVertices) {
             fprintf(stderr, "Invalid input for BFS\n");
             return;
         }
        printf("BFS starting from vertex %d: ", startVertex);
    
        // Allocate and initialize visited array
        bool* visited = (bool*)calloc(graph->numVertices, sizeof(bool));
        if (!visited) { perror("Visited array alloc failed"); return; }
    
    
        // Create a queue for BFS
        IntQueue* queue = createIntQueue();
        if (!queue) { free(visited); perror("Queue creation failed"); return; }
    
        // Mark the start node as visited and enqueue it
        visited[startVertex] = true;
        enqueueInt(queue, startVertex);
    
        while (!isIntQueueEmpty(queue)) {
            // Dequeue a vertex and print it
            int currentVertex = dequeueInt(queue);
            if (currentVertex == -1) break; // Should not happen if logic is correct
            printf("%d ", currentVertex);
    
            // Get all adjacent vertices of the dequeued vertex
            AdjListNode* temp = graph->adjLists[currentVertex];
            while (temp != NULL) {
                int adjVertex = temp->dest;
                // If an adjacent has not been visited, mark it visited and enqueue it
                if (!visited[adjVertex]) {
                    visited[adjVertex] = true;
                    enqueueInt(queue, adjVertex);
                }
                temp = temp->next;
            }
        }
    
        printf("\n");
        freeIntQueue(queue); // Clean up queue
        free(visited);      // Clean up visited array
    
        // BFS inherently explores only the connected component of startVertex.
        // If you need to explore disconnected graphs, call BFS multiple times
        // starting from unvisited nodes.
    }
    
  • Applications: Finding the shortest path in unweighted graphs, finding connected components, network broadcasting algorithms (like finding reachable nodes), web crawlers.

Beyond Traversal: Other Graph Algorithms

DFS and BFS are the building blocks for many more advanced graph algorithms, including:

  • Shortest Path Algorithms:
    • Dijkstra's Algorithm: Finds the shortest path between nodes in a weighted graph (with non-negative weights).
    • Bellman-Ford Algorithm: Handles graphs with negative edge weights (and detects negative cycles).
  • Minimum Spanning Tree (MST) Algorithms: Finds a subgraph that connects all vertices with the minimum possible total edge weight.
    • Prim's Algorithm
    • Kruskal's Algorithm
  • Topological Sort: Linear ordering of vertices in a Directed Acyclic Graph (DAG) such that for every directed edge from vertex u to vertex v, u comes before v in the ordering. Used for scheduling tasks with dependencies.

These are typically covered in more advanced algorithm courses but rely on the graph representations and traversal concepts discussed here.

Example Usage (main)

// (Include graph structures, create/add/free functions, DFS, BFS,
// and necessary Queue functions above)

int main() {
    int V = 6; // Number of vertices
    GraphList* graph = createGraphList(V);
    if (!graph) return 1;

    printf("Creating an undirected graph with %d vertices.\n", V);
    addEdgeList(graph, 0, 1);
    addEdgeList(graph, 0, 2);
    addEdgeList(graph, 1, 3);
    addEdgeList(graph, 1, 4);
    addEdgeList(graph, 2, 4);
    addEdgeList(graph, 3, 4);
    addEdgeList(graph, 3, 5);
    addEdgeList(graph, 4, 5);

    // Print the adjacency list representation
    printGraphList(graph);
    printf("\n");

    // Perform DFS starting from vertex 0
    DFS(graph, 0);

    // Perform BFS starting from vertex 0
    BFS(graph, 0);

    // Perform DFS starting from vertex 3
    DFS(graph, 3);

    // Perform BFS starting from vertex 3
    BFS(graph, 3);


    printf("\nFreeing the graph...\n");
    freeGraphList(graph);
    graph = NULL;
    printf("Graph freed.\n");

    return 0;
}

Conclusion

Graphs provide a powerful and flexible way to model complex relationships and networks in C. We've seen how to represent them using Adjacency Matrices and the more commonly used Adjacency Lists. Mastering the fundamental traversal algorithms, Depth-First Search (DFS) and Breadth-First Search (BFS), along with the crucial use of a visited marker, unlocks the ability to explore graphs and solve many problems. These traversals are the foundation upon which more complex algorithms for shortest paths, minimum spanning trees, and topological sorting are built.

With graphs added to your C toolkit, you're equipped to tackle a wide range of interconnected problems! What's next in our intermediate journey? Perhaps a closer look at low-level Bitwise Operations, the power of Function Pointers, or robust Error Handling techniques in C.