Multi-threading with POSIX Threads (pthreads)

Part 1 of 6: Advanced C Topics

Ready to take your C programming skills to the next level? Learn how to make your programs do multiple things simultaneously, boosting performance and responsiveness, especially on multi-core processors. This guide walks you through the essentials of using POSIX Threads (pthreads) in your C applications.

Table of Contents

What is Multi-threading?
Introduction to POSIX Threads (pthreads)
Creating Threads with pthread_create
Waiting for Threads with pthread_join
Synchronization: The Problem of Shared Resources
Using Mutexes (pthread_mutex_t)
Other Synchronization Primitives
Potential Pitfalls and Considerations
Conclusion

What is Multi-threading?

Imagine your program as a single worker tackling a large project step-by-step. Multi-threading is like hiring multiple workers (threads) who can work on different parts of the project concurrently within the same workspace (your program's process). Each thread has its own flow of execution but shares the same memory space (global variables, heap memory) with other threads in the process.

Benefits of Multi-threading:

Parallelism: On multi-core CPUs, different threads can run truly simultaneously on different cores, significantly speeding up CPU-bound tasks.
Responsiveness: In applications like GUI programs or servers, one thread can handle user interaction or network requests while other threads perform background tasks, preventing the application from freezing.
Resource Sharing: Threads within the same process share memory and resources, making communication between them relatively efficient compared to inter-process communication (IPC).
Efficiency: Creating and switching between threads is generally less resource-intensive than creating and managing separate processes.

Introduction to POSIX Threads (pthreads)

POSIX Threads, commonly known as pthreads, is a standardized C language programming interface (API) for creating and managing threads. It's widely available on Unix-like operating systems (Linux, macOS, Solaris, etc.).

To use pthreads, you need to:

Include the header file:
```
#include <pthread.h>
```

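Link the pthreads library: When compiling, you usually need to add the -pthread or -lpthread flag. With GCC and Clang, -pthread is generally preferred because it sets the required preprocessor definitions as well as linking the library: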

gcc your_program.c -o your_program -lpthread
# or
gcc your_program.c -o your_program -pthread

Creating Threads with `pthread_create`

The core function for creating a new thread is pthread_create.

#include <pthread.h>

int pthread_create(pthread_t *thread,
                   const pthread_attr_t *attr,
                   void *(*start_routine) (void *),
                   void *arg);

Let's break down the arguments:

pthread_t *thread: A pointer to a pthread_t variable. The ID of the newly created thread will be stored here upon successful creation.
const pthread_attr_t *attr: Pointer to thread attributes (e.g., stack size, scheduling policy). Passing NULL uses default attributes, which is common for basic usage.
void *(*start_routine) (void *): This is the crucial part – a pointer to the function the new thread will execute. This function must take a void * as an argument and return a void *.
void *arg: The argument to be passed to the start_routine function. If you need to pass multiple arguments, you'll typically wrap them in a struct.

Return Value: pthread_create returns 0 on success and an error number on failure. The error is reported through the return value itself, not through errno.

Simple Example:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h> // For sleep()

// Function that will be executed by the new thread
void *print_message_function(void *ptr) {
    char *message;
    message = (char *) ptr;
    printf("%s \n", message);
    sleep(1); // Simulate some work
    printf("Thread finished its work.\n");
    return NULL; // Thread exits
}

int main() {
    pthread_t thread1; // Thread identifier
    const char *message1 = "Hello from Thread 1!";
    int iret1;

    printf("Main: Creating Thread 1...\n");
    // Create the first thread, passing message1 as argument
    iret1 = pthread_create(&thread1, NULL, print_message_function, (void*) message1);

    if(iret1) {
        fprintf(stderr, "Error - pthread_create() return code: %d\n", iret1);
        exit(EXIT_FAILURE);
    }

    printf("Main: Thread 1 created successfully.\n");

    // Main thread continues executing...
    printf("Main: Doing some other work...\n");
    sleep(2); // Let the thread run for a bit

    printf("Main: Program finished.\n");
    // sleep() is only a guess at how long the thread needs; it does not
    // guarantee the thread is done. pthread_join is the reliable way to wait.

    // Calling exit() or returning from main here would end the whole process,
    // including any thread that is still running.
    pthread_exit(NULL); // Ends only the main thread; the process exits once
                        // the remaining threads have finished.
}

Waiting for Threads with `pthread_join`

Often, the main thread needs to wait for other threads to complete their execution before proceeding (e.g., to collect results or ensure cleanup). This is done using pthread_join.

#include <pthread.h>

int pthread_join(pthread_t thread, void **retval);

pthread_t thread: The ID of the thread to wait for.
void **retval: A pointer to a void * variable. If it is not NULL, the void * that the joined thread returned (by returning from its start routine or by calling pthread_exit) is copied into that variable. If you don't care about the return value, pass NULL.

Return Value: pthread_join returns 0 on success and an error number on failure. A common error is ESRCH if no thread with the given ID exists.

Example using pthread_join:

Let's modify the previous example to wait for the thread.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

void *print_message_function(void *ptr) {
    char *message;
    message = (char *) ptr;
    printf("Thread: Received message: %s \n", message);
    sleep(1);
    printf("Thread: Work finished.\n");
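    // Example of returning a value: a small integer is passed through the
    // void * itself. The integer/pointer conversion is implementation-defined;
    // for larger results, return a pointer to heap-allocated data instead.
    long thread_result = 42;
    return (void*) thread_result;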
}

int main() {
    pthread_t thread1;
    const char *message1 = "Work for Thread 1";
    int iret1;
    void *thread_return_value;

    printf("Main: Creating Thread 1...\n");
    iret1 = pthread_create(&thread1, NULL, print_message_function, (void*) message1);
    if(iret1) {
        fprintf(stderr, "Error - pthread_create() return code: %d\n", iret1);
        exit(EXIT_FAILURE);
    }
    // pthread_t is an opaque type; printing it as an integer works on Linux
    // but is not portable.
    printf("Main: Thread 1 created. ID: %lu\n", (unsigned long)thread1);

    // *** Wait for thread1 to complete ***
    printf("Main: Waiting for Thread 1 to finish...\n");
    iret1 = pthread_join(thread1, &thread_return_value);
    if(iret1) {
        fprintf(stderr, "Error - pthread_join() return code: %d\n", iret1);
        exit(EXIT_FAILURE);
    }

    printf("Main: Thread 1 finished and joined.\n");
    printf("Main: Thread 1 returned value: %ld\n", (long)thread_return_value);

    printf("Main: Program finished successfully.\n");
    exit(EXIT_SUCCESS);
}

Synchronization: The Problem of Shared Resources

When multiple threads access and modify shared data concurrently, you can run into problems called race conditions. Imagine two threads trying to increment the same global counter:

Thread A reads the counter value (e.g., 5).
Thread B reads the counter value (e.g., 5).
Thread A calculates the new value (5 + 1 = 6).
Thread B calculates the new value (5 + 1 = 6).
Thread A writes the new value (6) back to the counter.
Thread B writes the new value (6) back to the counter.

Even though the counter was incremented twice, the final value is 6, not the expected 7: one update has been lost. This happens because the read-modify-write operation is not atomic (indivisible), so the steps of the two threads can interleave.

To prevent race conditions, we need synchronization mechanisms. The most common one is the mutex (Mutual Exclusion).

Using Mutexes (`pthread_mutex_t`)

A mutex acts like a lock. Only one thread can "hold" the mutex at any given time. If a thread wants to access a shared resource, it must first acquire the mutex lock. If the lock is already held by another thread, the requesting thread will block (wait) until the lock is released.

Key Mutex Functions:

Initialization:

pthread_mutex_t my_mutex;
int ret = pthread_mutex_init(&my_mutex, NULL); // NULL for default attributes
// Or static initialization:
// pthread_mutex_t my_mutex = PTHREAD_MUTEX_INITIALIZER;

Locking:

int ret = pthread_mutex_lock(&my_mutex); // Blocks if mutex is locked
// Access shared resource here...

Unlocking:

int ret = pthread_mutex_unlock(&my_mutex); // Releases the lock

Destroying: (Release resources associated with the mutex)

int ret = pthread_mutex_destroy(&my_mutex); // Should be done when mutex is no longer needed

Example: Protecting a Shared Counter

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

#define NUM_THREADS 5
#define ITERATIONS 1000000

long long shared_counter = 0; // The shared resource
pthread_mutex_t counter_mutex; // Mutex to protect the counter

void *increment_counter(void *arg) {
    int thread_id = *((int*)arg); // Get thread ID passed as argument
    printf("Thread %d starting...\n", thread_id);

    for (int i = 0; i < ITERATIONS; ++i) {
        // --- Critical Section Start ---
        pthread_mutex_lock(&counter_mutex);

        shared_counter++; // Access shared resource safely

        pthread_mutex_unlock(&counter_mutex);
        // --- Critical Section End ---
    }

    printf("Thread %d finished.\n", thread_id);
    return NULL;
}

int main() {
    pthread_t threads[NUM_THREADS];
    int thread_ids[NUM_THREADS];
    int ret;

    // Initialize the mutex
    ret = pthread_mutex_init(&counter_mutex, NULL);
    if (ret != 0) {
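        // pthread functions return the error number instead of setting errno,
        // so perror() would report the wrong error here.
        fprintf(stderr, "Mutex initialization failed: %d\n", ret);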
        exit(EXIT_FAILURE);
    }
    printf("Mutex initialized.\n");

    printf("Creating %d threads...\n", NUM_THREADS);
    for (int i = 0; i < NUM_THREADS; ++i) {
        thread_ids[i] = i + 1; // Assign unique ID (1 to NUM_THREADS)
        ret = pthread_create(&threads[i], NULL, increment_counter, &thread_ids[i]);
        if (ret) {
            fprintf(stderr, "Error creating thread %d: %d\n", i + 1, ret);
            exit(EXIT_FAILURE);
        }
    }

    printf("Waiting for threads to complete...\n");
    for (int i = 0; i < NUM_THREADS; ++i) {
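        ret = pthread_join(threads[i], NULL);
        if (ret) {
            fprintf(stderr, "Error joining thread %d: %d\n", i + 1, ret);
            exit(EXIT_FAILURE);
        }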
    }

    // Destroy the mutex
    pthread_mutex_destroy(&counter_mutex);
    printf("Mutex destroyed.\n");

    // Calculate expected value
    long long expected_value = (long long)NUM_THREADS * ITERATIONS;

    printf("\nAll threads finished.\n");
    printf("Final counter value: %lld\n", shared_counter);
    printf("Expected counter value: %lld\n", expected_value);

    if (shared_counter == expected_value) {
        printf("Success! The counter value is correct.\n");
    } else {
        printf("Error! Race condition likely occurred (or other issue).\n");
        printf("Difference: %lld\n", expected_value - shared_counter);
    }


    exit(EXIT_SUCCESS);
}

Compile and Run:

gcc multithread_counter.c -o multithread_counter -lpthread
./multithread_counter

Try running the counter example without the pthread_mutex_lock and pthread_mutex_unlock calls. You'll likely see that the final shared_counter value is less than the expected value due to race conditions.

Other Synchronization Primitives

While mutexes are fundamental, pthreads offers other synchronization tools for more complex scenarios:

Condition Variables (pthread_cond_t): Allow threads to wait efficiently for a specific condition to become true. Used in conjunction with mutexes. Key functions: pthread_cond_wait, pthread_cond_signal, pthread_cond_broadcast. Useful for producer-consumer problems.
Semaphores (semaphore.h): Although not strictly part of the core pthreads API, POSIX semaphores are often used with threads for controlling access to a resource pool with multiple units. Key functions: sem_init, sem_wait, sem_post, sem_destroy. Requires linking with -lpthread or sometimes -lrt.
Read-Write Locks (pthread_rwlock_t): Allow multiple threads to read a resource concurrently but require exclusive access for writing. Can improve performance if reads are much more frequent than writes. Key functions: pthread_rwlock_rdlock, pthread_rwlock_wrlock, pthread_rwlock_unlock.

Potential Pitfalls and Considerations

Multi-threaded programming is powerful but introduces complexity:

Deadlocks: Occur when two or more threads are blocked forever, each waiting for a resource held by the other. Example: Thread A locks Mutex 1, then tries to lock Mutex 2. Thread B locks Mutex 2, then tries to lock Mutex 1. Careful lock ordering is crucial.
Complexity: Debugging multi-threaded programs is significantly harder than single-threaded ones due to non-deterministic execution order. Race conditions and deadlocks might only appear under specific timing conditions. GDB (with thread support) and dynamic analysis tools such as Valgrind's Helgrind and ThreadSanitizer (the -fsanitize=thread option in GCC and Clang) are invaluable.
Overhead: Creating and synchronizing threads has overhead. For very simple tasks, the overhead might outweigh the benefits of parallelism.
False Sharing: On multi-core systems with caches, if threads on different cores frequently modify variables that happen to reside on the same cache line, it can cause excessive cache invalidations, hurting performance even if the variables themselves aren't directly shared logically.

Conclusion

Multi-threading with POSIX Threads opens up possibilities for creating high-performance, responsive C applications. By understanding thread creation (pthread_create), waiting (pthread_join), and synchronization using mutexes (pthread_mutex_t), you can harness the power of concurrency.

Remember that with great power comes great responsibility. Writing correct, efficient, and robust multi-threaded code requires careful design, attention to synchronization details, and thorough testing to avoid pitfalls like race conditions and deadlocks. This introduction provides a solid foundation for exploring more advanced threading concepts and building powerful concurrent applications in C.
