Thread Synchronization: Mutexes and Locks

Master C++ thread synchronization — learn std::mutex, lock_guard, unique_lock, shared_mutex, deadlock prevention, and RAII locking patterns for safe concurrent code.

A mutex (mutual exclusion) in C++ is a synchronization primitive that prevents multiple threads from simultaneously accessing shared data. std::mutex provides lock() and unlock() methods, but in practice you should always use RAII wrappers: std::lock_guard (scoped, non-transferable lock), std::unique_lock (flexible, movable lock with try-lock and timed-lock support), or std::scoped_lock (C++17, locks multiple mutexes atomically). These wrappers guarantee the mutex is released even when exceptions occur.

Introduction

In the previous article, you saw how data races occur: when two threads simultaneously access the same memory location and at least one is writing, the result is undefined behavior. The counter that should reach 1,000,000 ends up at 724,531. A data structure that should contain 1,000 elements ends up corrupted. A program that should run correctly crashes or produces nonsense — and the bug may only manifest occasionally, making it extraordinarily difficult to reproduce and debug.

The fundamental solution to data races is mutual exclusion: ensuring that only one thread at a time can access a particular piece of shared data. C++ provides this through mutexes — the name is simply a contraction of "mutual exclusion." A mutex is a lock. Before accessing shared data, a thread acquires (locks) the mutex. After it is done, it releases (unlocks) it. Any other thread that tries to acquire the mutex while it is held will block — wait — until the holder releases it.

But raw mutex lock/unlock calls have the same problem as raw memory management: you can forget to unlock, you can unlock in the wrong order, and exceptions can bypass your unlock calls entirely. C++ solves this with RAII lock wrappers — objects whose constructors lock a mutex and whose destructors unlock it, guaranteeing correct behavior regardless of how the scope exits.

This article teaches you everything you need to know about mutexes and locks in C++. You will understand why raw lock/unlock is dangerous, master the RAII lock types (lock_guard, unique_lock, scoped_lock), learn to detect and prevent deadlocks, understand reader-writer locking with shared_mutex, and see how to design thread-safe classes correctly.

The Problem: Unsafe Shared State

Let’s revisit the unsafe counter from the previous article and see exactly why it fails, then fix it step by step.

C++
#include <iostream>
#include <thread>
#include <vector>
using namespace std;

// Step 1: The broken version — data race
int counter = 0;

void incrementBroken(int times) {
    for (int i = 0; i < times; i++) {
        counter++;  // NOT SAFE: read-modify-write is not atomic
    }
}

// Step 2: Manual mutex — better but still dangerous
#include <mutex>
mutex mtx;

void incrementManual(int times) {
    for (int i = 0; i < times; i++) {
        mtx.lock();     // Acquire the mutex
        counter++;      // Protected: only one thread at a time
        mtx.unlock();   // Release the mutex
    }
}

// What if incrementManual throws between lock and unlock?
// The mutex is never released — permanent deadlock for all other threads.

int main() {
    const int numThreads = 10;
    const int timesEach  = 100000;
    const int expected   = numThreads * timesEach;

    // --- Broken version ---
    counter = 0;
    {
        vector<thread> threads;
        for (int i = 0; i < numThreads; i++)
            threads.emplace_back(incrementBroken, timesEach);
        for (auto& t : threads) t.join();
    }
    cout << "Broken: " << counter << " (expected " << expected << ")" << endl;

    // --- Manual mutex version ---
    counter = 0;
    {
        vector<thread> threads;
        for (int i = 0; i < numThreads; i++)
            threads.emplace_back(incrementManual, timesEach);
        for (auto& t : threads) t.join();
    }
    cout << "Manual: " << counter << " (expected " << expected << ")" << endl;

    return 0;
}

Output (the broken count varies from run to run):

Plaintext
Broken: 731204 (expected 1000000)
Manual: 1000000 (expected 1000000)

Step-by-step explanation:

  1. counter++ without a mutex is a data race. The read-modify-write sequence is not atomic — two threads can read the same value, both add 1, and both write back the same incremented value, losing one update.
  2. mtx.lock() and mtx.unlock() fix the race — only one thread can be inside the critical section at a time. All other threads block at mtx.lock() until the holder calls mtx.unlock().
  3. But manual lock/unlock has a critical flaw: if anything between lock() and unlock() throws an exception, unlock() is never called. Every subsequent call to lock() will block forever — a deadlock. In this simple example that risk is low, but in real code with complex logic in the critical section, it is very real.
  4. The RAII lock wrappers in the next section solve this problem the same way unique_ptr solves memory management: put the cleanup in a destructor so it always happens.

std::lock_guard: The Simplest RAII Lock

std::lock_guard<Mutex> is the simplest RAII lock. Its constructor calls mutex.lock() and its destructor calls mutex.unlock(). It cannot be copied or moved — it is tied to its scope.

C++
#include <iostream>
#include <thread>
#include <mutex>
#include <vector>
using namespace std;

mutex mtx;
int counter = 0;

void incrementSafe(int times) {
    for (int i = 0; i < times; i++) {
        lock_guard<mutex> guard(mtx);  // Locks mtx in constructor
        counter++;                      // Protected critical section
        // guard's destructor unlocks mtx — ALWAYS, even on exception
    }
}

// Safe function with complex logic and early returns
void processData(vector<int>& shared, int value) {
    lock_guard<mutex> guard(mtx);  // Lock once on entry

    if (shared.empty()) {
        cout << "No data to process" << endl;
        return;  // guard destructor runs — mutex unlocked
    }

    shared.push_back(value);

    if (value < 0) {
        cout << "Invalid value, aborting" << endl;
        return;  // guard destructor runs — mutex unlocked
    }

    cout << "Added " << value << ", size=" << shared.size() << endl;
    // Guard destructs here too — mutex unlocked
}

int main() {
    const int numThreads = 10;
    const int timesEach  = 100000;

    counter = 0;
    vector<thread> threads;
    for (int i = 0; i < numThreads; i++)
        threads.emplace_back(incrementSafe, timesEach);
    for (auto& t : threads) t.join();

    cout << "Counter: " << counter
         << " (expected " << numThreads * timesEach << ")" << endl;

    // processData with multiple exit paths — all safe
    vector<int> data;
    processData(data, 10);    // Empty: won't add
    data.push_back(0);        // Seed it
    processData(data, 42);    // Normal add
    processData(data, -1);    // Invalid value

    return 0;
}

Output:

Plaintext
Counter: 1000000 (expected 1000000)
No data to process
Added 42, size=2
Invalid value, aborting

Step-by-step explanation:

  1. lock_guard<mutex> guard(mtx) constructs a lock guard that immediately calls mtx.lock(). The mutex stays locked for the entire lifetime of guard.
  2. When guard goes out of scope — whether at the end of the for loop body, at an early return, or because an exception is thrown — its destructor calls mtx.unlock(). No matter how the scope exits, the mutex is released.
  3. In processData, three different exit paths (the early return for empty data, the early return for negative values, and the normal fall-through at the end) all safely unlock the mutex. Without RAII, each return would need a manual mtx.unlock() call.
  4. The critical section should be as small as possible. Holding a mutex for a long time means other threads are blocked for a long time. In incrementSafe, only counter++ is inside the lock — not the loop machinery.
  5. lock_guard cannot be unlocked early. If you need to lock and unlock within a function at specific points, use unique_lock instead.

std::unique_lock: Flexible Locking

std::unique_lock<Mutex> is a more powerful but slightly heavier RAII lock. It supports everything lock_guard does, plus: deferred locking, try-locking, timed locking, explicit unlock before the destructor runs, and movability (which condition variables rely on).

C++
#include <iostream>
#include <thread>
#include <mutex>
#include <chrono>
#include <vector>
using namespace std;

timed_mutex dataMtx;  // timed_mutex: provides try_lock_for, needed for timed locking
vector<int> sharedData;

// Deferred locking: construct the lock without locking immediately
void deferredLockDemo() {
    unique_lock<timed_mutex> lock(dataMtx, defer_lock);  // Not locked yet

    // Do some work without the lock
    int value = 42;  // Computation that doesn't need the lock

    lock.lock();  // Lock explicitly when needed
    sharedData.push_back(value);
    lock.unlock();  // Unlock explicitly

    // Do more work without the lock
    cout << "Deferred: added " << value << endl;

    // Lock again for another critical section
    lock.lock();
    sharedData.push_back(value * 2);
    // lock destructor will unlock at end of scope
}

// Try-locking: don't block, return false if can't acquire
void tryLockDemo(int id) {
    unique_lock<timed_mutex> lock(dataMtx, try_to_lock);

    if (lock.owns_lock()) {
        // We got the lock
        sharedData.push_back(id);
        cout << "Thread " << id << ": acquired lock, added value" << endl;
        this_thread::sleep_for(chrono::milliseconds(50));
    } else {
        // Couldn't get it — do something else
        cout << "Thread " << id << ": couldn't acquire lock, skipping" << endl;
    }
}

// Timed locking: wait at most N milliseconds
void timedLockDemo(int id) {
    unique_lock<timed_mutex> lock(dataMtx, chrono::milliseconds(100));

    if (lock.owns_lock()) {
        cout << "Thread " << id << ": acquired lock within timeout" << endl;
        sharedData.push_back(id * 10);
    } else {
        cout << "Thread " << id << ": timed out waiting for lock" << endl;
    }
}

int main() {
    cout << "=== Deferred lock ===" << endl;
    deferredLockDemo();
    cout << "Data size: " << sharedData.size() << "\n\n";

    cout << "=== Try-lock (two threads competing) ===" << endl;
    {
        thread t1(tryLockDemo, 1);
        thread t2(tryLockDemo, 2);
        t1.join(); t2.join();
    }
    cout << "Data size: " << sharedData.size() << "\n\n";

    cout << "=== Timed lock ===" << endl;
    // Pre-lock to make one thread time out
    dataMtx.lock();
    {
        thread t3(timedLockDemo, 3);  // Will time out — mutex is held
        this_thread::sleep_for(chrono::milliseconds(200));
        dataMtx.unlock();  // Release so t3 can proceed
        thread t4(timedLockDemo, 4);  // Will succeed
        t3.join(); t4.join();
    }

    return 0;
}

Output (which thread wins the try-lock may vary):

Plaintext
=== Deferred lock ===
Deferred: added 42
Data size: 2

=== Try-lock (two threads competing) ===
Thread 1: acquired lock, added value
Thread 2: couldn't acquire lock, skipping
Data size: 3

=== Timed lock ===
Thread 3: timed out waiting for lock
Thread 4: acquired lock within timeout

Step-by-step explanation:

  1. unique_lock<mutex> lock(dataMtx, defer_lock) constructs the lock without locking. The mutex is not acquired until lock.lock() is called explicitly. This is useful when you need a lock object for a condition variable (see next article) or when the locking decision depends on runtime conditions.
  2. unique_lock<mutex> lock(dataMtx, try_to_lock) attempts to acquire the mutex without blocking. lock.owns_lock() returns true if the acquisition succeeded, false if another thread holds it. Use this when you want to do useful work even if the lock is unavailable — non-blocking progress.
  3. unique_lock<mutex_type> lock(mtx, chrono::milliseconds(100)) waits up to 100ms to acquire the mutex by calling try_lock_for. Only std::timed_mutex (and its variants such as recursive_timed_mutex and shared_timed_mutex) provide try_lock_for — a plain std::mutex does not compile with this constructor, which is why timed locking needs a timed mutex. If acquisition fails within the timeout, lock.owns_lock() returns false.
  4. lock.unlock() inside deferredLockDemo explicitly releases the mutex before the destructor, allowing other threads to proceed sooner. The destructor checks owns_lock() and only unlocks if still held.
  5. The key rule: use lock_guard for simple, scoped locking where you just need to protect a region. Use unique_lock when you need deferred locking, try-locking, timed locking, early unlock, or interoperability with condition variables.

std::scoped_lock: Locking Multiple Mutexes Safely (C++17)

When you need to hold multiple mutexes at the same time, the order in which you acquire them matters critically. If Thread A locks mutex1 then mutex2, and Thread B locks mutex2 then mutex1, they can deadlock — each waiting for the other to release.

std::scoped_lock (C++17) solves this by acquiring all the mutexes with a deadlock-avoidance algorithm (the same one std::lock uses), so no combination of acquisition orders across threads can deadlock.

C++
#include <iostream>
#include <thread>
#include <mutex>
using namespace std;

struct BankAccount {
    string owner;
    double balance;
    mutable mutex mtx;  // mutable: allows locking in const member functions

    BankAccount(string name, double amount)
        : owner(name), balance(amount) {}
};

// UNSAFE: can deadlock if two transfers happen simultaneously in opposite directions
void transferUnsafe(BankAccount& from, BankAccount& to, double amount) {
    lock_guard<mutex> lockFrom(from.mtx);  // Lock from first
    lock_guard<mutex> lockTo(to.mtx);      // Lock to second
    // If another thread does transfer(to, from, ...) simultaneously:
    // Thread A holds from.mtx, waits for to.mtx
    // Thread B holds to.mtx, waits for from.mtx  -> DEADLOCK
    from.balance -= amount;
    to.balance   += amount;
}

// SAFE with scoped_lock: acquires both mutexes atomically, deadlock-free
void transferSafe(BankAccount& from, BankAccount& to, double amount) {
    scoped_lock lock(from.mtx, to.mtx);  // Acquires BOTH atomically — no deadlock
    from.balance -= amount;
    to.balance   += amount;
    cout << from.owner << " -> " << to.owner << ": $" << amount
         << " | Balances: " << from.owner << "=$" << from.balance
         << ", " << to.owner << "=$" << to.balance << endl;
}

int main() {
    BankAccount alice("Alice", 1000.0);
    BankAccount bob("Bob", 500.0);
    BankAccount carol("Carol", 750.0);

    // Simultaneous transfers — safe with scoped_lock
    thread t1([&]() {
        transferSafe(alice, bob,   100.0);
        transferSafe(bob,   carol, 50.0);
    });
    thread t2([&]() {
        transferSafe(bob,   alice, 200.0);  // Opposite direction — would deadlock without scoped_lock
        transferSafe(carol, alice, 75.0);
    });

    t1.join();
    t2.join();

    cout << "\nFinal balances:" << endl;
    cout << "Alice: $" << alice.balance << endl;
    cout << "Bob:   $" << bob.balance   << endl;
    cout << "Carol: $" << carol.balance << endl;

    return 0;
}

Output (the order of transfers varies; the final balances are deterministic):

Plaintext
Alice -> Bob: $100 | Balances: Alice=$900, Bob=$600
Bob -> Alice: $200 | Balances: Bob=$400, Alice=$1100
Bob -> Carol: $50 | Balances: Bob=$350, Carol=$800
Carol -> Alice: $75 | Balances: Carol=$725, Alice=$1175

Final balances:
Alice: $1175
Bob:   $350
Carol: $725

Step-by-step explanation:

  1. transferUnsafe locks from.mtx then to.mtx in that order. If Thread A calls transferUnsafe(alice, bob, ...) and simultaneously Thread B calls transferUnsafe(bob, alice, ...), Thread A holds alice.mtx waiting for bob.mtx, while Thread B holds bob.mtx waiting for alice.mtx — a classic circular deadlock.
  2. scoped_lock lock(from.mtx, to.mtx) uses a deadlock-avoidance algorithm (equivalent to std::lock) to acquire both mutexes without deadlock, regardless of the order in which multiple threads try to acquire them. Internally it may use lock ordering, back-off, or other strategies.
  3. mutable mutex mtx in the struct allows locking even in const member functions. A const reference to a BankAccount would normally not allow modifying mtx, but mutable overrides this — mutex state is conceptually separate from the object’s logical value.
  4. scoped_lock takes any number of lockable objects. scoped_lock lock(m1, m2, m3) acquires all three atomically. It cannot be unlocked early (unlike unique_lock) — use unique_lock with std::lock() if you need that flexibility.
  5. The final balances are consistent: every transfer either fully completes or doesn’t happen — no partial transfers, no negative balances from interleaved operations.

Deadlocks: Causes, Detection, and Prevention

A deadlock occurs when two or more threads are each waiting for a resource held by another, creating a circular dependency that no thread can break.

C++
#include <iostream>
#include <thread>
#include <mutex>
#include <chrono>
using namespace std;

mutex mtxA, mtxB;

// Thread 1: locks A then B
void thread1_work() {
    cout << "T1: trying to lock A" << endl;
    lock_guard<mutex> lockA(mtxA);
    cout << "T1: locked A, sleeping..." << endl;
    this_thread::sleep_for(chrono::milliseconds(100));  // Give T2 time to lock B

    cout << "T1: trying to lock B" << endl;
    lock_guard<mutex> lockB(mtxB);  // Will block — T2 holds B
    cout << "T1: locked B (never reached in deadlock scenario)" << endl;
}

// Thread 2: locks B then A (opposite order — recipe for deadlock)
void thread2_work() {
    cout << "T2: trying to lock B" << endl;
    lock_guard<mutex> lockB(mtxB);
    cout << "T2: locked B, sleeping..." << endl;
    this_thread::sleep_for(chrono::milliseconds(100));

    cout << "T2: trying to lock A" << endl;
    lock_guard<mutex> lockA(mtxA);  // Will block — T1 holds A
    cout << "T2: locked A (never reached in deadlock scenario)" << endl;
}

// Solution: always lock in the same order, or use scoped_lock
void thread1_safe() {
    scoped_lock lock(mtxA, mtxB);  // Always acquires in consistent order
    cout << "T1 safe: locked both" << endl;
}

void thread2_safe() {
    scoped_lock lock(mtxA, mtxB);  // Same order — no deadlock possible
    cout << "T2 safe: locked both" << endl;
}

int main() {
    // Demonstrating deadlock would hang the program — instead show the fix:
    cout << "=== Safe version with scoped_lock ===" << endl;
    thread t1(thread1_safe);
    thread t2(thread2_safe);
    t1.join();
    t2.join();
    cout << "Both threads completed — no deadlock" << endl;

    return 0;
}

Output:

Plaintext
=== Safe version with scoped_lock ===
T1 safe: locked both
T2 safe: locked both
Both threads completed — no deadlock

The Four Deadlock Conditions (Coffman Conditions)

A deadlock requires all four of these conditions simultaneously:

Mutual Exclusion — Resources cannot be shared; only one thread can hold a lock at a time. (This is fundamental to mutexes — you generally cannot eliminate it.)

Hold and Wait — A thread holds one resource while waiting for another. (Break this with scoped_lock: acquire all resources at once.)

No Preemption — Resources cannot be forcibly taken from a thread; it must release them voluntarily. (C++ mutexes are non-preemptible by design.)

Circular Wait — There is a circular chain of threads, each waiting for a resource held by the next. (Break this with consistent lock ordering or scoped_lock.)

Practical Deadlock Prevention Rules

C++
// Rule 1: Always acquire multiple mutexes with scoped_lock, never manually
// BAD:
lock_guard<mutex> lk1(mtx1);  // Locks first
lock_guard<mutex> lk2(mtx2);  // Then second — deadlock risk if another thread reverses
// GOOD:
scoped_lock lk(mtx1, mtx2);   // Atomic acquisition — no deadlock

// Rule 2: Never call external functions while holding a lock
mutex m;
void dangerous() {
    lock_guard<mutex> lock(m);
    externalCallback();  // BAD: externalCallback may try to acquire m again
}                        // or acquire another mutex in a different order

// Rule 3: Keep critical sections short — lock, do work, unlock quickly
// BAD: long critical section
{
    lock_guard<mutex> lock(m);
    networkFetch();   // Could block for seconds holding the lock
    processResult();
    updateDatabase();
}
// GOOD: only protect the shared state access
auto data = networkFetch();   // Outside lock
auto result = processResult(data);  // Outside lock
{
    lock_guard<mutex> lock(m);
    sharedState = result;     // Only the actual shared write is locked
}

// Rule 4: Prefer std::atomic for simple counters and flags
// Instead of mutex + int, use:
atomic<int> counter{0};
counter++;  // Atomic, no mutex needed

Thread-Safe Class Design

Designing a thread-safe class means protecting all public methods that access shared state, while minimizing lock contention.

C++
#include <iostream>
#include <thread>
#include <mutex>
#include <vector>
#include <stdexcept>
using namespace std;

// A thread-safe queue
template<typename T>
class ThreadSafeQueue {
public:
    // Push: add element — locks briefly
    void push(T value) {
        lock_guard<mutex> lock(mtx_);
        data_.push_back(move(value));
    }

    // Pop: remove and return front element — throws if empty
    T pop() {
        lock_guard<mutex> lock(mtx_);
        if (data_.empty()) throw runtime_error("Queue is empty");
        T value = move(data_.front());
        data_.erase(data_.begin());
        return value;
    }

    // Try-pop: non-throwing version — returns false if empty
    bool tryPop(T& value) {
        lock_guard<mutex> lock(mtx_);
        if (data_.empty()) return false;
        value = move(data_.front());
        data_.erase(data_.begin());
        return true;
    }

    // Size: snapshot — may be stale by the time caller uses it
    size_t size() const {
        lock_guard<mutex> lock(mtx_);
        return data_.size();
    }

    bool empty() const {
        lock_guard<mutex> lock(mtx_);
        return data_.empty();
    }

private:
    mutable mutex    mtx_;
    vector<T>        data_;
};

int main() {
    ThreadSafeQueue<int> queue;

    // Producer thread: push 1000 items
    thread producer([&]() {
        for (int i = 0; i < 1000; i++) {
            queue.push(i);
        }
        cout << "Producer done. Queue size: " << queue.size() << endl;
    });
    producer.join();  // Wait for all pushes first: tryPop-only consumers
                      // would otherwise exit the moment the queue is
                      // momentarily empty (making consumers wait for new
                      // items needs condition variables — next article)

    // Consumer threads: pop items until the queue is drained
    thread consumer1([&]() {
        int count = 0;
        int value;
        while (queue.tryPop(value)) count++;
        cout << "Consumer 1 popped: " << count << endl;
    });

    thread consumer2([&]() {
        int count = 0;
        int value;
        while (queue.tryPop(value)) count++;
        cout << "Consumer 2 popped: " << count << endl;
    });

    consumer1.join();
    consumer2.join();

    cout << "Remaining in queue: " << queue.size() << endl;

    return 0;
}

Output (exact split between the two consumers varies; the counts always sum to 1000):

Plaintext
Producer done. Queue size: 1000
Consumer 1 popped: 634
Consumer 2 popped: 366
Remaining in queue: 0

Step-by-step explanation:

  1. Every public method that accesses data_ acquires the mutex first. No method can run concurrently with another on the same object — all access is serialized through the mutex.
  2. mutable mutex mtx_ allows locking in const methods (size(), empty()). The mutex protects the logical consistency of the object, not its physical state, so it is semantically mutable.
  3. tryPop returns bool rather than throwing on empty. This is the preferred pattern for concurrent queues — by the time a caller checks empty() and then calls pop(), another thread may have emptied the queue. tryPop makes the check-and-remove atomic.
  4. The “TOCTOU” (time-of-check to time-of-use) problem: if (!q.empty()) q.pop() is NOT safe even with an individual mutex inside each method. Between empty() returning false and pop() running, another thread could have emptied the queue. tryPop solves this by doing both operations under one lock acquisition.
  5. The two consumers race each other for items, and the mutex guarantees each item is popped exactly once — no items are lost or double-counted, so the counts always sum to 1000. The producer is joined before the consumers start because a tryPop loop treats an empty queue as “done”; consumers that block waiting for new items require condition variables, the topic of the next article.

std::shared_mutex: Reader-Writer Locking (C++17)

Many real-world scenarios have many readers and few writers — a configuration store, a cache, a lookup table. Using a regular mutex forces all readers to wait for each other, which is unnecessary. std::shared_mutex allows multiple simultaneous readers while still ensuring exclusive access for writers.

C++
#include <iostream>
#include <thread>
#include <shared_mutex>
#include <map>
#include <string>
#include <chrono>
using namespace std;

// Thread-safe configuration store: many readers, occasional writers
class ConfigStore {
public:
    // Read: multiple threads can read simultaneously
    string get(const string& key) const {
        shared_lock<shared_mutex> lock(mtx_);  // Shared (read) lock
        auto it = store_.find(key);
        return (it != store_.end()) ? it->second : "";
    }

    // Write: exclusive access — all readers and other writers must wait
    void set(const string& key, const string& value) {
        unique_lock<shared_mutex> lock(mtx_);  // Exclusive (write) lock
        store_[key] = value;
        cout << "  [SET] " << key << " = " << value << endl;
    }

    size_t size() const {
        shared_lock<shared_mutex> lock(mtx_);
        return store_.size();
    }

private:
    mutable shared_mutex         mtx_;
    map<string, string>          store_;
};

int main() {
    ConfigStore config;

    // Initial setup (single-threaded — no contention)
    config.set("host",    "localhost");
    config.set("port",    "8080");
    config.set("timeout", "30");

    cout << "\nStarting concurrent read/write test..." << endl;

    // Multiple reader threads
    auto reader = [&](int id) {
        for (int i = 0; i < 5; i++) {
            string host    = config.get("host");
            string port    = config.get("port");
            string timeout = config.get("timeout");
            cout << "Reader " << id << ": host=" << host
                 << " port=" << port
                 << " timeout=" << timeout << endl;
            this_thread::sleep_for(chrono::milliseconds(10));
        }
    };

    // Writer thread: updates config occasionally
    auto writer = [&]() {
        this_thread::sleep_for(chrono::milliseconds(25));
        config.set("timeout", "60");   // Update timeout
        this_thread::sleep_for(chrono::milliseconds(30));
        config.set("host", "prod.example.com");  // Switch host
    };

    thread r1(reader, 1), r2(reader, 2), r3(reader, 3);
    thread w(writer);

    r1.join(); r2.join(); r3.join();
    w.join();

    cout << "\nFinal config:" << endl;
    cout << "host    = " << config.get("host")    << endl;
    cout << "port    = " << config.get("port")    << endl;
    cout << "timeout = " << config.get("timeout") << endl;

    return 0;
}

Output (interleaving varies):

Plaintext
  [SET] host = localhost
  [SET] port = 8080
  [SET] timeout = 30

Starting concurrent read/write test...
Reader 1: host=localhost port=8080 timeout=30
Reader 2: host=localhost port=8080 timeout=30
Reader 3: host=localhost port=8080 timeout=30
  [SET] timeout = 60
Reader 1: host=localhost port=8080 timeout=60
Reader 2: host=localhost port=8080 timeout=60
Reader 3: host=localhost port=8080 timeout=60
  [SET] host = prod.example.com
Reader 1: host=prod.example.com port=8080 timeout=60
Reader 2: host=prod.example.com port=8080 timeout=60
Reader 3: host=prod.example.com port=8080 timeout=60

Final config:
host    = prod.example.com
port    = 8080
timeout = 60

Step-by-step explanation:

  1. shared_lock<shared_mutex> lock(mtx_) acquires the mutex in shared mode — multiple threads can hold shared locks simultaneously. This allows all three reader threads to call get() at the same time without blocking each other.
  2. unique_lock<shared_mutex> lock(mtx_) acquires the mutex in exclusive mode — no other thread (reader or writer) can hold any lock on mtx_ at the same time. The writer must wait for all active readers to release their shared locks before proceeding.
  3. While a writer holds the exclusive lock, all new shared_lock attempts block. Once the writer releases, readers can again proceed concurrently.
  4. This is the classic reader-writer pattern. It provides significantly better throughput than a regular mutex in read-heavy workloads because reads don’t block each other — only writes do.
  5. All three readers consistently see the same timeout value (either 30 or 60) — never a partial update. The shared_mutex ensures that the write of "60" happens atomically from the readers’ perspective.

once_flag and call_once: Thread-Safe Initialization

For one-time initialization — initializing a singleton, loading a configuration file, creating a resource that all threads will share — std::call_once with std::once_flag guarantees the initialization runs exactly once, safely, even when multiple threads race to initialize simultaneously.

C++
#include <iostream>
#include <thread>
#include <mutex>
#include <memory>
#include <chrono>   // for sleep_for durations
#include <vector>   // main stores its threads in a vector
using namespace std;

class ExpensiveResource {
public:
    ExpensiveResource() {
        cout << "  ExpensiveResource created (this should print once)" << endl;
        // Simulate expensive initialization
        this_thread::sleep_for(chrono::milliseconds(100));
    }
    void use() { cout << "  Using resource" << endl; }
};

// Singleton-style lazy initialization — thread-safe with call_once
class ResourceManager {
public:
    static ExpensiveResource& getResource() {
        call_once(initFlag_, []() {
            resource_ = make_unique<ExpensiveResource>();
        });
        return *resource_;
    }

private:
    static once_flag                     initFlag_;
    static unique_ptr<ExpensiveResource> resource_;
};

once_flag                     ResourceManager::initFlag_;
unique_ptr<ExpensiveResource> ResourceManager::resource_;

int main() {
    cout << "Launching 5 threads that all request the resource..." << endl;

    vector<thread> threads;
    for (int i = 0; i < 5; i++) {
        threads.emplace_back([i]() {
            auto& res = ResourceManager::getResource();
            cout << "Thread " << i << " got resource" << endl;
            res.use();
        });
    }

    for (auto& t : threads) t.join();

    cout << "\nAll threads done. Resource was initialized exactly once." << endl;

    return 0;
}

Output (the order in which threads report varies):

Plaintext
Launching 5 threads that all request the resource...
  ExpensiveResource created (this should print once)
Thread 0 got resource
  Using resource
Thread 1 got resource
  Using resource
Thread 2 got resource
  Using resource
Thread 3 got resource
  Using resource
Thread 4 got resource
  Using resource

All threads done. Resource was initialized exactly once.

Step-by-step explanation:

  1. call_once(initFlag_, lambda) ensures the lambda runs exactly once. If five threads call call_once simultaneously, one gets to execute the lambda, and the other four block until it completes, then all proceed knowing the initialization is done.
  2. once_flag initFlag_ is the token that tracks whether the initialization has occurred. It must be accessible to all threads (here, as a static member). Never copy or move a once_flag.
  3. This pattern is safe, correct, and efficient. Double-checked locking (the traditional singleton pattern) is notoriously difficult to implement correctly due to memory ordering issues — call_once handles all of that automatically.
  4. Note that even C++11’s static local variable initialization is thread-safe: static ExpensiveResource resource; inside a function is guaranteed to be initialized only once, even with concurrent calls. For more complex initialization scenarios, call_once is the right tool.

Synchronization Primitives Quick Reference

Type                  | Header          | Purpose                          | Blocking?        | Movable?
----------------------|-----------------|----------------------------------|------------------|---------
mutex                 | <mutex>         | Basic mutual exclusion           | Yes              | No
timed_mutex           | <mutex>         | Mutex with timeout support       | Yes (timed)      | No
recursive_mutex       | <mutex>         | Re-entrant from same thread      | Yes              | No
shared_mutex          | <shared_mutex>  | Multiple readers OR one writer   | Yes              | No
lock_guard<M>         | <mutex>         | Simple RAII scoped lock          | N/A              | No
unique_lock<M>        | <mutex>         | Flexible RAII lock               | N/A              | Yes
shared_lock<M>        | <shared_mutex>  | Shared (read) RAII lock          | N/A              | Yes
scoped_lock<Ms...>    | <mutex>         | Lock multiple mutexes atomically | N/A              | No
once_flag + call_once | <mutex>         | One-time initialization          | Yes (until done) | No

Common Mistakes and How to Avoid Them

Mistake 1: Forgetting to protect all accesses to shared data. Every access — read or write — to shared mutable data must be protected. A race between a writer and a reader is just as dangerous as two writers.

C++
mutex m;
int shared = 0;
// Thread A:
{ lock_guard<mutex> lock(m); shared = 42; }   // WRITE protected
// Thread B:
cout << shared;                                // BAD: READ not protected — data race!
{ lock_guard<mutex> lock(m); cout << shared; } // GOOD: read under the same mutex

Mistake 2: Locking the wrong mutex. Each piece of shared data must be protected by its own dedicated mutex. Using the wrong mutex (or no mutex) for a piece of data provides no protection.

Mistake 3: Holding a lock while calling external or unknown code. External callbacks may try to acquire the same mutex (lock inversion), or may take a very long time, increasing contention.

Mistake 4: Using empty() or size() for control flow on a thread-safe container. Even if each call is individually synchronized, the answer is only a snapshot — by the time you act on it, another thread may have changed the container.

C++
if (!tsQueue.empty()) {        // Snapshot — may be stale
    auto val = tsQueue.pop();  // Another thread may have emptied it!
}
// Fix: use try-pop that atomically checks and pops

Mistake 5: Recursive locking with a non-recursive mutex.

C++
mutex m;
void inner();  // forward declaration so outer() compiles

void outer() {
    lock_guard<mutex> lock(m);
    inner();  // inner() also tries to lock m — DEADLOCK!
}
void inner() {
    lock_guard<mutex> lock(m);  // m is already held by this same thread: undefined behavior, typically a hang
    // ...
}
// Fix: use recursive_mutex, or restructure to avoid recursive locking

Conclusion

Mutexes and lock wrappers are the cornerstone of safe multithreaded C++ programming. The mutex provides the primitive — mutual exclusion — that prevents data races by ensuring only one thread at a time can access shared state. RAII lock wrappers — lock_guard, unique_lock, shared_lock, and scoped_lock — ensure that mutexes are always released correctly, regardless of exceptions or early returns, mirroring the same philosophy as unique_ptr for memory management.

lock_guard is the right choice for simple, scoped locking where you want to protect a block of code. unique_lock adds flexibility for deferred locking, try-locking, early unlock, and condition variables. scoped_lock is essential when you must acquire multiple mutexes without risking deadlock. shared_mutex with shared_lock dramatically improves throughput for read-heavy workloads. And call_once with once_flag provides clean, safe one-time initialization.

Deadlocks — the silent killer of multithreaded programs — are prevented by three simple rules: always acquire multiple mutexes with scoped_lock, never call external functions while holding a lock, and keep critical sections short.

With these tools mastered, you can write thread-safe data structures, concurrent algorithms, and parallel systems that are both correct and performant — harnessing the power of multi-core hardware without the nightmare of non-deterministic bugs.
