Tag Archives: threads

C++11 concurrency: locks revisited

In a previous post about locks in C++11 I have shown a dummy implementation of a container class that looked like this (simplified):

One can argue that the dump() method does not alter the state of the container and should be (logically) const. However, as soon as you make it const you get the following error:

‘std::lock_guard<_Mutex>::lock_guard(_Mutex &)’ : cannot convert parameter 1 from ‘const std::recursive_mutex’ to ‘std::recursive_mutex &’

The mutex (regardless which of the four flavors available in C++11) must be acquired and released and the lock() and unlock() operations are not constant. So the argument the lock_guard takes cannot be logically const, as it would be if the method was const.

The solution to this problem is to make the mutex mutable. Mutable allows changing state from const functions. It should however be used only for hidden or “meta” state (imagine caching computed or looked-up data so a next call can complete immediately, or altering bits like a mutex that only complement the actual state of an object).

An important thing to note is that in C++11 both const and mutable imply thread-safety. I recommend this C++ and Beyond talk by Herb Sutter called You don’t know [blank] and [blank].

C++11 concurrency: condition variables

In the previous post in this series we have seen the C++11 support for locks and in this post we continue on this topic with condition variables. A condition variable is a synchronization primitive that enables blocking of one or more threads until either a notification is received from another thread or a timeout or a spurious wake-up occurs.

There are two implementations of a condition variable that are provided by C++11:

  • condition_variable: requires any thread that wants to wait on it to acquire a std::unique_lock first.
  • condition_variable_any: is a more general implementation that works with any type that satisfies the condition of a basic lock (basically has a lock() and unlock() method). This might be more expensive to use (in terms of performance and operating system resources), therefore it should be preferred only if the additional flexibility it provides is necessary.

So how does a condition variable work?

  • There must be at least one thread that is waiting for a condition to become true. The waiting thread must first acquire a unique_lock. This lock is passed to the wait() method, that releases the mutex and suspends the thread until the condition variable is signaled. When that happens the thread is awaken and the lock is re-acquired.
  • There must be at least one thread that is signaling that a condition becomes true. The signaling can be done with notify_one() which unblocks one thread (any) that is waiting for the condition to be signaled or with notify_all which unblocks all the threads that are waiting for the condition.
  • Because of some complications in making the condition wake-up completely predictable on multiprocessor systems, spurious wake-ups can occur. That means a thread is awaken even if nobody signaled the condition variable. Therefore it is necessary to check if the condition is still true after the thread has awaken. And since spurious wake-ups can occur multiple times, that check must be done in a loop.

The code below shows an example of using a condition variable to synchronize threads: several “worker” threads may produce an error during their work and they put the error code in a queue. A “logger” thread processes these error codes, by getting them from the queue and printing them. The workers signal the logger when an error occurred. The logger is waiting on the condition variable to be signaled. To avoid spurious wakeups the wait happens in a loop where a boolean condition is checked.

Running this code produce an output that looks like this (notice this output is different with each run because each worker thread works, i.e. sleeps, for a random interval):

The wait() method seen above has two overloads:

  • one that only takes a unique_lock; this one releases the lock, blocks the thread and adds it to the queue of threads that are waiting on this condition variable; the thread wakes up when the the condition variable is signaled or when a spurious wakeup occurs. When any of those happen, the lock is reacquired and the function returns.
  • one that in addition to the unique_lock also takes a predicate that is used to loop until it returns false; this overload may be used to avoid spurious wakeups. It is basically equivalent to:

As a result the use of the boolean flag g_notified in the example above can be avoided by using the wait overload that takes a predicate that verifies the state of the queue (empty or not):

In addition to this wait() overloaded method there are two more waiting methods, both with similar overloads that take a predicate to avoid spurious wake-ups:

  • wait_for: blocks the thread until the condition variable is signaled or the specified timeout occurred.
  • wait_until: blocks the thread until the condition variable is signaled or the specified moment in time was reached.

The overload without a predicate of these two functions returns a cv_status that indicates whether a timeout occurred or the wake-up happened because the condition variable was signaled or because of a spurious wake-up.

The standard also provides a function called notified_all_at_thread_exit that implements a mechanism to notify other threads that a given thread has finished, including destroying all thread_local objects. This was introduced because waiting on threads with other mechanisms than join() could lead to incorrect and fatal behavior when thread_locals were used, since their destructors could have been called even after the waiting thread resumed and possible also finished (see N3070 and N2880 for more). Typically, a call to this function must happen just before the thread exists.

Below is an example of how notify_all_at_thread_exit can be used together with a condition_variable to synchronize two threads:

That would output either (if the worker finishes work before main)

or (if the main finishes work before the worker):

C++11 concurrency: locks

In a previous post I introduced the C++11 support for threads. In this article I will discuss the locking features provided by the standard that one can use to synchronize access to shared resources.

The core syncing primitive is the mutex, which comes in four flavors, in the <mutex> header:

  • mutex: provides the core lock() and unlock() and the non-blocking try_lock() method that returns if the mutex is not available.
  • recursive_mutex: allows multiple acquisitions of the mutex from the same thread.
  • timed_mutex: similar to mutex, but it comes with two more methods try_lock_for() and try_lock_until() that try to acquire the mutex for a period of time or until a moment in time is reached.
  • recursive_timed_mutex: is a combination of timed_mutex and recusive_mutex.

Here is a simple example of using a mutex to sync the access to the std::cout shared object.

In the next example we’re creating a simple thread-safe container (that just uses std::vector internally) that has methods like add() and addrange(), with the later implemented by calling the first.

Running this program results in a deadlock.

The reason for the deadlock is that the tread that own the mutex cannot re-acquire the mutex, and such an attempt results in a deadlock. That’s were recursive_mutex come into picture. It allows a thread to acquire the same mutext multiple times. The maximum number of times is not specified, but if that number is reached, calling lock would throw a std::system_error. Therefore to fix this implementation (apart from changing the implementation of addrange not to call lock and unlock) is to replace the mutex with a recursive_mutex.

Then the output looks something like this:

Notice the same numbers are generated in each call to func(). That is because the seed is thread local, and the call to srand() only initializes the seed from the main thread. In the other worker threads it doesn’t get initialized, and therefore you get the same numbers every time.

Explicit locking and unlocking can lead to problems, such as forgetting to unlock or incorrect order of locks acquiring that can generate deadlocks. The standard provides several classes and functions to help with this problems.

The wrapper classes allow consistent use of the mutexes in a RAII-style with auto locking and unlocking within the scope of a block. These wrappers are:

  • lock_guard: when the object is constructed it attempts to acquire ownership of the mutex (by calling lock()) and when the object is destructed it automatically releases the mutex (by calling unlock()). This is a non-copyable class.
  • unique_lock: is a general purpose mutex wrapper that unlike lock_quard also provides support for deferred locking, time locking, recursive locking, transfer of lock ownership and use of condition variables. This is also a non-copyable class, but it is moveable.

With these wrappers we can rewrite the container class like this:

Notice that attempting to call try_lock_for() or try_lock_until() on a unique_lock that wraps a non-timed mutex results in a compiling error.

The constructors of these wrapper guards have overloads that take an argument indicating the locking strategy. The available strategies are:

  • defer_lock of type defer_lock_t: do not acquire ownership of the mutex
  • try_to_lock of type try_to_lock_t: try to acquire ownership of the mutex without blocking
  • adopt_lock of type adopt_lock_t: assume the calling thread already has ownership of the mutex

These strategies are declared like this:

Apart from these wrappers for mutexes, the standard also provides a couple of methods for locking one or more mutexes.

  • lock: locks the mutexes using a deadlock avoiding algorithm (by using calls to lock(), try_lock() and unlock()).
  • try_lock: tries to call the mutexes by calling try_lock() in the order of which mutexes were specified.

Here is an example of a deadlock case: we have a container of elements and we have a function exchange() that swaps one element from a container into the other container. To be thread-safe, this function synchronizes the access to the two containers, by acquiring a mutex associated with each container.

Suppose this function is called from two different threads, from the first, an element is removed from container 1 and added to container 2, and in the second it is removed from container 2 and added to container 1. This can lead to a deadblock (if the thread context switches from one thread to another just after acquiring the first lock).

To fix the problem, you can use std::lock that guaranties the locks are acquired in a deadlock-free way:

Hopefully this walktrough will help you understand the basics of the synchronization functionality supported in C++11.

C++11 concurrency: threads

C++11 provides richer support for concurrency than the previous standard. Among the new features is the std::thread class (from the <thread> header) that represents a single thread of execution. Unlike other APIs for creating threads, such as CreateThread, std::thread can work with (regular) functions, lambdas or functors (i.e. classes implementing operator()) and allows you to pass any number of parameters to the thread function.

Let’s see a simple example.

In this example t is a thread object representing the thread under which function func() runs. The call to join blocks the calling thread (in this case the main thread) until the joined thread finishes execution.

If the thread function returns a value, it is ignored. However, the function can take any number of parameters.

The output is:

It is important to note that the parameters to the thread function are passed by value. If you need to pass references you need to use std::ref or std::cref.
The following program prints 42.

But if we change to t(func, std::ref(a)) it prints 43.

In the next example we execute a lambda on a second thread. The lambda doesn’t do much, except for printing some message and sleeping for a while.

The output looks like this (obviously the id of the threads differ for each run).

But if we start two threads, the the output looks different:

The reason is the function running on a separate thread is using std::cout, which is an object representing a stream. This is a shared object of a class that is not thread-safe, therefore the access from different threads to it must be synchronized. There are different mechanisms provided by C++11 to do that (will discuss them in a later post), but one of them is std::mutex (notice there are four flavors of mutexes).

The correct, synchronized code should look like this:

In this examples I have used several functions from the std::this_thread namespace (also defined in the <thread> header). The helper functions from this namespace are:

  • get_id: returns the id of the current thread
  • yield: tells the scheduler to run other threads and can be used when you are in a busy waiting state
  • sleep_for: blocks the execution of the current thread for at least the specified period
  • sleep_util: blocks the execution of the current thread until the specified moment of time has been reached

Apart from join() the thread class provides two more operations:

  • swap: exchanges the underlying handles of two thread objects
  • detach: allows a thread of execution to continue independently of the thread object. Detached threads are no longer joinable (you cannot wait for them).

What happens though if a function running on a separate thread throws an exception? You cannot actually catch that exception in the thread that’s waiting for the faulty thread, because std::terminate is called and this aborts the program.

If func() throws an exception, the following catch block won’t be reached.

What can be done then? To propagate exceptions between threads you could catch them in thread function and store them in a place where they can be lately looked-up. Possible solutions are detailed here and here.

In future posts we will look at other concurrency features from C++11 (such as synchronization mechanisms or tasks).