Last week I encountered some confusion around the way atomics work in C++. Some developers believe that wrapping any data structure in atomic will, by some magic, make it thread safe. They are then perplexed when the source does not compile or, even worse, compiles but does not behave as expected.
Reading the fine print in the C++ standard [ISO/IEC 14882:2014(E) Ch. 29.5] reveals that the T argument of the atomic template is required to be trivially copyable. What does that mean? Roughly, a type is trivially copyable if copying it involves nothing more than copying its bytes and destroying it involves nothing more than releasing its memory, similar to data handled with C’s old-style malloc and memcpy. The standard is of course more specific; you can find the full list of requirements in [ISO/IEC 14882:2014(E) Ch. 9].
So all primitives such as int and char, and arrays of such types, are trivially copyable, since creating them requires no more than allocating space for them in memory. These types can be copied safely using memcpy. The standard library provides a handy type trait called std::is_trivially_copyable to determine whether a type is trivially copyable:
int trivially copyable ? 1
std::array trivially copyable ? 1
std::string trivially copyable ? 0
Copying memory in a thread-safe way is what lies at the core of atomic thread safety, which explains the requirement for trivially copyable types. In most implementations, atomics holding primitive types maintain thread safety without using a synchronization object (they are lock free). For larger, non-primitive types that the hardware cannot update with a single atomic instruction, the implementation typically falls back to a lock to ensure the modifying thread has exclusive access to the memory used by the atomic variable. The member function std::atomic::is_lock_free reports which strategy was selected for a type.
Atomics also control the order in which threads can “observe” changes to memory. The different options are enumerated by std::memory_order. This is an advanced topic. In most cases, the default used by the standard library, std::memory_order_seq_cst, is the safest way to go.
Let’s talk about what std::memory_order_seq_cst means:
Modern PCs often have more than one CPU (core), and each core has its own memory cache. Values taken from memory can be stored temporarily in the cache for quick retrieval, so different threads running on different cores might not observe changes to memory at the same time. It could be that a memory location has changed but a thread still sees an older value. It is, therefore, not guaranteed that all threads have the same view of memory. This is where std::memory_order_seq_cst steps in. Atomic load/store operations executed with this option take part in a single total order that all threads agree on, for all atomics using this option. So if one store/load operation happened before another from one thread’s point of view, it also happened before it from the point of view of every other thread.
As always, nothing comes for free; in this case, the cost is lost optimization opportunities for the compiler and CPU.
Consider this example (taken from here)
The claim is that the assert at the end of the program (z is not zero) will always pass. Let’s follow this through:
Any of the 4 threads can execute first. The while loops at the top of read_x_then_y() and read_y_then_x() make those threads wait until x or y, respectively, is assigned a value. Let’s assume x is updated first:
The read_x_then_y thread exits its while loop and moves on to check the value of y.
The read_y_then_x thread is still looping, waiting for y to be updated. Under the order imposed by std::memory_order_seq_cst, this thread has the same view of memory as the other threads, so if read_x_then_y does not see a change in y, neither does read_y_then_x. Suppose the check of y in read_x_then_y therefore evaluates to false: z is not incremented and that thread is done.
Eventually, y gets updated by the write_y thread.
The read_y_then_x thread exits its while loop and moves to the next statement, where it examines the value of x. As before, the total order (across threads) imposed by std::memory_order_seq_cst guarantees that read_y_then_x has the same view of memory as read_x_then_y. Since that thread has already seen the updated x, and did so before y was updated, read_y_then_x also sees x as true. The if statement in read_y_then_x evaluates to true and z is incremented, satisfying the assert.
The same chain of arguments applies if you swap x and y. So in all cases, the assert will not trigger an error.
Note: We could drop the std::memory_order_seq_cst argument in the calls to load and store, since both member functions take it as the default value of their order parameter.
The fundamental principles of atomics are not that different from using synchronization objects to manage shared memory access from multiple threads. Having said that, using atomics lets you take advantage of the highly optimized, expertly crafted code written for the standard library. In addition, it makes the code more readable by hiding most of the thread-safety code inside the atomic object.
If you would like to know more about the other memory order models, check these pages:
GCC Wiki – Memory model synchronization modes: a well-written, friendly explanation with simple examples.
cppreference.com – std::memory_order : an in-depth explanation of the different memory access options with full examples.