C++ Concurrent Programming (CH03) [Protect Shared Data with mutex-01&02]

Problems with sharing data between threads

  1. If multiple threads only read the shared data concurrently, there is no problem; trouble starts only when at least one thread modifies it.

Race conditions

  1. Some races are benign. For example, when two threads push items onto the same queue, it usually doesn't matter which one gets there first.
  2. A race that breaks a data structure's invariants, however, is serious. For example, concurrently modifying the links of a list can leave the structure in a broken state and lead to undefined behavior.

Avoiding problematic race conditions

  1. Use a locking mechanism (mutexes).

  2. Lock-free programming: very advanced and complex techniques.

  3. Software transactional memory (STM): the required reads and modifications are
    recorded in a log and then committed in a single step; if the commit cannot proceed because another thread got there first, the transaction is restarted. This book does not cover STM. (Roughly: if transactions A and B each work on their own copy at the same time and A commits first, B detects the conflict via the log and re-executes its work on top of A's changes. A rough sketch of this commit-and-retry idea follows this list.)
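
The standard library has no STM facility, but the commit-and-retry flavour can be loosely illustrated with an atomic compare-and-exchange loop. This is only a minimal sketch, not real transactional memory; the names balance and withdraw are made up.

#include <atomic>

std::atomic<int> balance{100};

// "Transaction": take a snapshot, compute the new value privately, then try
// to commit it in one atomic step. If another thread committed in between,
// expected is refreshed with the current value and the work is redone.
void withdraw(int amount)
{
  int expected = balance.load();
  int desired;
  do {
    desired = expected - amount;   // work on a private copy of the value
  } while (!balance.compare_exchange_weak(expected, desired));
}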

Protecting shared data with mutexes

Using mutexes in C++

It's not a good idea to call a mutex's lock() and unlock() directly, because it's easy to forget the unlock() (for example on an early return or when an exception is thrown). Prefer std::lock_guard, which locks the mutex on construction and unlocks it automatically when the enclosing scope is exited.

#include <list>
#include <mutex>
#include <algorithm>
std::list<int> some_list;                       // the shared data
std::mutex some_mutex;                          // the mutex that protects some_list
void add_to_list(int new_value)
{
  std::lock_guard<std::mutex> guard(some_mutex);
  some_list.push_back(new_value);
}
bool list_contains(int value_to_find)
{
  std::lock_guard<std::mutex> guard(some_mutex);
  return std::find(some_list.begin(),some_list.end(),value_to_find)
    != some_list.end();
}

C++17 provides class template argument deduction, so the guard can be declared without spelling out the mutex type:

std::lock_guard guard(some_mutex);

The code above is only an example. It's best not to leave the data and the mutex as separate global variables; keep them together in a class so the data is properly encapsulated.
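
For instance, a minimal sketch (mine, not from the book; the class name threadsafe_list is made up) of keeping the list and its mutex together in one class:

#include <list>
#include <mutex>
#include <algorithm>

class threadsafe_list
{
  std::list<int> items;
  std::mutex m;            // protects items
public:
  void add(int value)
  {
    std::lock_guard<std::mutex> guard(m);
    items.push_back(value);
  }
  bool contains(int value)
  {
    std::lock_guard<std::mutex> guard(m);
    return std::find(items.begin(),items.end(),value) != items.end();
  }
};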

Do not expose the protected shared data to callers, for example as a function return value (reference or pointer). Doing so defeats the protection the mutex provides.

Structuring code for protecting shared data

Neither pass protected data out for external use, nor pass it into functions you do not control and let them handle it: such functions may stash pointers or references to the data and use them later, outside the lock.

class some_data
{
  int a;
  std::string b;
public:
  void do_something();
};
class data_wrapper
{
private:
  some_data data;
  std::mutex m;
public:
  template<typename Function>
  void process_data(Function func)
  {
    std::lock_guard<std::mutex> l(m);
    func(data);                       //<--Pass "protected" data to user-supplied function
  }
};
some_data* unprotected;
void malicious_function(some_data& protected_data)
{
  unprotected=&protected_data;
}
data_wrapper x;
void foo()
{
  x.process_data(malicious_function); //<--Pass in a malicious function
  unprotected->do_something();        //<--Unprotected access to protected data
}

Guideline to keep in mind: don't pass pointers or references to protected data outside the scope of the lock, whether by returning them from a function, storing them in externally visible memory, or passing them as arguments to user-supplied functions.
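
One hedged alternative (my own sketch, not from the book) is to hand out a copy of the protected data rather than a reference, so nothing that escapes the lock aliases the shared state:

class data_wrapper_by_copy
{
private:
  some_data data;
  std::mutex m;
public:
  some_data get_copy()
  {
    std::lock_guard<std::mutex> l(m);
    return data;   // the caller receives its own copy; the original stays protected
  }
};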

Spotting race conditions inherent in interfaces

  1. Note that races can also occur if the locking is too fine-grained. For example, deleting a node from a doubly linked list requires locking three nodes (the node itself and both neighbours) before the operation is safe.
  2. Even if every operation locks the whole structure internally, the interface itself can still cause races, for example if an operation hands back a reference into the structure. These are design errors by the data structure's designer; even a lock-free implementation suffers from them if the interface is wrong.

The following stack interface (essentially that of std::stack) has such problems built in.

template<typename T,typename Container=std::deque<T> >
class stack
{
public:
  explicit stack(const Container&);
  explicit stack(Container&& = Container());
  template <class Alloc> explicit stack(const Alloc&);
  template <class Alloc> stack(const Container&, const Alloc&);
  template <class Alloc> stack(Container&&, const Alloc&);
  template <class Alloc> stack(stack&&, const Alloc&);
  bool empty() const;
  size_t size() const;
  T& top();
  T const& top() const;
  void push(T const&);
  void push(T&&);
  void pop();
  void swap(stack&&);
};
  1. The results of empty() and
    size() can only be trusted at the instant the call returns. Once the call has returned, other threads may have pushed or popped elements, so the result is already stale. This is inherent to multithreaded code; in a single-threaded program these problems do not exist.

    stack<int> s;
    if(!s.empty())
    {
        int const value=s.top(); // Calling top() on an empty stack is undefined behavior even in single-threaded code; it may well crash with a segmentation fault.
        s.pop();
        do_something(value);
    }
    //(sai: There will be a problem if this code is executed concurrently)
    

    How to solve?

    • Redesign the stack so that illegal calls (e.g. top() on an empty stack) throw an exception. This forces users to catch exceptions even when they have just checked empty().

    • The code above is problematic in the following multithreaded case.

      Thread A                                   Thread B
      if(!s.empty())
                                                 if(!s.empty())
      int const value=s.top();
                                                 int const value=s.top();
      s.pop();
      do_something(value);                       s.pop();
                                                 do_something(value);

      In this interleaving both threads read the same top() value, so do_something() is called twice with one value while the other popped element is discarded without ever being processed. Whether that is acceptable depends on what do_something() does; and if the stack held only one element, the second pop() would run on an empty stack, which is undefined behavior.

  2. std::stack separates top() and pop() deliberately. If pop() both returned the value and removed it, and copying a large element threw an exception (e.g. std::bad_alloc) while the value was being returned, the element would already be gone from the stack and would be lost. Splitting the operations lets the copy (top) happen before the removal (pop), so nothing is lost if the copy throws.

  3. It is precisely this separation of top() and
    pop() that causes the interface race described above. It can be avoided in the ways below, but each has a cost.

Solution

  1. OPTION 1: PASS IN A REFERENCE

    Make pop() thread-safe by having the caller pass in, by reference, a variable that receives the popped value:

    std::vector<int> result;
    some_stack.pop(result);
    

    Disadvantages:

    1. The caller must construct the destination object (result above) before the call.
    2. Constructing that object may be expensive.
    3. Its constructor may need parameters that simply aren't available at that point in the code.
    4. The stored type must be assignable; some user-defined types are not.
  2. OPTION 2: REQUIRE A NO-THROW COPY CONSTRUCTOR OR MOVE CONSTRUCTOR

    Restrict the stack to element types whose copy constructor or move constructor cannot throw.

    Be careful:

    1. You can use std::is_nothrow_copy_constructible and
      std::is_nothrow_move_constructible at compile time to detect whether a type's copy or move constructor is declared not to throw.
    2. The main disadvantage is that many types do not have a no-throw copy or move constructor, so they would be excluded.
  3. OPTION 3: RETURN A POINTER TO THE POPPED ITEM

    Make pop() return a std::shared_ptr<> pointing at the popped item instead of returning the value itself.

    Disadvantages:

    Dynamically allocating memory for every popped item adds overhead, which is significant for simple types such as int.

  4. OPTION 4: PROVIDE BOTH OPTION 1 AND EITHER OPTION 2 OR 3

    Provide the reference-taking pop() of option 1 together with option 2 or option 3, and let the user pick whichever suits them. (The example definition below offers options 1 and 3.)

  5. EXAMPLE DEFINITION OF A THREAD-SAFE STACK

    (std::shared_ptr<> destroys the managed object when the last reference to it goes away.)

    #include <exception>
    #include <memory>                                                //<--For std::shared_ptr<>
    struct empty_stack: std::exception
    {
      const char* what() const throw();
    };
    template<typename T>
    class threadsafe_stack
    {
    public:
      threadsafe_stack();
      threadsafe_stack(const threadsafe_stack&);
      threadsafe_stack& operator=(const threadsafe_stack&) = delete; //<--1 Assignment operator is deleted
      void push(T new_value);
      std::shared_ptr<T> pop();
      void pop(T& value);
      bool empty() const;
    };
    

    (Thread-safe stack implementation)

    #include <exception>
    #include <memory>
    #include <mutex>
    #include <stack>
    struct empty_stack: std::exception
    {
      const char* what() const throw();//(sai: throw() is a dynamic exception specification: an empty list means the function throws nothing, throw(int) would mean it may throw an int. It is deprecated; modern code uses noexcept.)
    };
    template<typename T>
    class threadsafe_stack
    {
    private:
      std::stack<T> data;
      mutable std::mutex m;
    public:
      threadsafe_stack(){}
      threadsafe_stack(const threadsafe_stack& other)
      {
        std::lock_guard<std::mutex> lock(other.m);
        data=other.data;                                               //<--1 Copy performed in constructor body
      }
      threadsafe_stack& operator=(const threadsafe_stack&) = delete;
      void push(T new_value)
      {
        std::lock_guard<std::mutex> lock(m);
        data.push(new_value);
      }
      std::shared_ptr<T> pop()
      {
        std::lock_guard<std::mutex> lock(m);
        if(data.empty()) throw empty_stack();                          //<--Check for empty before trying to pop value
        std::shared_ptr<T> const res(std::make_shared<T>(data.top())); //<--Allocate return value before modifying stack
        data.pop();
        return res;
      }
      void pop(T& value)
      {
        std::lock_guard<std::mutex> lock(m);
        if(data.empty()) throw empty_stack();
        value=data.top();
        data.pop();
      }
      bool empty() const
      {
        std::lock_guard<std::mutex> lock(m);
        return data.empty();
      }
    };
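
    A brief usage sketch (mine, not from the book). It assumes empty_stack::what() has been given a definition somewhere, e.g. returning "empty stack":

    #include <iostream>
    int main()
    {
      threadsafe_stack<int> s;
      s.push(1);
      s.push(2);
      std::shared_ptr<int> p=s.pop();  // pointer-returning overload, yields 2
      std::cout<<*p<<'\n';
      int value;
      s.pop(value);                    // reference-taking overload, yields 1
      std::cout<<value<<'\n';
      try { s.pop(); }                 // popping an empty stack throws
      catch(empty_stack const& e) { std::cout<<e.what()<<'\n'; }
    }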
    

Deadlock: the problem and a solution

Be careful:

  1. When you need to hold more than one mutex at a time, always acquiring them in the same order in every thread avoids deadlock. The trouble is that this discipline is hard to maintain: if you are careless, or someone unaware of the rule changes the code, the locking order can change and deadlock comes back.
  2. C++ provides std::lock, which locks several mutexes in one operation without risk of deadlock.
class some_big_object;
void swap(some_big_object& lhs,some_big_object& rhs);
class X
{
private:
  some_big_object some_detail;
  std::mutex m;
public:
  X(some_big_object const& sd):some_detail(sd){}
  friend void swap(X& lhs, X& rhs)
  {
    if(&lhs==&rhs)
      return;
    std::lock(lhs.m,rhs.m);                                    //<--1
    //(Lock both mutexes in a single operation)
    std::lock_guard<std::mutex> lock_a(lhs.m,std::adopt_lock); //<--2
    std::lock_guard<std::mutex> lock_b(rhs.m,std::adopt_lock); //<--3
    //(std::adopt_lock tells std::lock_guard not to lock in its constructor: the mutex is already locked by std::lock above, so the guard merely adopts ownership and will unlock on destruction.)
    swap(lhs.some_detail,rhs.some_detail);
  }
};

C++17 provides std::scoped_lock, which can lock several mutexes (even of different mutex types) at once. Here is the swap above rewritten to use it.

void swap(X& lhs, X& rhs)
{
    if(&lhs==&rhs)
        return;
    std::scoped_lock guard(lhs.m,rhs.m); // C++17 class template argument deduction deduces the mutex types automatically.
    swap(lhs.some_detail,rhs.some_detail);
}

The deduction is equivalent to writing the types out explicitly:

std::scoped_lock<std::mutex,std::mutex> guard(lhs.m,rhs.m);

Even with these mechanisms, deadlock can still happen; they only rule out these particular cases. You still need to stay alert in your own code.

Further guidelines for avoiding deadlock

Deadlock doesn't require locks at all: if two threads call join() on each other's std::thread objects, each waits forever for the other to finish. More generally, any set of threads whose join() calls form a cycle will deadlock. The guiding principle is: don't wait for another thread if there's any chance it is waiting for you.

AVOID NESTED LOCKS

Don't acquire a lock if you already hold one. If you genuinely need several locks, acquire them together in a single operation with std::lock or std::scoped_lock.

AVOID CALLING USER-SUPPLIED CODE WHILE HOLDING A LOCK

Don't call user-supplied code while holding a lock: that code may try to acquire a lock itself, which nests locks and can lead to deadlock, and you have no control over what it does.

ACQUIRE LOCKS IN A FIXED ORDER

  1. If the mutexes can't all be acquired in a single std::lock call, acquire them in the same fixed order in every thread (see the sketch below).
  2. That order can be hard to guarantee. For example, deleting a node from a doubly linked list requires locking the node and both of its neighbours; if two threads traverse the list in opposite directions they will try to take those locks in opposite orders and can deadlock. One fix is to forbid reverse traversal, so every thread always locks nodes in the same direction and therefore in the same order.
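
A minimal sketch of the fixed-order rule (the mutex and function names are made up): every thread that needs both mutexes takes mutex_a before mutex_b, so no cycle of waiting can form.

#include <mutex>

std::mutex mutex_a;   // by convention: always locked first
std::mutex mutex_b;   // by convention: always locked second

void worker_one()
{
  std::lock_guard<std::mutex> la(mutex_a);
  std::lock_guard<std::mutex> lb(mutex_b);
  // ... work on data protected by both mutexes ...
}

void worker_two()
{
  std::lock_guard<std::mutex> la(mutex_a);   // same order as worker_one
  std::lock_guard<std::mutex> lb(mutex_b);
  // ... work ...
}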

USE A LOCK HIERARCHY

The idea is to divide your mutexes into layers and only allow a thread to lock a mutex from a lower layer than any mutex it already holds.
The standard library doesn't provide such a hierarchical mutex, so you have to write one yourself; the usage below assumes the hierarchical_mutex class implemented afterwards.

hierarchical_mutex high_level_mutex(10000);                 //<--1
hierarchical_mutex low_level_mutex(5000);                   //<--2
int do_low_level_stuff();
int low_level_func()                                   
{
  std::lock_guard<hierarchical_mutex> lk(low_level_mutex);  //<--3
  return do_low_level_stuff();
}

void high_level_stuff(int some_param);
void high_level_func()                                 
{
  std::lock_guard<hierarchical_mutex> lk(high_level_mutex); //<--4
  high_level_stuff(low_level_func());                       //<--5
}                                          
void thread_a()                                             //<--6
{
  high_level_func();
}
hierarchical_mutex other_mutex(100);                        //<--7
void do_other_stuff();
void other_stuff()                                     
{
  high_level_func();                                        //<--8
  do_other_stuff();
}
void thread_b()                                             //<--9
{
  std::lock_guard<hierarchical_mutex> lk(other_mutex);      //<--10
  other_stuff();
}

Implementation code:
For hierarchical_mutex to be usable with the standard RAII lock types (std::lock_guard, std::unique_lock), it must provide the three member functions they require: lock(), unlock() and
try_lock().

class hierarchical_mutex                                  
{
  std::mutex internal_mutex;
  unsigned long const hierarchy_value;
  unsigned long previous_hierarchy_value;
  static thread_local unsigned long this_thread_hierarchy_value; //<--1

  void check_for_hierarchy_violation()                    
  {                                                       
    if(this_thread_hierarchy_value <= hierarchy_value)           //<--2
      {
        throw std::logic_error("mutex hierarchy violated");
      }
  }
  void update_hierarchy_value()
  {
    previous_hierarchy_value=this_thread_hierarchy_value;        //<--3
    this_thread_hierarchy_value=hierarchy_value;
  }
public:
  explicit hierarchical_mutex(unsigned long value):
    hierarchy_value(value),
    previous_hierarchy_value(0)
  {}
  void lock()
  {
    check_for_hierarchy_violation();
    internal_mutex.lock();                                       //<--4
    update_hierarchy_value();                                    //<--5
  }
  void unlock()
  {
    this_thread_hierarchy_value=previous_hierarchy_value;        //<--6
    internal_mutex.unlock();                              
  }
  bool try_lock()
  {
    check_for_hierarchy_violation();                      
    if(!internal_mutex.try_lock())                               //<--7
      return false;
    update_hierarchy_value();
    return true;
  }
};

thread_local unsigned long
  hierarchical_mutex::this_thread_hierarchy_value(ULONG_MAX);    //<--8

thread_local means that each thread has its own independent copy of this variable.
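
A small sketch of the behaviour (mine, not from the book), reusing the mutexes defined earlier and assuming <stdexcept> is included for std::logic_error: because this_thread_hierarchy_value is thread_local, each thread tracks its own current level, and trying to lock upwards throws.

void wrong_order()                                                   // runs in some thread
{
  std::lock_guard<hierarchical_mutex> lk_low(low_level_mutex);       // level 5000: allowed
  try
  {
    std::lock_guard<hierarchical_mutex> lk_high(high_level_mutex);   // level 10000 > 5000: violation
  }
  catch(std::logic_error const& e)
  {
    // e.what() reports "mutex hierarchy violated"
  }
}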

EXTENDING THESE GUIDELINES BEYOND LOCKS

  1. Avoid waiting for another thread while you hold a lock, since that thread may need the lock before it can finish.
  2. If you need to wait for a thread to finish, it's best to join it in the same function that started it.
  3. The hierarchy idea can be extended to threads: a thread should only wait for threads lower down the hierarchy than itself, so the waits can never form a cycle.

Flexible locking with std::unique_lock

The std::lock_guard seen so far is inflexible: you cannot unlock it partway through, you have to wait until the scope ends, and it always locks in its constructor. If you want to release the lock early, or to construct the guard without locking yet, you need
std::unique_lock.

class some_big_object;
void swap(some_big_object& lhs,some_big_object& rhs);
class X
{
private:
  some_big_object some_detail;
  std::mutex m;
public:
  X(some_big_object const& sd):some_detail(sd){}
  friend void swap(X& lhs, X& rhs)
  {
    if(&lhs==&rhs)
      return;
    std::unique_lock<std::mutex> lock_a(lhs.m,std::defer_lock); // std::defer_lock does not lock here in the constructor. You need to lock it manually later.
    std::unique_lock<std::mutex> lock_b(rhs.m,std::defer_lock); 
    std::lock(lock_a,lock_b);                                   // Where it is actually locked.
    //(sai: because the locking was deferred, the unique_lock objects themselves can be passed to std::lock: unique_lock provides the lock(), try_lock() and unlock() members std::lock needs.)
    swap(lhs.some_detail,rhs.some_detail);
  }
};
  1. std::unique_lock is more flexible than std::lock_guard, but it costs a little more space and time, because it has to store and update a flag recording whether it currently owns the mutex. Prefer lock_guard unless you need the flexibility, e.g. the deferred locking in the example above.
  2. If you want to transfer ownership of a lock between scopes, you must use unique_lock.
  3. If C++17 is available, std::scoped_lock handles the multi-mutex case above more simply.

Transferring mutex ownership between scopes

std::unique_lock is movable but not copyable.
But lock_guard is neither movable nor copyable.

std::unique_lock<std::mutex> get_lock()
{
  extern std::mutex some_mutex;
  std::unique_lock<std::mutex> lk(some_mutex);
  prepare_data();
  return lk;                                   //<--1 lk is returned by move; the mutex stays locked during the transfer
}
void process_data()
{
  std::unique_lock<std::mutex> lk(get_lock()); //<--2 ownership of the lock is received here
  do_something();
}

Locking at an appropriate granularity

Don't do I/O while holding a mutex: file I/O can be hundreds or thousands of times slower than reading or writing the same amount of data from memory, so holding the lock across it blocks other threads for far too long.

void get_and_process_data()
{
  std::unique_lock<std::mutex> my_lock(the_mutex);
  some_class data_to_process=get_next_data_chunk();
  my_lock.unlock(); //<-- 1 Don't need mutex locked across call to process()
  result_type result=process(data_to_process);
  my_lock.lock();   //<-- 2 Relock mutex to write result
  write_result(data_to_process,result);
}

In general, hold a lock only for the operations that actually need it; don't do unrelated work, and above all don't wait for I/O or other slow operations, while the mutex is held.

class Y
{
private:
  int some_detail;
  mutable std::mutex m;
  int get_detail() const
  {
    std::lock_guard<std::mutex> lock_a(m); //<--1
    return some_detail;
  }
public:
  Y(int sd):some_detail(sd){}
  friend bool operator==(Y const& lhs, Y const& rhs)
  {
    if(&lhs==&rhs)
      return true;
    int const lhs_value=lhs.get_detail();  //<--2
    int const rhs_value=rhs.get_detail();  //<--3
    return lhs_value==rhs_value;           //<--4
  }
};
  1. The code above locks at too fine a granularity: the lock is released between the two get_detail() calls, so another thread can change lhs or rhs in the gap. operator== may then return true even though the two values were never equal at the same instant; the comparison is not made on a consistent snapshot. (One possible fix is sketched below.)
  2. The opposite extreme has its own cost: locking too coarsely, with one big lock held for everything, can serialize the threads so that the program effectively runs single-threaded.
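
One hedged fix (my own sketch, not from the book): lock both objects' mutexes together for the whole comparison, trading some granularity for a consistent snapshot. With C++17's std::scoped_lock the two mutexes are acquired without deadlock risk:

  friend bool operator==(Y const& lhs, Y const& rhs)
  {
    if(&lhs==&rhs)
      return true;
    std::scoped_lock guard(lhs.m,rhs.m);   // both mutexes held for the comparison
    return lhs.some_detail==rhs.some_detail;
  }
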
Tags: Programming

Posted on Sun, 08 Mar 2020 21:24:54 -0400 by mothermugger