Why use a distributed lock?
Before discussing this issue, let's take a look at a business scenario:
Inventory oversold
System A is an e-commerce system, currently deployed on a single machine. It has an interface for users to place orders, but before an order is placed, the system must check the inventory to make sure there is enough stock.
Because the system has a certain level of concurrency, the product inventory is saved in redis in advance, and the redis inventory is updated when a user places an order.
At this time, the system architecture is as follows:
But this causes a problem: suppose at some moment the redis inventory of a product is 1, and two requests arrive at the same time. One of them has executed step 3 above and updated the database inventory to 0, but has not yet executed step 4.
If the other request reaches step 2 at that moment, the inventory it reads is still 1, so it proceeds to step 3.
As a result, two items are sold while only one was in stock.
Obviously that's wrong! This is a typical oversold problem.
At this point, it's easy to think of a solution: wrap steps 2, 3 and 4 in a lock, so that only after one thread has finished them can another thread come in and execute step 2.
As the figure above shows, when executing step 2, acquire a lock using the synchronized keyword or ReentrantLock provided by Java, and release it after step 4 finishes.
In this way, steps 2, 3 and 4 are "locked", and multiple threads can only execute them serially.
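To make this concrete, here is a minimal sketch of the single-JVM approach. The redis and database calls from the figure are represented by placeholder methods, so this is illustrative only:

```java
import java.util.concurrent.locks.ReentrantLock;

// Minimal sketch of the single-JVM approach: steps 2-4 (check stock, update DB, write back)
// are serialized by one in-process lock. The redis/database calls are placeholders.
public class OrderService {

    private final ReentrantLock stockLock = new ReentrantLock();

    public boolean placeOrder(String productId) {
        stockLock.lock();                                // steps 2-4 run under the lock
        try {
            int stock = readStockFromRedis(productId);   // step 2: check stock
            if (stock <= 0) {
                return false;                            // sold out, reject the order
            }
            updateDatabaseStock(productId, stock - 1);   // step 3: update the database
            writeStockToRedis(productId, stock - 1);     // step 4: write back to redis
            return true;
        } finally {
            stockLock.unlock();                          // always release, even on exception
        }
    }

    // Placeholder methods standing in for the real redis/database access
    private int readStockFromRedis(String productId) { return 1; }
    private void updateDatabaseStock(String productId, int newStock) { }
    private void writeStockToRedis(String productId, int newStock) { }
}
```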
Service cluster deployment
But the good times don't last. The concurrency of the whole system soars, and one machine can no longer handle the load, so a second machine has to be added, as shown below:
After adding the machine, the system becomes what the figure above shows. My goodness!
If two users' requests arrive at the same time but land on different machines, the two requests can execute simultaneously, and the inventory oversold problem appears again.
Why?
- Because the two A systems in the figure above run in two different JVMs, the locks they acquire are only valid for threads inside their own JVM and have no effect on threads in the other JVM.
- So the problem is that Java's native lock mechanism fails in a multi-machine deployment, because the locks acquired on the two machines are not the same lock (the two locks live in different JVMs).
- Then, as long as we make sure the two machines acquire the same lock, isn't the problem solved?
This is where the distributed lock comes in. The idea behind a distributed lock is:
Provide a globally unique "thing" for the whole system to acquire locks from. Whenever a system needs to lock, it asks this "thing" for a lock, so that different systems end up holding the same lock.
As for this "thing", it can be Redis, Zookeeper or database.
The text description is not very intuitive. Let's look at the following figure:
Through the above analysis, we know that in the inventory oversold scenario, Java's native lock mechanism cannot guarantee thread safety when the system is deployed across machines, so we need a distributed lock solution.
So how do we implement a distributed lock? Read on!
Implementation of distributed lock based on Redis
We analyzed above why a distributed lock is needed; here we look at how to implement one in practice. Extension: How does Redisson implement distributed locks?
The most common solution is to use Redis as the distributed lock.
The idea of using redis as a distributed lock is: set a value in redis to indicate that the lock is held, and delete the key when releasing the lock.
The specific code is as follows:
```
// Acquire lock
// NX means the SET only succeeds if the key does not exist; if the key already exists it fails.
// PX specifies the expiration time in milliseconds
SET anyLock unique_value NX PX 30000

// Release lock: done by executing a lua script
// Releasing involves two commands, which are not atomic on their own.
// redis executes a lua script atomically, so we rely on its lua scripting support
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end
```
There are several key points to this approach:
- Be sure to use the SET key value NX PX milliseconds command
If you don't, and instead set the value first and then set the expiration time separately, the operation is not atomic. The process may crash before the expiration time is set, resulting in a deadlock (the key exists forever).
- The value must be unique
This is so that, when unlocking, we can verify the value matches the one we set before deleting the key.
It avoids the following situation: suppose client A acquires the lock with a 30s expiration time; after 35s the lock has already expired and been released automatically, and client B may have acquired it in the meantime. When A then releases the lock, it must not delete B's lock. A Java sketch that puts these two points together follows this list.
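Here is a small sketch combining the two points above, assuming the Jedis client (3.x API); the key name and the generation of uniqueValue are left to the caller:

```java
import java.util.Collections;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

// Sketch of the two key points: acquire with a single SET ... NX PX command,
// release with a Lua script that only deletes the key if the value still matches ours.
public class RedisLockSketch {

    private static final String RELEASE_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "  return redis.call('del', KEYS[1]) " +
            "else " +
            "  return 0 " +
            "end";

    public static boolean tryLock(Jedis jedis, String key, String uniqueValue, long ttlMillis) {
        // NX: only set if absent; PX: expire after ttlMillis. One atomic command.
        String result = jedis.set(key, uniqueValue, SetParams.setParams().nx().px(ttlMillis));
        return "OK".equals(result);
    }

    public static boolean unlock(Jedis jedis, String key, String uniqueValue) {
        // The Lua script checks ownership and deletes atomically
        Object result = jedis.eval(RELEASE_SCRIPT,
                Collections.singletonList(key),
                Collections.singletonList(uniqueValue));
        return Long.valueOf(1L).equals(result);
    }
}
```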
Besides how the client implements the distributed lock, the redis deployment mode also needs to be considered.
redis can be deployed in three ways:
- standalone mode
- Master slave + sentinel election mode
- redis cluster mode
The drawback of using redis as a distributed lock is that in standalone mode there is a single point of failure: as soon as redis goes down, locking becomes impossible.
In master-slave mode, a lock is only written to one node when acquired. Even with sentinel providing high availability, if the master fails and a master-slave switch happens, the lock may be lost.
With these issues in mind, the author of redis proposed the RedLock algorithm, which goes roughly like this:
Suppose redis is deployed with 5 master nodes in total. A lock is acquired through the following steps:
- Get the current timestamp in milliseconds
- Try to create the lock on each master node in turn, using a short timeout per node, usually a few tens of milliseconds
- Try to establish the lock on a majority of the nodes, e.g. 3 out of 5 nodes (n / 2 + 1)
- The client calculates how long establishing the lock took; if this is less than the lock's validity time, acquisition is considered successful (a rough sketch of this check follows the list)
- If acquiring the lock fails, delete the locks already created on the nodes one by one
- As long as someone else holds the distributed lock, you have to keep polling to try to acquire it
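As a rough illustration of the quorum and timing check described above (not the full algorithm), here is a sketch; tryLockOnNode and unlockOnNode are hypothetical placeholders for the per-node SET NX PX and Lua release:

```java
import java.util.ArrayList;
import java.util.List;

// Rough sketch of the RedLock quorum/timing check.
public class RedLockSketch {

    public boolean tryRedLock(List<String> masters, String key, String value, long lockTtlMillis) {
        long start = System.currentTimeMillis();          // step 1: current timestamp
        List<String> lockedNodes = new ArrayList<>();

        for (String node : masters) {                     // step 2: try each master in turn
            if (tryLockOnNode(node, key, value, 50)) {    // short per-node timeout in ms
                lockedNodes.add(node);
            }
        }

        long elapsed = System.currentTimeMillis() - start;
        int quorum = masters.size() / 2 + 1;              // step 3: majority, e.g. 3 of 5

        // step 4: success only if a majority locked and the validity time is not used up
        if (lockedNodes.size() >= quorum && elapsed < lockTtlMillis) {
            return true;
        }

        // step 5: otherwise roll back whatever we did manage to lock
        for (String node : lockedNodes) {
            unlockOnNode(node, key, value);
        }
        return false;
    }

    // Hypothetical per-node operations (SET key value NX PX ttl / Lua release)
    private boolean tryLockOnNode(String node, String key, String value, long timeoutMillis) { return false; }
    private void unlockOnNode(String node, String key, String value) { }
}
```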
However, this algorithm is still controversial; there may be many corner cases, and it cannot guarantee that the locking process is always correct.
Another way: Redisson
Besides implementing the distributed lock yourself on top of the redis client's native API, you can also use an open source framework: Redisson.
Redisson is an enterprise-grade open source Redis client that also provides distributed lock support. I highly recommend it. Why?
Recall that if you write your own code to set a value in redis, you do it with the following command:
- SET anyLock unique_value NX PX 30000
The timeout set here is 30s. If my business logic takes longer than 30s, the key expires and another thread may acquire the lock.
Then, while the first thread hasn't finished its business logic, a second thread comes in, and we have a thread-safety problem again. So we would have to maintain and extend the expiration time ourselves, which is too much trouble.
Let's see how redisson handles it. First, feel how pleasant redisson is to use:
```java
Config config = new Config();
config.useClusterServers()
    .addNodeAddress("redis://192.168.31.101:7001")
    .addNodeAddress("redis://192.168.31.101:7002")
    .addNodeAddress("redis://192.168.31.101:7003")
    .addNodeAddress("redis://192.168.31.102:7001")
    .addNodeAddress("redis://192.168.31.102:7002")
    .addNodeAddress("redis://192.168.31.102:7003");

RedissonClient redisson = Redisson.create(config);

RLock lock = redisson.getLock("anyLock");
lock.lock();
lock.unlock();
```
It's that simple. We only need the lock and unlock calls of its API to use a distributed lock. It takes care of many details for us:
- All of redisson's lock instructions are executed via Lua scripts, which redis runs atomically
- redisson sets the default expiration time of the key to 30s. What if a client holds the lock for longer than 30s?
redisson has a "watchdog" concept: after you acquire the lock, it resets the key's expiration time back to 30s every 10 seconds.
So even if the lock is held for a long time, the key will not expire and other threads cannot grab the lock.
- redisson's "watchdog" logic also ensures that no deadlock occurs.
(If the machine holding the lock goes down, its watchdog disappears with it, so the key's expiration time is no longer extended. After 30s the key expires automatically, and other threads can acquire the lock.)
The implementation code is pasted here:
```java
// Lock logic
private <T> RFuture<Long> tryAcquireAsync(long leaseTime, TimeUnit unit, final long threadId) {
    if (leaseTime != -1) {
        return tryLockInnerAsync(leaseTime, unit, threadId, RedisCommands.EVAL_LONG);
    }
    // Calls a lua script that sets the key and its expiration time
    RFuture<Long> ttlRemainingFuture = tryLockInnerAsync(
            commandExecutor.getConnectionManager().getCfg().getLockWatchdogTimeout(),
            TimeUnit.MILLISECONDS, threadId, RedisCommands.EVAL_LONG);
    ttlRemainingFuture.addListener(new FutureListener<Long>() {
        @Override
        public void operationComplete(Future<Long> future) throws Exception {
            if (!future.isSuccess()) {
                return;
            }
            Long ttlRemaining = future.getNow();
            // lock acquired
            if (ttlRemaining == null) {
                // Watchdog logic
                scheduleExpirationRenewal(threadId);
            }
        }
    });
    return ttlRemainingFuture;
}

<T> RFuture<T> tryLockInnerAsync(long leaseTime, TimeUnit unit, long threadId, RedisStrictCommand<T> command) {
    internalLockLeaseTime = unit.toMillis(leaseTime);
    return commandExecutor.evalWriteAsync(getName(), LongCodec.INSTANCE, command,
            "if (redis.call('exists', KEYS[1]) == 0) then " +
                "redis.call('hset', KEYS[1], ARGV[2], 1); " +
                "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                "return nil; " +
            "end; " +
            "if (redis.call('hexists', KEYS[1], ARGV[2]) == 1) then " +
                "redis.call('hincrby', KEYS[1], ARGV[2], 1); " +
                "redis.call('pexpire', KEYS[1], ARGV[1]); " +
                "return nil; " +
            "end; " +
            "return redis.call('pttl', KEYS[1]);",
            Collections.<Object>singletonList(getName()), internalLockLeaseTime, getLockName(threadId));
}

// The watchdog ends up calling this method
private void scheduleExpirationRenewal(final long threadId) {
    if (expirationRenewalMap.containsKey(getEntryName())) {
        return;
    }
    // This task is scheduled to run 10s later (internalLockLeaseTime / 3)
    Timeout task = commandExecutor.getConnectionManager().newTimeout(new TimerTask() {
        @Override
        public void run(Timeout timeout) throws Exception {
            // This call resets the key's expiration time to 30s
            RFuture<Boolean> future = renewExpirationAsync(threadId);
            future.addListener(new FutureListener<Boolean>() {
                @Override
                public void operationComplete(Future<Boolean> future) throws Exception {
                    expirationRenewalMap.remove(getEntryName());
                    if (!future.isSuccess()) {
                        log.error("Can't update lock " + getName() + " expiration", future.cause());
                        return;
                    }
                    if (future.getNow()) {
                        // reschedule itself
                        // Calls this method again to keep extending the expiration time
                        scheduleExpirationRenewal(threadId);
                    }
                }
            });
        }
    }, internalLockLeaseTime / 3, TimeUnit.MILLISECONDS);
    if (expirationRenewalMap.putIfAbsent(getEntryName(), new ExpirationEntry(threadId, task)) != null) {
        task.cancel();
    }
}
```
In addition, redisson also provides support for the redlock algorithm.
Its usage is also simple:
```java
RedissonClient redisson = Redisson.create(config);

RLock lock1 = redisson.getFairLock("lock1");
RLock lock2 = redisson.getFairLock("lock2");
RLock lock3 = redisson.getFairLock("lock3");

RedissonRedLock multiLock = new RedissonRedLock(lock1, lock2, lock3);
multiLock.lock();
multiLock.unlock();
```
Summary:
This section analyzed the concrete scheme of using redis as a distributed lock and some of its limitations, and then introduced the Redisson client framework, which I recommend: it spares you many details compared with writing the implementation yourself.
Implementation of distributed lock based on zookeeper
Among the common distributed lock implementations, besides redis, zookeeper can also be used to implement distributed locks.
Before introducing the mechanism of implementing distributed locks with zookeeper (hereafter abbreviated as zk), let's briefly describe what zk is:
Zookeeper is a centralized service that provides configuration management, distributed coordination and naming.
The zk model is as follows: zk contains a series of nodes, called znodes. Like a file system, each znode represents a directory. znodes have some useful properties:
- Ordered node: suppose a parent node is /lock; we can create child nodes under it.
Zookeeper provides an optional ordering feature. For example, we can create the child node "/lock/node-" and mark it as sequential. When zookeeper creates the child node, it automatically appends an increasing integer sequence number based on the current number of children.
That is, the first child created is /lock/node-0000000000, the next is /lock/node-0000000001, and so on.
- Ephemeral (temporary) node: a client can create an ephemeral node; when the session ends or times out, zookeeper deletes the node automatically.
- Event watching: when reading data, we can also set a watch on the node. When the node's data or structure changes, zookeeper notifies the client. Zookeeper currently has the following four events (a small sketch of these primitives follows the list):
- Node creation
- Node deletion
- Node data modification
- Child node change
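As a small illustration of these primitives, here is a sketch assuming the raw org.apache.zookeeper client, a zk server on 127.0.0.1:2181, and an already existing /lock parent node:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Sketch of the zk features used for locking: ephemeral sequential nodes and watches.
public class ZkPrimitivesSketch {

    public static void main(String[] args) throws Exception {
        // The session watcher receives the four event types mentioned above
        // (NodeCreated, NodeDeleted, NodeDataChanged, NodeChildrenChanged)
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 30000,
                event -> System.out.println("event: " + event.getType() + " on " + event.getPath()));

        // EPHEMERAL_SEQUENTIAL: zk appends an increasing suffix, e.g. /lock/node-0000000000,
        // and removes the node automatically when the session ends
        String path = zk.create("/lock/node-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println("created " + path);

        // Passing watch=true registers a one-shot watch: the session watcher above is
        // notified with NodeChildrenChanged when a child of /lock is added or removed
        zk.getChildren("/lock", true);
    }
}
```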
Based on these zk features, we can easily arrive at a concrete scheme for implementing a distributed lock with zk:
- Using zk's ephemeral and sequential nodes, each thread acquires the lock by creating an ephemeral sequential node in zk, for example under the /lock/ directory.
- After the node is created successfully, get all the ephemeral nodes under the /lock directory and check whether the node created by the current thread has the smallest sequence number of them all.
- If the current thread's node has the smallest sequence number, the lock is considered acquired.
- If it does not, add an event watch on the node whose sequence number immediately precedes its own.
For example, if the current thread's node is /lock/003 and the full node list is [/lock/001, /lock/002, /lock/003], it adds a watch on /lock/002.
When a lock is released, the node with the next sequence number is woken up and performs step 3 again to check whether its own sequence number is now the smallest.
For example, when /lock/001 is released, /lock/002 receives the event. The node set is now [/lock/002, /lock/003], and /lock/002 has the smallest sequence number, so it acquires the lock.
The whole process is as follows:
This is the concrete implementation scheme. A full, production-ready implementation is fairly involved, so it is not reproduced in full here.
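That said, a bare-bones sketch of the acquire flow, assuming the raw ZooKeeper client and ignoring retries, session handling and error cases, might look like the following. In practice you would not hand-roll this; the Curator client introduced next handles these details.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

// Bare-bones sketch of the zk lock acquire flow described above.
public class ZkLockSketch {

    public static String acquire(ZooKeeper zk) throws Exception {
        // step 1: create our own ephemeral sequential node under /lock
        String ourPath = zk.create("/lock/node-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        String ourNode = ourPath.substring("/lock/".length());

        while (true) {
            // step 2: list all children and sort them by sequence number
            List<String> children = zk.getChildren("/lock", false);
            Collections.sort(children);

            int ourIndex = children.indexOf(ourNode);
            if (ourIndex == 0) {
                return ourPath;                       // step 3: smallest node, lock acquired
            }

            // step 4: watch the node right before ours and block until it is deleted
            String previous = "/lock/" + children.get(ourIndex - 1);
            CountDownLatch latch = new CountDownLatch(1);
            Watcher watcher = event -> {
                if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                    latch.countDown();
                }
            };
            if (zk.exists(previous, watcher) != null) {
                latch.await();                        // woken up when the previous lock is released
            }
            // loop again and re-check whether we are now the smallest
        }
    }

    public static void release(ZooKeeper zk, String ourPath) throws Exception {
        zk.delete(ourPath, -1);                       // deleting our node wakes up the next waiter
    }
}
```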
Introduction to Curator
Curator is an open source zookeeper client that also provides a distributed lock implementation.
Its usage is also quite simple:
```java
InterProcessMutex interProcessMutex = new InterProcessMutex(client, "/anyLock");
interProcessMutex.acquire();
interProcessMutex.release();
```
The core source code of the distributed lock is as follows:
```java
private boolean internalLockLoop(long startMillis, Long millisToWait, String ourPath) throws Exception {
    boolean haveTheLock = false;
    boolean doDelete = false;
    try {
        if ( revocable.get() != null ) {
            client.getData().usingWatcher(revocableWatcher).forPath(ourPath);
        }

        while ( (client.getState() == CuratorFrameworkState.STARTED) && !haveTheLock ) {
            // Get the sorted collection of all current nodes
            List<String> children = getSortedChildren();
            // Get the name of the current node
            String sequenceNodeName = ourPath.substring(basePath.length() + 1); // +1 to include the slash

            // Determine whether the current node is the smallest node
            PredicateResults predicateResults = driver.getsTheLock(client, children, sequenceNodeName, maxLeases);
            if ( predicateResults.getsTheLock() ) {
                // Lock acquired
                haveTheLock = true;
            } else {
                // Lock not acquired: register a watcher on the node just before the current one
                String previousSequencePath = basePath + "/" + predicateResults.getPathToWatch();

                synchronized(this) {
                    Stat stat = client.checkExists().usingWatcher(watcher).forPath(previousSequencePath);
                    if ( stat != null ) {
                        if ( millisToWait != null ) {
                            millisToWait -= (System.currentTimeMillis() - startMillis);
                            startMillis = System.currentTimeMillis();
                            if ( millisToWait <= 0 ) {
                                doDelete = true;    // timed out - delete our node
                                break;
                            }
                            wait(millisToWait);
                        } else {
                            wait();
                        }
                    }
                }
                // else it may have been deleted (i.e. lock released). Try to acquire again
            }
        }
    } catch ( Exception e ) {
        doDelete = true;
        throw e;
    } finally {
        if ( doDelete ) {
            deleteOurPath(ourPath);
        }
    }
    return haveTheLock;
}
```
In fact, the underlying principle of Curator's distributed lock is similar to what we analyzed above. Here is a diagram describing its principle in detail:
Summary:
This section introduced the zookeeper-based distributed lock scheme and the basic usage of zk's open source client Curator, and briefly described its implementation principle. See also: Let's take a look at ZooKeeper's implementation of distributed lock, with an example!
Comparison of advantages and disadvantages between the two schemes
Having seen both implementations of distributed locks, this section discusses the advantages and disadvantages of redis and zk.
The redis distributed lock has the following disadvantages:
- It acquires the lock in a simple and crude way: if the lock cannot be acquired, the client just keeps trying to acquire it directly, which costs performance.
- Put another way, redis's design orientation means its data is not strongly consistent, so in some extreme cases problems can occur. The lock model is not robust enough.
- Even when implemented with the redlock algorithm, there is no 100% guarantee of correctness in some complex scenarios. For the discussion around redlock, see How to do distributed locking.
- In short, a redis distributed lock needs to keep trying to acquire the lock, which consumes performance.
On the other hand, using redis for distributed locks is very common in many companies, and in most cases you will not run into the so-called "extremely complex scenarios".
So using redis as a distributed lock is a sound solution. Most importantly, redis has high performance and can support acquiring and releasing locks under high concurrency.
For zk distributed locks:
- Zookeeper's natural design orientation is distributed coordination and strong consistency. The lock model is robust, easy to use, and well suited to distributed locks.
- If the lock cannot be acquired, you just add a watcher; there is no need to keep polling, so the performance cost is small.
But zk also has its disadvantage: if many clients frequently request and release locks, the pressure on the zk cluster becomes significant.
Summary:
To sum up, redis and zookeeper each have their advantages and disadvantages. We can use these points as reference factors when doing technology selection.
Recommendation
From the previous analysis, there are two common solutions for implementing distributed locks: redis and zookeeper, each with its own strengths. How to choose between them?
Personally, I prefer the lock implemented with zk:
Because redis may have hidden risks that can lead to incorrect data. However, the choice still depends on the specific situation in your company.
If the company already has a zk cluster, prefer zk. But if the company only has a redis cluster and is not in a position to set up and maintain a zk cluster, then redis works too.
In addition, the system designer may consider that the system already depends on redis and not want to introduce yet another external dependency, in which case redis can also be chosen.
This is something the system designer has to weigh based on the architecture.