Today I'm serving up another substantial topic: a very important technical solution in payment systems. If your work involves this kind of business, be sure to try it out yourself!

In development we often run into requirements for delayed tasks, for example:
- If the order has not been paid for 30 minutes, it will be automatically cancelled
- Send a text message to the user 60 seconds after the order is generated
Tasks like these have a standard name: delayed tasks.
This raises a question: how do delayed tasks differ from scheduled tasks?
There are three differences:
- A scheduled task has an explicit trigger time; a delayed task does not
- A scheduled task has an execution cycle; a delayed task runs once, some time after a triggering event, and has no cycle
- Scheduled tasks usually perform batch operations covering many items, while a delayed task usually handles a single item
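To make the contrast concrete, here is a minimal sketch using the JDK's ScheduledExecutorService. It is only an illustration, not one of the schemes analyzed below; the class and method names are made up for this example. A delayed task is submitted once with schedule(), while a periodic task is submitted with scheduleAtFixedRate().

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: counts how often each kind of task runs in a time window.
class DelayedVsScheduled {

    // Delayed task: runs once, delayMs after the triggering event (here, the call itself).
    static int runDelayed(long delayMs, long windowMs) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        pool.schedule(() -> { runs.incrementAndGet(); }, delayMs, TimeUnit.MILLISECONDS);
        sleep(windowMs);
        pool.shutdownNow();
        return runs.get();
    }

    // Scheduled (periodic) task: has an execution cycle and keeps running once per period.
    static int runPeriodic(long periodMs, long windowMs) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        pool.scheduleAtFixedRate(() -> { runs.incrementAndGet(); },
                periodMs, periodMs, TimeUnit.MILLISECONDS);
        sleep(windowMs);
        pool.shutdownNow();
        return runs.get();
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("delayed task ran " + runDelayed(100, 500) + " time(s)");
        System.out.println("periodic task ran " + runPeriodic(100, 500) + " time(s)");
    }
}
```

The delayed task runs exactly once per triggering event, while the periodic task keeps firing until the executor is shut down.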
Next, we analyze each scheme, using order timeout detection as the example.
Scheme analysis
1) Database polling
Approach
This scheme is usually used in small projects: a thread periodically scans the database, uses the order time to find orders that have timed out, and then updates or deletes them.
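The scan step itself can be sketched as follows. This is a minimal illustration in which an in-memory list stands in for the orders table; the Order class, its fields and the findTimedOut method are assumptions made for the example, not part of the original project.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: an in-memory list replaces the real orders table.
class OrderTimeoutScan {

    static final class Order {
        final String id;
        final Instant createdAt;
        final boolean paid;

        Order(String id, Instant createdAt, boolean paid) {
            this.id = id;
            this.createdAt = createdAt;
            this.paid = paid;
        }
    }

    // Returns the ids of unpaid orders created at least `timeout` before `now`.
    // In a real system this would be one SQL query against the orders table.
    static List<String> findTimedOut(List<Order> orders, Instant now, Duration timeout) {
        Instant cutoff = now.minus(timeout);
        List<String> timedOut = new ArrayList<>();
        for (Order order : orders) {
            if (!order.paid && !order.createdAt.isAfter(cutoff)) {
                timedOut.add(order.id);
            }
        }
        return timedOut;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        List<Order> orders = new ArrayList<>();
        orders.add(new Order("A", now.minus(Duration.ofMinutes(31)), false)); // unpaid, too old
        orders.add(new Order("B", now.minus(Duration.ofMinutes(31)), true));  // paid, ignored
        orders.add(new Order("C", now.minus(Duration.ofMinutes(5)), false));  // unpaid, still fresh
        System.out.println(findTimedOut(orders, now, Duration.ofMinutes(30))); // prints [A]
    }
}
```

A scheduler would call this kind of method at a fixed interval and then cancel the returned orders.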
Implementation
I implemented this with Quartz during an internship; here is a brief introduction.
Add the following dependency to the Maven project:
```xml
<dependency>
    <groupId>org.quartz-scheduler</groupId>
    <artifactId>quartz</artifactId>
    <version>2.2.2</version>
</dependency>
```
Demo class MyJob:
```java
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.quartz.Scheduler;
import org.quartz.SimpleScheduleBuilder;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

public class MyJob implements Job {

    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        System.out.println("I'm going to scan the database...");
    }

    public static void main(String[] args) throws Exception {
        // Create the job
        JobDetail jobDetail = JobBuilder.newJob(MyJob.class)
                .withIdentity("job1", "group1")
                .build();
        // Create a trigger that fires every 3 seconds
        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("trigger1", "group3")
                .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                        .withIntervalInSeconds(3)
                        .repeatForever())
                .build();
        Scheduler scheduler = new StdSchedulerFactory().getScheduler();
        // Register the job and its trigger with the scheduler
        scheduler.scheduleJob(jobDetail, trigger);
        // Start scheduling
        scheduler.start();
    }
}
```
Run the code; every 3 seconds it prints:
I'm going to scan the database...
Advantages: simple to implement, and supports running in a cluster
Disadvantages:
- High memory consumption on the server
- Latency: for example, if you scan every 3 minutes, the worst-case delay is 3 minutes
- With tens of millions of orders, scanning every few minutes puts a heavy load on the database
2) JDK delay queue
Approach
This scheme uses the JDK's DelayQueue, an unbounded blocking queue from which an element can be taken only after its delay has expired. Objects placed in a DelayQueue must implement the Delayed interface.
The workflow of a DelayQueue implementation is shown in the following figure:

- poll(): retrieves and removes the expired head of the queue; returns null if no element has expired
- take(): retrieves and removes the expired head of the queue; if no element has expired, the calling thread blocks until one does, then returns it
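The difference between the two calls can be seen in a small sketch. The Item class and the 200 ms delay are assumptions chosen for the illustration: poll() returns immediately with null because nothing has expired yet, while take() blocks until the delay has elapsed.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of poll() versus take() on a DelayQueue.
class PollVsTake {

    static final class Item implements Delayed {
        final String name;
        final long dueNanos; // absolute expiry time on the System.nanoTime() clock

        Item(String name, long delayMs) {
            this.name = name;
            this.dueNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    // Returns {result of poll(), result of take()} as strings.
    static String[] demo() {
        DelayQueue<Item> queue = new DelayQueue<>();
        queue.put(new Item("order-1", 200));
        Item early = queue.poll(); // nothing has expired yet, so this is null
        Item late;
        try {
            late = queue.take();   // blocks for about 200 ms, then returns the item
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new String[] { early == null ? "null" : early.name, late.name };
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("poll() returned " + result[0]); // prints: poll() returned null
        System.out.println("take() returned " + result[1]); // prints: take() returned order-1
    }
}
```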
Implementation
Define a class OrderDelay that implements Delayed:
```java
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class OrderDelay implements Delayed {

    private final String orderId;
    // Absolute expiry time on the System.nanoTime() clock
    private final long timeout;

    OrderDelay(String orderId, long timeout) {
        this.orderId = orderId;
        this.timeout = timeout + System.nanoTime();
    }

    @Override
    public int compareTo(Delayed other) {
        if (other == this) {
            return 0;
        }
        // Compare through the Delayed interface so no cast is needed
        long d = getDelay(TimeUnit.NANOSECONDS) - other.getDelay(TimeUnit.NANOSECONDS);
        return (d == 0) ? 0 : ((d < 0) ? -1 : 1);
    }

    // Returns how much time is left before the timeout expires
    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(timeout - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    void print() {
        System.out.println(orderId + " The order number is to be deleted....");
    }
}
```
In the test demo we set the delay to 3 seconds:
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.TimeUnit;

public class DelayQueueDemo {

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>();
        list.add("00000001");
        list.add("00000002");
        list.add("00000003");
        list.add("00000004");
        list.add("00000005");
        DelayQueue<OrderDelay> queue = new DelayQueue<OrderDelay>();
        long start = System.currentTimeMillis();
        for (int i = 0; i < 5; i++) {
            // Each order can be taken out only after a 3-second delay
            queue.put(new OrderDelay(list.get(i),
                    TimeUnit.NANOSECONDS.convert(3, TimeUnit.SECONDS)));
            try {
                queue.take().print();
                System.out.println("After " + (System.currentTimeMillis() - start) + " MilliSeconds");
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop instead of swallowing the exception
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```
The output is as follows:
00000001 The order number is to be deleted....
After 3003 MilliSeconds
00000002 The order number is to be deleted....
After 6006 MilliSeconds
00000003 The order number is to be deleted....
After 9006 MilliSeconds
00000004 The order number is to be deleted....
After 12008 MilliSeconds
00000005 The order number is to be deleted....
After 15009 MilliSeconds
You can see that each order is deleted after a 3-second delay.
Advantages: high efficiency and low trigger latency.
Disadvantages:
- All data is lost when the server restarts, so the scheme is vulnerable to downtime
- Scaling out to a cluster is troublesome
- Memory is limited: with too many unpaid orders, for example, an OutOfMemoryError (OOM) is likely
- High code complexity
3) Time wheel algorithm
Approach
Let's start with a diagram of a time wheel:

The time wheel algorithm works much like a clock. As shown in the figure above, the pointer rotates in one direction at a fixed frequency, and each step is called a tick.
A time wheel is defined by three important parameters:
- ticksPerWheel (number of ticks in one revolution)
- tickDuration (duration of one tick)
- timeUnit (time unit)
For example, with ticksPerWheel = 60, tickDuration = 1 and timeUnit = seconds, the wheel behaves exactly like the second hand of a real clock.
Suppose the wheel has 8 slots and the pointer is currently at 1. A task that must run in 4 seconds has its thread callback or message placed in slot 5. What if it must run in 20 seconds? Since the ring has only 8 slots, the pointer has to make 2 more full turns first, and the task then lands in slot 5 (20 % 8 + 1 = 5).
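The slot arithmetic above can be written out as a small sketch. The class and method names are made up for this illustration, and a real implementation such as Netty's HashedWheelTimer tracks more state than this.

```java
// Hypothetical helper that reproduces the slot arithmetic of the 8-slot example.
class TimeWheelMath {

    // Slot in which a task lands when the pointer is at `current`
    // and the task is due `delayTicks` ticks from now.
    static int slotFor(int current, int delayTicks, int ticksPerWheel) {
        return (current + delayTicks) % ticksPerWheel;
    }

    // Number of full revolutions the pointer completes before the task is due.
    static int fullRevolutions(int delayTicks, int ticksPerWheel) {
        return delayTicks / ticksPerWheel;
    }

    public static void main(String[] args) {
        System.out.println(slotFor(1, 4, 8));        // prints 5
        System.out.println(slotFor(1, 20, 8));       // prints 5
        System.out.println(fullRevolutions(20, 8));  // prints 2
    }
}
```

Storing the remaining number of revolutions next to each task lets tasks with different delays share one slot: the task fires only when the pointer reaches its slot and no revolutions remain.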
Implementation
We use Netty's HashedWheelTimer.
Add the following dependency to pom.xml:
```xml
<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-all</artifactId>
    <version>4.1.24.Final</version>
</dependency>
```
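Test code HashedWheelTimerTest:
```java
import io.netty.util.HashedWheelTimer;
import io.netty.util.Timeout;
import io.netty.util.Timer;
import io.netty.util.TimerTask;
import java.util.concurrent.TimeUnit;

public class HashedWheelTimerTest {

    static class MyTimerTask implements TimerTask {
        // volatile: written by the timer thread, read by the main thread
        volatile boolean flag;

        public MyTimerTask(boolean flag) {
            this.flag = flag;
        }

        @Override
        public void run(Timeout timeout) throws Exception {
            System.out.println("I'm going to delete the order in the database....");
            this.flag = false;
        }
    }

    public static void main(String[] argv) {
        MyTimerTask timerTask = new MyTimerTask(true);
        Timer timer = new HashedWheelTimer();
        // Run the task once, 5 seconds from now
        timer.newTimeout(timerTask, 5, TimeUnit.SECONDS);
        int i = 1;
        while (timerTask.flag) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println(i + " Seconds have passed");
            i++;
        }
        // Stop the timer's worker thread so the JVM can exit
        timer.stop();
    }
}
```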
The output is as follows:
1 Seconds have passed
2 Seconds have passed
3 Seconds have passed
4 Seconds have passed
5 Seconds have passed
I'm going to delete the order in the database....
6 Seconds have passed
Advantages: high efficiency, lower trigger latency than DelayQueue, and lower code complexity than DelayQueue.
Disadvantages:
- All data is lost when the server restarts, so the scheme is vulnerable to downtime
- Scaling out to a cluster is troublesome
- Memory is limited: with too many unpaid orders, for example, an OutOfMemoryError (OOM) is likely
4) Redis cache
Approach 1
Use a Redis zset. A zset is a sorted set in which each element (member) is associated with a score, and members are ordered by score.
- Add elements: ZADD key score member [score member ...]
- Query elements in order: ZRANGE key start stop [WITHSCORES]
- Query an element's score: ZSCORE key member
- Remove elements: ZREM key member [member ...]
A quick test:
```
# Add a single element
redis> ZADD page_rank 10 google.com
(integer) 1
# Add multiple elements
redis> ZADD page_rank 9 baidu.com 8 bing.com
(integer) 2
redis> ZRANGE page_rank 0 -1 WITHSCORES
1) "bing.com"
2) "8"
3) "baidu.com"
4) "9"
5) "google.com"
6) "10"
# Query an element's score
redis> ZSCORE page_rank bing.com
"8"
# Remove a single element
redis> ZREM page_rank google.com
(integer) 1
redis> ZRANGE page_rank 0 -1 WITHSCORES
1) "bing.com"
2) "8"
3) "baidu.com"
4) "9"
```
So how do we use this? We store the order's timeout timestamp as the score and the order number as the member, and the system inspects the first element to check whether it has timed out, as shown in the following figure:

Implementation 1
```java
// Imports assume Jedis 2.x/3.x, where zrangeWithScores returns Set<Tuple>
import java.util.Calendar;
import java.util.Set;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.Tuple;

public class AppTest {

    private static final String ADDR = "127.0.0.1";
    private static final int PORT = 6379;
    private static JedisPool jedisPool = new JedisPool(ADDR, PORT);

    public static Jedis getJedis() {
        return jedisPool.getResource();
    }

    // Producer: generate 5 orders and put them in the zset
    public void productionDelayMessage() {
        // try-with-resources returns the connection to the pool
        try (Jedis jedis = AppTest.getJedis()) {
            for (int i = 0; i < 5; i++) {
                // Delay 3 seconds
                Calendar cal1 = Calendar.getInstance();
                cal1.add(Calendar.SECOND, 3);
                int second3later = (int) (cal1.getTimeInMillis() / 1000);
                jedis.zadd("OrderId", second3later, "OID0000001" + i);
                System.out.println(System.currentTimeMillis() + "ms:redis An order task was generated: ID by" + "OID0000001" + i);
            }
        }
    }

    // Consumer: take orders whose timeout has passed
    public void consumerDelayMessage() {
        Jedis jedis = AppTest.getJedis();
        while (true) {
            // Fetch only the element with the lowest score
            Set<Tuple> items = jedis.zrangeWithScores("OrderId", 0, 0);
            if (items == null || items.isEmpty()) {
                System.out.println("There are currently no tasks waiting");
                try {
                    Thread.sleep(500);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                continue;
            }
            int score = (int) ((Tuple) items.toArray()[0]).getScore();
            Calendar cal = Calendar.getInstance();
            int nowSecond = (int) (cal.getTimeInMillis() / 1000);
            if (nowSecond >= score) {
                String orderId = ((Tuple) items.toArray()[0]).getElement();
                jedis.zrem("OrderId", orderId);
                System.out.println(System.currentTimeMillis() + "ms:redis Consume a task: consume orders OrderId by" + orderId);
            }
        }
    }

    public static void main(String[] args) {
        AppTest appTest = new AppTest();
        appTest.productionDelayMessage();
        appTest.consumerDelayMessage();
    }
}
```
Corresponding output:

You can see that nearly all orders are consumed after 3 seconds.
However, this version has a fatal flaw: under high concurrency, multiple consumers will get the same order number. Our test code, ThreadTest:
```java
import java.util.concurrent.CountDownLatch;

public class ThreadTest {

    private static final int threadNum = 10;
    private static CountDownLatch cdl = new CountDownLatch(threadNum);

    static class DelayMessage implements Runnable {
        public void run() {
            try {
                // Wait until all threads have been started
                cdl.await();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            AppTest appTest = new AppTest();
            appTest.consumerDelayMessage();
        }
    }

    public static void main(String[] args) {
        AppTest appTest = new AppTest();
        appTest.productionDelayMessage();
        for (int i = 0; i < threadNum; i++) {
            new Thread(new DelayMessage()).start();
            cdl.countDown();
        }
    }
}
```
The output is as follows:

Obviously, multiple threads consume the same resource.
**Solution**
- Use a distributed lock. However, a distributed lock reduces performance, so this option is not described in detail.
- Check the return value of ZREM, and treat the order as consumed only when it is greater than 0. So in the consumerDelayMessage() method, this block:
```java
if (nowSecond >= score) {
    String orderId = ((Tuple) items.toArray()[0]).getElement();
    jedis.zrem("OrderId", orderId);
    System.out.println(System.currentTimeMillis() + "ms:redis Consume a task: consume orders OrderId by" + orderId);
}
```
should be changed to:
```java
if (nowSecond >= score) {
    String orderId = ((Tuple) items.toArray()[0]).getElement();
    // ZREM returns the number of members removed; only the consumer that removed it proceeds
    Long num = jedis.zrem("OrderId", orderId);
    if (num != null && num > 0) {
        System.out.println(System.currentTimeMillis() + "ms:redis Consume a task: consume orders OrderId by" + orderId);
    }
}
```
After this change, rerun the ThreadTest class; the output is now normal.
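Why the return-value check works can be shown without Redis. The sketch below is an in-memory analogy only: a concurrent set stands in for the zset, and its atomic remove() stands in for ZREM. The class name and order ids are assumptions. Every thread tries to remove every order, but each removal succeeds for exactly one thread.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory analogy of "consume only if the removal succeeded".
class ClaimByRemoval {

    // Returns how many times an order was processed in total.
    static int run(int orderCount, int threadCount) {
        Set<String> pending = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < orderCount; i++) {
            pending.add("OID" + i);
        }
        String[] snapshot = pending.toArray(new String[0]);
        AtomicInteger processed = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threadCount);

        for (int t = 0; t < threadCount; t++) {
            new Thread(() -> {
                try {
                    start.await(); // release all threads at the same moment
                    for (String orderId : snapshot) {
                        // Like ZREM returning 1: only the thread that removed it processes it
                        if (pending.remove(orderId)) {
                            processed.incrementAndGet();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }

        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println(run(5, 10)); // prints 5: each order is processed exactly once
    }
}
```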
Approach 2
This scheme uses Redis Keyspace Notifications, which provide a callback when a key expires; in effect, Redis sends a message to the client. Redis 2.8 or later is required.
Implementation 2
In redis.conf, add this configuration:
notify-keyspace-events Ex
The code is as follows:
```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPubSub;

public class RedisTest {

    private static final String ADDR = "127.0.0.1";
    private static final int PORT = 6379;
    private static JedisPool jedis = new JedisPool(ADDR, PORT);
    private static RedisSub sub = new RedisSub();

    public static void init() {
        new Thread(new Runnable() {
            public void run() {
                // subscribe() blocks, so it runs on its own thread
                jedis.getResource().subscribe(sub, "__keyevent@0__:expired");
            }
        }).start();
    }

    public static void main(String[] args) throws InterruptedException {
        init();
        // Borrow one connection and return it to the pool when done
        try (Jedis conn = jedis.getResource()) {
            for (int i = 0; i < 10; i++) {
                String orderId = "OID000000" + i;
                // The key expires after 3 seconds, which triggers the notification
                conn.setex(orderId, 3, orderId);
                System.out.println(System.currentTimeMillis() + "ms:" + orderId + "Order generation");
            }
        }
    }

    static class RedisSub extends JedisPubSub {
        @Override
        public void onMessage(String channel, String message) {
            System.out.println(System.currentTimeMillis() + "ms:" + message + "Order cancellation");
        }
    }
}
```
The output is as follows:

Clearly, each order is cancelled after 3 seconds.
However, Redis's pub/sub mechanism has a serious flaw. The official documentation says:
Because Redis Pub/Sub is fire and forget currently there is no way to use this feature if your application demands reliable notification of events, that is, if your Pub/Sub client disconnects, and reconnects later, all the events delivered during the time the client was disconnected are lost.
In other words, notifications are not reliable: any event published while the subscribing client is disconnected is lost.
Therefore, approach 2 is not recommended. Of course, if you do not need high reliability, you can use it.
Advantages:
- Because Redis is the message channel, all messages are stored in Redis. If the sender or the task handler crashes, the data can still be reprocessed after a restart.
- Scaling out to a cluster is convenient
- High time accuracy
Disadvantages: Redis requires additional maintenance
5) Message queue
RabbitMQ's delay queue can be used. RabbitMQ has two features that together implement a delay queue:
- RabbitMQ lets you set a TTL on a queue (the x-message-ttl argument) or on an individual message (the expiration property) to control message lifetime; when a message times out, it becomes a dead letter
- A RabbitMQ queue can be configured with two parameters, x-dead-letter-exchange and x-dead-letter-routing-key (optional); dead letters in the queue are rerouted according to these parameters
Combining these two features simulates delayed messages. I'll cover the details in a separate article, as it is too long to discuss here.
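As a rough preview, the queue arguments for such a setup might look like the sketch below. It only builds the argument map with plain JDK classes; the exchange name, routing key and queue name are hypothetical, and the commented-out line shows where the map would be passed if the RabbitMQ Java client were on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the arguments for a "waiting" queue whose
// messages expire after ttlMillis and are then rerouted as dead letters.
class DelayQueueArguments {

    static Map<String, Object> delayQueueArgs(int ttlMillis,
                                              String deadLetterExchange,
                                              String deadLetterRoutingKey) {
        Map<String, Object> args = new HashMap<>();
        args.put("x-message-ttl", ttlMillis);                        // message lifetime in milliseconds
        args.put("x-dead-letter-exchange", deadLetterExchange);      // where expired messages go
        args.put("x-dead-letter-routing-key", deadLetterRoutingKey); // optional routing key
        return args;
    }

    public static void main(String[] unused) {
        // 30 minutes, matching the order timeout example; the names are made up
        Map<String, Object> args = delayQueueArgs(30 * 60 * 1000, "order.dlx", "order.timeout");
        System.out.println(args);
        // With the RabbitMQ Java client, the map would be passed when declaring the queue:
        // channel.queueDeclare("order.delay", true, false, false, args);
    }
}
```

No consumer listens on the waiting queue itself; the real consumer subscribes to the queue bound to the dead-letter exchange, so it sees each message only after the TTL has elapsed.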
Advantages: high efficiency; easy horizontal scaling thanks to RabbitMQ's distributed design; and message persistence, which increases reliability.
Disadvantages: usability depends on how well RabbitMQ is operated and maintained, and introducing RabbitMQ raises complexity and cost.
Reference link: https://blog.csdn.net/hjm4702192/article/details/80519010