A reusable distributed transaction message architecture based on RabbitMQ

Premise

Distributed transactions are one of the harder problems in microservice practice. In the microservice schemes the author has implemented, strong consistency is compromised on or avoided altogether. Following the local message table scheme proposed by eBay many years ago, a lightweight wrapper based on RabbitMQ and MySQL (JDBC) was built to provide a low-intrusion transaction message module. This article analyzes the design and implementation of the whole scheme in detail. The environment depends on the following:

  • JDK1.8+

  • spring-boot-starter-web:2.x.x

  • spring-boot-starter-jdbc:2.x.x

  • spring-boot-starter-amqp:2.x.x

  • HikariCP:3.x.x (bundled with spring-boot-starter-jdbc)

  • mysql-connector-java:5.1.48

  • redisson:3.12.1

Scheme design ideas

In principle, transaction messages are only suitable for weak consistency (or eventual consistency) scenarios. Common weak consistency scenarios are as follows:

  • The user service completes the registration action and pushes a marketing related message to the SMS service.

  • In the credit system, after the order service saves the order, it pushes an order record information to be approved to the approval service.

  • ......

In strong consistency scenarios, transaction messages should not be used.


In general, strong consistency requires strict synchronization, that is, all operations must succeed or fail together, which brings extra synchronization overhead.

If a transaction message module is designed reasonably, with compensation, query, monitoring and other functions in place, the overall throughput is higher than with strict synchronization because the system interaction is asynchronous. In the business systems the author is responsible for, a basic rule for using transaction messages has also been set: provided that the message content is correct, the consumer must handle its own exceptions.

Simply put: once the upstream has ensured the correctness of its own business and has successfully pushed the correct message to RabbitMQ, the upstream's part is done.

To reduce the intrusiveness of the code, transaction messages rely on Spring's programmatic or declarative transactions. Programmatic transactions generally depend on TransactionTemplate, while declarative transactions rely on the AOP module and the @Transactional annotation.

Then you need to customize a transaction message function module and add a transaction message record table (in fact, the local message table) to save every message record that needs to be sent. The main functions of the transaction message function module are:

  • Save the message record.

  • Push message to RabbitMQ server.

  • Query of message records, compensation push, etc.

Logical unit of transaction execution

In the logic unit of transaction execution, the transaction message record to be pushed needs to be saved, that is, the local (business) logic and the transaction message record saving operation are bound to the same transaction.


Sending the message to the RabbitMQ server must be deferred until after the transaction commits, so that the transaction commit and the message push to the RabbitMQ server stay consistent.

So that, from the user's point of view, saving the transaction message to be sent and sending the message to RabbitMQ appear as a single action, Spring's transaction synchronizer TransactionSynchronization is used. Here we look at where the main methods of the transaction synchronizer are called back, mainly by referring to the AbstractPlatformTransactionManager#commit() or AbstractPlatformTransactionManager#processCommit() method:


The figure above only shows the scenario where the transaction commits correctly (exception scenarios are excluded). It makes clear that the afterCommit() and afterCompletion(int status) methods of the transaction synchronizer TransactionSynchronization are both called back after the real transaction commit point, AbstractPlatformTransactionManager#doCommit(), so either of these two methods can be used to push the message to the RabbitMQ server. The overall pseudo code is as follows:

@Transactional
public Dto businessMethod(){
   business transaction code block ...
   // Save transaction message
   [saveTransactionMessageRecord()]
   // Register transaction synchronizer - push message to RabbitMQ in afterCommit() method
   [register TransactionSynchronization,send message in method afterCommit()]
   business transaction code block ...
}

In the above pseudo code, the two steps of saving transaction message and registering transaction synchronizer can be inserted anywhere in the transaction method, that is, independent of the execution order.
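
To make the ordering concrete, here is a minimal, framework-free sketch of the idea. MiniTransaction is a hypothetical stand-in invented for this illustration, not Spring's API: callbacks registered during the transaction run only once commit() has happened and are dropped on rollback, regardless of where in the method they were registered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a transaction with synchronization callbacks.
// It only imitates the ordering guarantee; it is not Spring's implementation.
final class MiniTransaction {

    private final List<Runnable> afterCommitCallbacks = new ArrayList<>();
    private final List<String> log = new ArrayList<>();

    // Plays the role of registering a TransactionSynchronization with afterCommit().
    void registerAfterCommit(Runnable callback) {
        afterCommitCallbacks.add(callback);
    }

    // Records a step executed inside (or after) the transaction.
    void record(String event) {
        log.add(event);
    }

    // Commit first, then fire the callbacks in registration order.
    void commit() {
        log.add("commit");
        afterCommitCallbacks.forEach(Runnable::run);
        afterCommitCallbacks.clear();
    }

    // On rollback the callbacks are discarded, so nothing is pushed.
    void rollback() {
        log.add("rollback");
        afterCommitCallbacks.clear();
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        MiniTransaction tx = new MiniTransaction();
        tx.record("save message record");
        tx.registerAfterCommit(() -> tx.record("push to RabbitMQ"));
        tx.record("save order");
        tx.commit();
        // prints [save message record, save order, commit, push to RabbitMQ]
        System.out.println(tx.log());
    }
}
```

Even though the callback is registered before the order is saved, the push is the last event in the log, which is the property the transaction message module relies on.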

Compensation for transaction messages

Although the author recommended earlier that the downstream service handle its own consumption failures, sometimes the upstream is forced to push the corresponding message again; this is a special scenario.

There is another scenario to consider: the TransactionSynchronization#afterCommit() callback, which is triggered after the transaction commits, fails. This is a low probability scenario, but it will certainly appear in production. A typical cause is that the service instance is restarted after the transaction has committed but before TransactionSynchronization#afterCommit() has pushed the message.

As shown in the figure below:


To solve the compensation push problem, a finite set of states is used to tell whether a message has been pushed successfully:

  • In the transaction method, when the transaction message is saved, the push status of the message record is marked as pending.

  • In the afterCommit() implementation of the TransactionSynchronization interface, the corresponding message is pushed to RabbitMQ, and the status of the transaction message record is then changed to success.
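
The two steps above can be sketched as a small status model. The names PushStatus and PushStateDemo are hypothetical; the numeric codes mirror the status enum shown later in this article. The rule that a record stops being compensated once it reaches its retry limit is an assumption made for the sketch.

```java
// Hypothetical status model; the codes follow the enum used later in the article.
enum PushStatus {
    PENDING(0), SUCCESS(1), FAIL(-1);

    private final int code;

    PushStatus(int code) {
        this.code = code;
    }

    int code() {
        return code;
    }

    // A record still needs a push unless it has already succeeded.
    boolean needsPush() {
        return this != SUCCESS;
    }
}

final class PushStateDemo {

    // Assumption: the compensation job skips records that reached the retry limit.
    static boolean shouldCompensate(PushStatus status, int currentRetryTimes, int maxRetryTimes) {
        return status.needsPush() && currentRetryTimes < maxRetryTimes;
    }

    public static void main(String[] args) {
        // Saved inside the transaction, never pushed (for example, the instance restarted).
        System.out.println(shouldCompensate(PushStatus.PENDING, 0, 5)); // prints true
        // Pushed successfully in afterCommit().
        System.out.println(shouldCompensate(PushStatus.SUCCESS, 1, 5)); // prints false
        // Failed and out of retries.
        System.out.println(shouldCompensate(PushStatus.FAIL, 5, 5));    // prints false
    }
}
```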

There is also a very special case where the RabbitMQ server itself fails and message pushes fail as a result. In this case, retries (compensation pushes) are needed. Experience shows that retrying repeatedly within a short time is pointless, because a failed service does not recover instantly. An exponential backoff algorithm can therefore be used for retries, and the maximum number of retries must be capped.


The backoff factor, the initial interval and the maximum number of retries need to be set according to the actual situation; otherwise, messages may be delayed too long or retried too frequently.
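
As a rough illustration of how these three parameters interact, the sketch below lists the delay before each retry as initial interval × factor^round. The class and method names are hypothetical; the values 10 seconds, factor 2 and 5 retries are the defaults used later in this article.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that makes the retry schedule easy to inspect.
final class BackoffSchedule {

    // Delay before retry number `round` (0-based): initBackoff * factor^round seconds.
    static long delaySeconds(long initBackoffSeconds, int factor, int round) {
        return (long) (initBackoffSeconds * Math.pow(factor, round));
    }

    // All delays up to the retry limit.
    static List<Long> delays(long initBackoffSeconds, int factor, int maxRetries) {
        List<Long> result = new ArrayList<>();
        for (int round = 0; round < maxRetries; round++) {
            result.add(delaySeconds(initBackoffSeconds, factor, round));
        }
        return result;
    }

    public static void main(String[] args) {
        // 10 s initial interval, factor 2, at most 5 retries
        System.out.println(delays(10, 2, 5)); // prints [10, 20, 40, 80, 160]
    }
}
```

With these values the last retry happens roughly five minutes after the first attempt; a larger factor or retry limit stretches that window quickly, which is why the parameters need to be tuned per scenario.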

Scheme implementation

Introduce core dependencies:

<properties>
   <spring.boot.version>2.2.4.RELEASE</spring.boot.version>
   <redisson.version>3.12.1</redisson.version>
   <mysql.connector.version>5.1.48</mysql.connector.version>
</properties>
<dependencyManagement>
   <dependencies>
       <dependency>
           <groupId>org.springframework.boot</groupId>
           <artifactId>spring-boot-dependencies</artifactId>
           <version>${spring.boot.version}</version>
           <type>pom</type>
           <scope>import</scope>
       </dependency>
   </dependencies>
</dependencyManagement>
<dependencies>
   <dependency>
       <groupId>mysql</groupId>
       <artifactId>mysql-connector-java</artifactId>
       <version>${mysql.connector.version}</version>
   </dependency>
   <dependency>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-starter-web</artifactId>
   </dependency>
   <dependency>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-starter-jdbc</artifactId>
   </dependency>
   <dependency>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-starter-aop</artifactId>
   </dependency>
   <dependency>
       <groupId>org.springframework.boot</groupId>
       <artifactId>spring-boot-starter-amqp</artifactId>
   </dependency>
   <dependency>
       <groupId>org.redisson</groupId>
       <artifactId>redisson</artifactId>
       <version>${redisson.version}</version>
   </dependency>
</dependencies>

spring-boot-starter-jdbc, mysql-connector-java and spring-boot-starter-aop are related to MySQL transactions, while spring-boot-starter-amqp wraps the RabbitMQ client. redisson is used mainly for its distributed lock, which guards the execution of the scheduled compensation task (to prevent multiple nodes from executing compensation pushes concurrently).

Table design

The transaction message module mainly involves two tables. Taking MySQL as an example, the table DDL is as follows:

CREATE TABLE `t_transactional_message`
(
   id                  BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
   create_time         DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP,
   edit_time           DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
   creator             VARCHAR(20)     NOT NULL DEFAULT 'admin',
   editor              VARCHAR(20)     NOT NULL DEFAULT 'admin',
   deleted             TINYINT         NOT NULL DEFAULT 0,
   current_retry_times TINYINT         NOT NULL DEFAULT 0 COMMENT 'Current number of retries',
   max_retry_times     TINYINT         NOT NULL DEFAULT 5 COMMENT 'Maximum number of retries',
   queue_name          VARCHAR(255)    NOT NULL COMMENT 'Queue name',
   exchange_name       VARCHAR(255)    NOT NULL COMMENT 'Exchange name',
   exchange_type       VARCHAR(8)      NOT NULL COMMENT 'Exchange type',
   routing_key         VARCHAR(255) COMMENT 'Routing key',
   business_module     VARCHAR(32)     NOT NULL COMMENT 'Business module',
   business_key        VARCHAR(255)    NOT NULL COMMENT 'Business key',
   next_schedule_time  DATETIME        NOT NULL COMMENT 'Next scheduling time',
   message_status      TINYINT         NOT NULL DEFAULT 0 COMMENT 'Message status',
   init_backoff        BIGINT UNSIGNED NOT NULL DEFAULT 10 COMMENT 'Initial backoff value, in seconds',
   backoff_factor      TINYINT         NOT NULL DEFAULT 2 COMMENT 'Backoff factor (base of the exponential)',
   INDEX idx_queue_name (queue_name),
   INDEX idx_create_time (create_time),
   INDEX idx_next_schedule_time (next_schedule_time),
   INDEX idx_business_key (business_key)
) COMMENT 'Transaction message table';

CREATE TABLE `t_transactional_message_content`
(
   id         BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
   message_id BIGINT UNSIGNED NOT NULL COMMENT 'Transaction message logging ID',
   content    TEXT COMMENT 'Message content'
) COMMENT 'Transaction message content table';

Because this module is likely to grow a back-office management module, the message management and status fields and the large message content are stored in two separate tables, which avoids excessive IO load on the MySQL service when message records are queried in bulk (this was judged the more reasonable scheme after discussion with the DBA team at the author's previous company). Two business fields, business_module and business_key, are reserved to identify the business module and the business key (usually a unique identifier, such as an order number).


In general, if the service declares the binding between the queue and the exchange in advance through configuration, sending a RabbitMQ message depends only on the exchangeName and routingKey fields (exchanges of the headers type are special and rarely used, so they are not considered for now). Because a service may forget to make this declaration, the module makes the binding declaration for the queue the first time a message is sent and caches the result (re-declaring a queue, exchange and binding in RabbitMQ does not throw an exception as long as the parameters of each declaration are consistent).
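
The declare-on-first-send idea can be sketched without a broker as follows. DeclareOnceCache and declareOnBroker are hypothetical names; the placeholder method only counts calls where a real implementation would declare the queue, exchange and binding. ConcurrentHashMap#computeIfAbsent guarantees the declaration runs at most once per queue name, even under concurrent sends.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical cache that remembers which queues have already been declared.
final class DeclareOnceCache {

    private final ConcurrentMap<String, Boolean> declared = new ConcurrentHashMap<>();
    private final AtomicInteger brokerCalls = new AtomicInteger();

    // Runs the broker declaration at most once per queue name.
    void ensureDeclared(String queueName, String exchangeName, String routingKey) {
        declared.computeIfAbsent(queueName, name -> {
            declareOnBroker(name, exchangeName, routingKey);
            return Boolean.TRUE;
        });
    }

    // Placeholder: a real implementation would declare queue, exchange and binding here.
    private void declareOnBroker(String queueName, String exchangeName, String routingKey) {
        brokerCalls.incrementAndGet();
    }

    int brokerCalls() {
        return brokerCalls.get();
    }

    public static void main(String[] args) {
        DeclareOnceCache cache = new DeclareOnceCache();
        cache.ensureDeclared("tm.test.queue", "tm.test.exchange", "tm.test.key");
        cache.ensureDeclared("tm.test.queue", "tm.test.exchange", "tm.test.key");
        cache.ensureDeclared("tm.other.queue", "tm.test.exchange", "tm.other.key");
        System.out.println(cache.brokerCalls()); // prints 2
    }
}
```

One consequence of caching by queue name alone is that a later send with the same queue but a different exchange or routing key is not declared again, so destinations should be kept stable per queue.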

Scheme code design

In the following design description, the API design of the transaction message management back office is left out for now; it can be added later.

Define the model entity classes TransactionalMessage and TransactionalMessageContent:

@Data
public class TransactionalMessage {

   private Long id;
   private LocalDateTime createTime;
   private LocalDateTime editTime;
   private String creator;
   private String editor;
   private Integer deleted;
   private Integer currentRetryTimes;
   private Integer maxRetryTimes;
   private String queueName;
   private String exchangeName;
   private String exchangeType;
   private String routingKey;
   private String businessModule;
   private String businessKey;
   private LocalDateTime nextScheduleTime;
   private Integer messageStatus;
   private Long initBackoff;
   private Integer backoffFactor;
}

@Data
public class TransactionalMessageContent {

   private Long id;
   private Long messageId;
   private String content;
}

Then define the DAO interfaces (the implementation details are not shown here for now. MySQL is used for storage; to switch to another type of database, only a different implementation is needed):

public interface TransactionalMessageDao {

   void insertSelective(TransactionalMessage record);

   void updateStatusSelective(TransactionalMessage record);

   List<TransactionalMessage> queryPendingCompensationRecords(LocalDateTime minScheduleTime,
                                                              LocalDateTime maxScheduleTime,
                                                              int limit);
}

public interface TransactionalMessageContentDao {

   void insert(TransactionalMessageContent record);

   List<TransactionalMessageContent> queryByMessageIds(String messageIds);
}

Then define the transaction message service interface TransactionalMessageService:

//External service interface
public interface TransactionalMessageService {

   void sendTransactionalMessage(Destination destination, TxMessage message);
}


@Getter
@RequiredArgsConstructor
public enum ExchangeType {

   FANOUT("fanout"),

   DIRECT("direct"),

   TOPIC("topic"),

   DEFAULT(""),

   ;

   private final String type;
}

//Destination to send message to
public interface Destination {

   ExchangeType exchangeType();

   String queueName();

   String exchangeName();

   String routingKey();
}

@Builder
public class DefaultDestination implements Destination {

   private ExchangeType exchangeType;
   private String queueName;
   private String exchangeName;
   private String routingKey;

   @Override
   public ExchangeType exchangeType() {
       return exchangeType;
   }

   @Override
   public String queueName() {
       return queueName;
   }

   @Override
   public String exchangeName() {
       return exchangeName;
   }

   @Override
   public String routingKey() {
       return routingKey;
   }
}

//Transaction message
public interface TxMessage {

   String businessModule();

   String businessKey();

   String content();
}

@Builder
public class DefaultTxMessage implements TxMessage {

   private String businessModule;
   private String businessKey;
   private String content;

   @Override
   public String businessModule() {
       return businessModule;
   }

   @Override
   public String businessKey() {
       return businessKey;
   }

   @Override
   public String content() {
       return content;
   }
}

//Message status
// @Getter is required because getStatus() is called by the management service
@Getter
@RequiredArgsConstructor
public enum TxMessageStatus {

   /**
    * success
    */
   SUCCESS(1),

   /**
    * pending
    */
   PENDING(0),

   /**
    * processing failed
    */
   FAIL(-1),

   ;

   private final Integer status;
}


The implementation class of TransactionalMessageService is the core function implementation of transaction message. The code is as follows:

@Slf4j
@Service
@RequiredArgsConstructor
public class RabbitTransactionalMessageService implements TransactionalMessageService {

   private final AmqpAdmin amqpAdmin;
   private final TransactionalMessageManagementService managementService;

   private static final ConcurrentMap<String, Boolean> QUEUE_ALREADY_DECLARE = new ConcurrentHashMap<>();

   @Override
   public void sendTransactionalMessage(Destination destination, TxMessage message) {
       String queueName = destination.queueName();
       String exchangeName = destination.exchangeName();
       String routingKey = destination.routingKey();
       ExchangeType exchangeType = destination.exchangeType();
       // Atomic pre declaration
       QUEUE_ALREADY_DECLARE.computeIfAbsent(queueName, k -> {
           Queue queue = new Queue(queueName);
           amqpAdmin.declareQueue(queue);
           Exchange exchange = new CustomExchange(exchangeName, exchangeType.getType());
           amqpAdmin.declareExchange(exchange);
           Binding binding = BindingBuilder.bind(queue).to(exchange).with(routingKey).noargs();
           amqpAdmin.declareBinding(binding);
           return true;
       });
       TransactionalMessage record = new TransactionalMessage();
       record.setQueueName(queueName);
       record.setExchangeName(exchangeName);
       record.setExchangeType(exchangeType.getType());
       record.setRoutingKey(routingKey);
       record.setBusinessModule(message.businessModule());
       record.setBusinessKey(message.businessKey());
       String content = message.content();
       // Save transaction message record
       managementService.saveTransactionalMessageRecord(record, content);
       // Register transaction synchronizer
       TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronizationAdapter() {
           @Override
           public void afterCommit() {
               managementService.sendMessageSync(record, content);
           }
       });
   }
}

The management of message record state and content persistence is unified in TransactionalMessageManagementService:

@Slf4j
@RequiredArgsConstructor
@Service
public class TransactionalMessageManagementService {

   private final TransactionalMessageDao messageDao;
   private final TransactionalMessageContentDao contentDao;
   private final RabbitTemplate rabbitTemplate;

   private static final LocalDateTime END = LocalDateTime.of(2999, 1, 1, 0, 0, 0);
   private static final long DEFAULT_INIT_BACKOFF = 10L;
   private static final int DEFAULT_BACKOFF_FACTOR = 2;
   private static final int DEFAULT_MAX_RETRY_TIMES = 5;
   private static final int LIMIT = 100;

   public void saveTransactionalMessageRecord(TransactionalMessage record, String content) {
       record.setMessageStatus(TxMessageStatus.PENDING.getStatus());
       record.setNextScheduleTime(calculateNextScheduleTime(LocalDateTime.now(), DEFAULT_INIT_BACKOFF,
               DEFAULT_BACKOFF_FACTOR, 0));
       record.setCurrentRetryTimes(0);
       record.setInitBackoff(DEFAULT_INIT_BACKOFF);
       record.setBackoffFactor(DEFAULT_BACKOFF_FACTOR);
       record.setMaxRetryTimes(DEFAULT_MAX_RETRY_TIMES);
       messageDao.insertSelective(record);
       TransactionalMessageContent messageContent = new TransactionalMessageContent();
       messageContent.setContent(content);
       messageContent.setMessageId(record.getId());
       contentDao.insert(messageContent);
   }

   public void sendMessageSync(TransactionalMessage record, String content) {
       try {
           rabbitTemplate.convertAndSend(record.getExchangeName(), record.getRoutingKey(), content);
           if (log.isDebugEnabled()) {
               log.debug("Message sent successfully,Destination queue:{},Message content:{}", record.getQueueName(), content);
           }
           // Mark successful
           markSuccess(record);
       } catch (Exception e) {
           // Failed to mark
           markFail(record, e);
       }
   }

   private void markSuccess(TransactionalMessage record) {
       // Mark next execution time as maximum
       record.setNextScheduleTime(END);
       record.setCurrentRetryTimes(record.getCurrentRetryTimes().compareTo(record.getMaxRetryTimes()) >= 0 ?
               record.getMaxRetryTimes() : record.getCurrentRetryTimes() + 1);
       record.setMessageStatus(TxMessageStatus.SUCCESS.getStatus());
       record.setEditTime(LocalDateTime.now());
       messageDao.updateStatusSelective(record);
   }

   private void markFail(TransactionalMessage record, Exception e) {
       log.error("Failed to send message,Destination queue:{}", record.getQueueName(), e);
       record.setCurrentRetryTimes(record.getCurrentRetryTimes().compareTo(record.getMaxRetryTimes()) >= 0 ?
               record.getMaxRetryTimes() : record.getCurrentRetryTimes() + 1);
       // Calculate next execution time
       LocalDateTime nextScheduleTime = calculateNextScheduleTime(
               record.getNextScheduleTime(),
               record.getInitBackoff(),
               record.getBackoffFactor(),
               record.getCurrentRetryTimes()
       );
       record.setNextScheduleTime(nextScheduleTime);
       record.setMessageStatus(TxMessageStatus.FAIL.getStatus());
       record.setEditTime(LocalDateTime.now());
       messageDao.updateStatusSelective(record);
   }

   /**
    * Calculate next execution time
    *
    * @param base          Base time
    * @param initBackoff   Initial backoff value, in seconds
    * @param backoffFactor Backoff factor
    * @param round         Number of rounds
    * @return LocalDateTime
    */
   private LocalDateTime calculateNextScheduleTime(LocalDateTime base,
                                                   long initBackoff,
                                                   long backoffFactor,
                                                   long round) {
       double delta = initBackoff * Math.pow(backoffFactor, round);
       return base.plusSeconds((long) delta);
   }

   /**
    * Push compensation - the parameters should be customized according to the actual scene
    */
   public void processPendingCompensationRecords() {
       // Upper bound: the current time minus the initial backoff, so that messages that were just saved are not pushed again
       LocalDateTime max = LocalDateTime.now().plusSeconds(-DEFAULT_INIT_BACKOFF);
       // Lower bound: the upper bound minus 1 hour
       LocalDateTime min = max.plusHours(-1);
       Map<Long, TransactionalMessage> collect = messageDao.queryPendingCompensationRecords(min, max, LIMIT)
               .stream()
               .collect(Collectors.toMap(TransactionalMessage::getId, x -> x));
       if (!collect.isEmpty()) {
           StringJoiner joiner = new StringJoiner(",", "(", ")");
           collect.keySet().forEach(x -> joiner.add(x.toString()));
           contentDao.queryByMessageIds(joiner.toString())
                   .forEach(item -> {
                       TransactionalMessage message = collect.get(item.getMessageId());
                       sendMessageSync(message, item.getContent());
                   });
       }
   }
}
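
The time window used by processPendingCompensationRecords() can be pictured with a small standalone sketch. CompensationWindow is a hypothetical helper; that the DAO compares next_schedule_time against the window and treats both bounds as inclusive is an assumption, since the DAO implementation is not shown.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical helper describing which records a compensation round looks at.
final class CompensationWindow {

    final LocalDateTime min;
    final LocalDateTime max;

    private CompensationWindow(LocalDateTime min, LocalDateTime max) {
        this.min = min;
        this.max = max;
    }

    // The upper bound skips records newer than the initial backoff;
    // the lower bound limits how far back a round scans.
    static CompensationWindow of(LocalDateTime now, Duration initBackoff, Duration lookBack) {
        LocalDateTime max = now.minus(initBackoff);
        return new CompensationWindow(max.minus(lookBack), max);
    }

    // Assumption: both bounds are inclusive.
    boolean contains(LocalDateTime nextScheduleTime) {
        return !nextScheduleTime.isBefore(min) && !nextScheduleTime.isAfter(max);
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2020, 5, 27, 12, 0, 0);
        CompensationWindow window = of(now, Duration.ofSeconds(10), Duration.ofHours(1));
        System.out.println(window.contains(now.minusSeconds(5)));  // prints false (too recent)
        System.out.println(window.contains(now.minusMinutes(30))); // prints true
        System.out.println(window.contains(now.minusHours(2)));    // prints false (too old)
    }
}
```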

One thing can be optimized: the status updates of transaction message records can be turned into a batch update, which is more efficient when the limit is large. Finally, the configuration class of the scheduled task:

@Slf4j
@RequiredArgsConstructor
@Configuration
@EnableScheduling
public class ScheduleJobAutoConfiguration {

   private final TransactionalMessageManagementService managementService;

   /**
    * The local Redis is used here. In fact, it needs to be configured
    */
   private final RedissonClient redisson = Redisson.create();

   @Scheduled(fixedDelay = 10000)
   public void transactionalMessageCompensationTask() throws Exception {
       RLock lock = redisson.getLock("transactionalMessageCompensationTask");
       // Wait up to 5 seconds for the lock and hold it for at most 300 seconds (lease time). Both values need to be tuned for the actual scenario
       boolean tryLock = lock.tryLock(5, 300, TimeUnit.SECONDS);
       if (tryLock) {
           try {
               long start = System.currentTimeMillis();
               log.info("Start transaction message push compensation timing task...");
               managementService.processPendingCompensationRecords();
               long end = System.currentTimeMillis();
               long delta = end - start;
               // In case the lock is released too early
               if (delta < 5000) {
                   Thread.sleep(5000 - delta);
               }
               log.info("Finish executing the transaction message push compensation timing task,time consuming:{} ms...", end - start);
           } finally {
               lock.unlock();
           }
       }
   }
}
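
The guard pattern in the scheduled task (try to take the lock, skip the round if it is held elsewhere, always release in finally) can be tried out in a single JVM with the sketch below. GuardedJob is a hypothetical name, and java.util.concurrent.locks.ReentrantLock is only a local stand-in for a distributed lock such as Redisson's RLock; it has no lease time and does not coordinate across nodes.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Single-JVM stand-in for the distributed-lock guard around the compensation task.
final class GuardedJob {

    private final ReentrantLock lock = new ReentrantLock();

    // Runs the job only if the lock is acquired within waitMillis; returns whether it ran.
    boolean runIfLockAcquired(Runnable job, long waitMillis) {
        boolean acquired;
        try {
            acquired = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (!acquired) {
            // Another holder is already running the job; skip this round.
            return false;
        }
        try {
            job.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        GuardedJob guard = new GuardedJob();
        boolean ran = guard.runIfLockAcquired(
                () -> System.out.println("compensation round executed"), 100);
        System.out.println("ran = " + ran); // prints ran = true
    }
}
```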

After the basic code is written, the structure of the whole project is as follows:


Finally, add two test classes:

@RequiredArgsConstructor
@Component
public class MockBusinessRunner implements CommandLineRunner {

   private final MockBusinessService mockBusinessService;

   @Override
   public void run(String... args) throws Exception {
       mockBusinessService.saveOrder();
   }
}

@Slf4j
@RequiredArgsConstructor
@Service
public class MockBusinessService {

   private final JdbcTemplate jdbcTemplate;
   private final TransactionalMessageService transactionalMessageService;
   private final ObjectMapper objectMapper;

   @Transactional(rollbackFor = Exception.class)
   public void saveOrder() throws Exception {
       String orderId = UUID.randomUUID().toString();
       BigDecimal amount = BigDecimal.valueOf(100L);
       Map<String, Object> message = new HashMap<>();
       message.put("orderId", orderId);
       message.put("amount", amount);
       jdbcTemplate.update("INSERT INTO t_order(order_id,amount) VALUES (?,?)", p -> {
           p.setString(1, orderId);
           p.setBigDecimal(2, amount);
       });
       String content = objectMapper.writeValueAsString(message);
       transactionalMessageService.sendTransactionalMessage(
               DefaultDestination.builder()
                       .exchangeName("tm.test.exchange")
                       .queueName("tm.test.queue")
                       .routingKey("tm.test.key")
                       .exchangeType(ExchangeType.DIRECT)
                       .build(),
               DefaultTxMessage.builder()
                       .businessKey(orderId)
                       .businessModule("SAVE_ORDER")
                       .content(content)
                       .build()
       );
       log.info("Save order:{} success...", orderId);
   }
}

The test results are as follows:


The simulated order data is saved successfully, and the RabbitMQ message is normally sent to the RabbitMQ server after the transaction is successfully submitted, as shown in the RabbitMQ console data.

Summary

The transaction message module only makes asynchronous message push more complete. In practice, a well-designed asynchronous messaging system also provides a synchronous query interface, because asynchronous messages come with no callback or response.

Generally speaking, the throughput of a system is positively correlated with the proportion of work it handles asynchronously (see Amdahl's Law). Asynchronous interaction should therefore be used wherever possible in system architecture design, to improve throughput and reduce unnecessary waiting caused by synchronous blocking. The transaction message module can be extended with a back-office management module, and can even work with Micrometer, Prometheus and Grafana for real-time monitoring.

The repository for this demo project:
rabbit-transactional-message

To start the demo normally, MySQL, Redis and RabbitMQ must be installed locally, and a new database named local must be created.


Tags: Java RabbitMQ Spring MySQL JDBC

Posted on Wed, 27 May 2020 06:21:45 -0400 by zrosen88