A reusable distributed transaction message architecture scheme based on RabbitMQ!

Author: Throwable
Blog Park: https://www.cnblogs.com/throwable/p/12266806.html

Premise

Distributed transactions are a hard problem in microservice practice. The microservice schemes the author has implemented all compromise on or avoid strong consistency. Drawing on the local message table scheme proposed by eBay many years ago, the author implemented a low-intrusion transactional message module based on RabbitMQ and MySQL (JDBC). This article analyzes the design and implementation of the whole scheme in detail. The environment dependencies are as follows:

  • JDK1.8+
  • spring-boot-starter-web:2.x.x
  • spring-boot-starter-jdbc:2.x.x
  • spring-boot-starter-amqp:2.x.x
  • HikariCP:3.x.x (included by spring-boot-starter-jdbc)
  • mysql-connector-java:5.1.48
  • redisson:3.12.1

Scheme Design Ideas

Transaction messages are in principle only suitable for scenarios with weak consistency (or eventual consistency). Common examples include:

  • The user service completed the registration action and pushed a marketing-related message to the SMS service.
  • In the credit system, the order service saves the order and pushes a record of the order pending approval to the approval service.
  • ......

Transaction messages should generally not be used in scenarios that require strong consistency.

In general, requiring strong consistency implies strict synchronization: all operations must succeed or fail together, which introduces additional synchronization cost. If a transactional message module is designed sensibly, with compensation, query, and monitoring functions all in place, then because system interaction is asynchronous, overall throughput is higher than with strict synchronization. In the business systems the author is responsible for, there is also a basic rule for using transactional messages: as long as the message content is correct, the consumer must handle any exceptions that occur on its own side.

Simply put, once the upstream has guaranteed the correctness of its own business and successfully pushed a correct message to RabbitMQ, its obligations are considered fulfilled.

To keep the code low-intrusion, transactional messages rely on Spring's programmatic or declarative transactions. Programmatic transactions typically rely on TransactionTemplate, while declarative transactions rely on the AOP module and the @Transactional annotation.

Next, a custom transactional message module is needed, along with a new transactional message record table (essentially a local message table) that holds every message record to be sent. The main functions of the transactional message module are:

  • Save message records.
  • Push messages to the RabbitMQ server.
  • Query message records, perform compensation pushes, and so on.

Logical unit of transaction execution

Within the logical unit of transaction execution, the transaction message record to be pushed must be saved; that is, the local (business) logic and the saving of the transaction message record are bound to the same transaction.

Sending the message to the RabbitMQ server must be deferred until after the transaction commits, so that a successful transaction commit and a successful message send stay consistent with each other.

To merge the two actions (saving the transactional message to be sent and sending the message to RabbitMQ) into a single action from the caller's perspective, Spring's TransactionSynchronization is used. The points at which the main methods of the transaction synchronizer are called back are analyzed here, mainly with reference to AbstractPlatformTransactionManager#commit() and AbstractPlatformTransactionManager#processCommit():

The figure above only shows the case where the transaction commits correctly (without exceptions). It makes clear that both afterCommit() and afterCompletion(int status) of TransactionSynchronization are called back after the real commit point, AbstractPlatformTransactionManager#doCommit(), so either of these two methods can be used to push the message to the RabbitMQ server. The overall pseudocode is as follows:

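@Transactional
public Dto businessMethod() {
    // ... business logic ...
    // Save the transaction message record within the current transaction (placeholder method)
    saveTransactionMessageRecord();
    // Register a transaction synchronizer that pushes the message to RabbitMQ in afterCommit()
    TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronizationAdapter() {
        @Override
        public void afterCommit() {
            sendMessageToRabbitMq(); // placeholder method
        }
    });
    // ... business logic ...
    return dto; // placeholder result
}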

In the pseudocode above, the two steps of saving the transaction message and registering the transaction synchronizer can be placed anywhere in the transaction method; their position relative to the other business code does not matter.
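
Why the position inside the method does not matter can be illustrated with a toy model. The sketch below is not Spring code: ToySynchronization, ToyTransaction, and AfterCommitDemo are hypothetical names, and the model only imitates the ordering described above (business code first, then the commit, then the afterCommit() callbacks, which are skipped on rollback).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a transaction synchronizer; NOT Spring's interface. */
interface ToySynchronization {
    void afterCommit();
}

/** Toy transaction that imitates the order: business code, commit, then afterCommit(). */
final class ToyTransaction {
    private final List<ToySynchronization> synchronizations = new ArrayList<>();
    private final List<String> events = new ArrayList<>();

    void register(ToySynchronization synchronization) {
        synchronizations.add(synchronization);
    }

    void record(String event) {
        events.add(event);
    }

    /** Runs the business code, then "commits", then fires the registered callbacks. */
    List<String> execute(Runnable businessCode) {
        try {
            businessCode.run();
        } catch (RuntimeException e) {
            events.add("rollback"); // afterCommit() callbacks are skipped on rollback
            return events;
        }
        events.add("commit"); // plays the role of doCommit()
        for (ToySynchronization synchronization : synchronizations) {
            synchronization.afterCommit();
        }
        return events;
    }
}

final class AfterCommitDemo {
    /** Registers the push callback BEFORE the writes to show that its position is irrelevant. */
    static List<String> run(boolean failBusiness) {
        ToyTransaction tx = new ToyTransaction();
        return tx.execute(() -> {
            tx.register(() -> tx.record("push message to broker"));
            tx.record("save message record");
            tx.record("save business data");
            if (failBusiness) {
                throw new IllegalStateException("business failure");
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // the push happens only after "commit"
        System.out.println(run(true));  // the push never happens
    }
}
```

Even though the callback is registered first, the push is recorded last, and it is not recorded at all when the business code throws.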

Compensation for Transaction Messages

Although the author recommends that downstream services handle their own consumption failures, the upstream is sometimes forced to push the corresponding message again; this is a special scenario. Another scenario to consider is that the afterCommit() method of TransactionSynchronization fails to run after the transaction commits. This is a low-probability scenario, but it is bound to occur in production. A typical cause is that the service instance is restarted after the transaction commits but before TransactionSynchronization#afterCommit() has been triggered to push the message.

As shown in the following figure:

To handle compensation pushes uniformly, a finite set of states is used to record whether a message has been pushed successfully:

  • Within the transaction method, when the transaction message is saved, the push status of the message record is marked as pending.
  • In the afterCommit() implementation of TransactionSynchronization, the corresponding message is pushed to RabbitMQ, and the status of the transaction message record is then changed to pushed successfully.

There is also a very special case in which the RabbitMQ server itself fails, causing message pushes to throw exceptions; retries (compensation pushes) are then required. Experience shows that retrying within a short interval is pointless, because a failed service usually does not recover instantly. Consider retrying with an exponential backoff algorithm while capping the maximum number of retries.

The backoff factor, the initial interval, and the maximum number of retries need to be set according to the actual situation; otherwise, messages may be delayed too long or retried too frequently.
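
To get a feel for these three parameters, the following minimal sketch lists the delay before each retry round using the formula initBackoff * backoffFactor^round, which is the same formula the implementation later in this article applies. The class name BackoffSchedule and its methods are hypothetical; the values 10, 2, and 5 are the defaults used in the table design below.

```java
import java.util.ArrayList;
import java.util.List;

final class BackoffSchedule {

    /** Delay in seconds before the given retry round: initBackoff * backoffFactor^round. */
    static long delaySeconds(long initBackoff, long backoffFactor, int round) {
        return (long) (initBackoff * Math.pow(backoffFactor, round));
    }

    /** Delays for rounds 0 .. maxRetryTimes - 1; no retries are scheduled beyond that. */
    static List<Long> delays(long initBackoff, long backoffFactor, int maxRetryTimes) {
        List<Long> result = new ArrayList<>();
        for (int round = 0; round < maxRetryTimes; round++) {
            result.add(delaySeconds(initBackoff, backoffFactor, round));
        }
        return result;
    }

    public static void main(String[] args) {
        // 10 s initial backoff, factor 2, at most 5 retries
        System.out.println(delays(10, 2, 5)); // prints [10, 20, 40, 80, 160]
    }
}
```

With a factor of 2 the delays double on every round, so raising the retry limit by a few rounds quickly pushes the last retry minutes or hours away. This is why the three values have to be tuned together.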

Program implementation

Introducing core dependencies:

<properties>
    <spring.boot.version>2.2.4.RELEASE</spring.boot.version>
    <redisson.version>3.12.1</redisson.version>
    <mysql.connector.version>5.1.48</mysql.connector.version>
</properties>
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-dependencies</artifactId>
            <version>${spring.boot.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
<dependencies>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>${mysql.connector.version}</version>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-jdbc</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-aop</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-amqp</artifactId>
    </dependency>
    <dependency>
        <groupId>org.redisson</groupId>
        <artifactId>redisson</artifactId>
        <version>${redisson.version}</version>
    </dependency>
</dependencies>

spring-boot-starter-jdbc, mysql-connector-java, and spring-boot-starter-aop are related to MySQL transactions, spring-boot-starter-amqp wraps the RabbitMQ client, and redisson is used mainly for its distributed lock, which guards the scheduled compensation task so that multiple service nodes do not perform compensation pushes concurrently.

Table Design

The transaction message module mainly involves two tables. Taking MySQL as an example, the DDL is as follows:

CREATE TABLE `t_transactional_message`
(
    id                  BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    create_time         DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP,
    edit_time           DATETIME        NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    creator             VARCHAR(20)     NOT NULL DEFAULT 'admin',
    editor              VARCHAR(20)     NOT NULL DEFAULT 'admin',
    deleted             TINYINT         NOT NULL DEFAULT 0,
    current_retry_times TINYINT         NOT NULL DEFAULT 0 COMMENT 'Current number of retries',
    max_retry_times     TINYINT         NOT NULL DEFAULT 5 COMMENT 'max retries',
    queue_name          VARCHAR(255)    NOT NULL COMMENT 'Queue name',
    exchange_name       VARCHAR(255)    NOT NULL COMMENT 'Exchange name',
    exchange_type       VARCHAR(8)      NOT NULL COMMENT 'Exchange Type',
    routing_key         VARCHAR(255) COMMENT 'Routing Key',
    business_module     VARCHAR(32)     NOT NULL COMMENT 'Business Modules',
    business_key        VARCHAR(255)    NOT NULL COMMENT 'Business Key',
    next_schedule_time  DATETIME        NOT NULL COMMENT 'Next Schedule Time',
    message_status      TINYINT         NOT NULL DEFAULT 0 COMMENT 'Message Status',
    init_backoff        BIGINT UNSIGNED NOT NULL DEFAULT 10 COMMENT 'Initial backoff value, in seconds',
    backoff_factor      TINYINT         NOT NULL DEFAULT 2 COMMENT 'Backoff factor (the base of the exponential)',
    INDEX idx_queue_name (queue_name),
    INDEX idx_create_time (create_time),
    INDEX idx_next_schedule_time (next_schedule_time),
    INDEX idx_business_key (business_key)
) COMMENT 'Transaction message table';

CREATE TABLE `t_transactional_message_content`
(
    id         BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    message_id BIGINT UNSIGNED NOT NULL COMMENT 'Transaction message logging ID',
    content    TEXT COMMENT 'Message Content'
) COMMENT 'Transaction Message Content Table';

Because this module may later be extended into a back-office management module, the management and status fields of a message and the bulky message content are stored in two separate tables. This avoids high I/O load on the MySQL service when message records are queried in bulk (a sensible solution reached after discussion with the DBA team at the author's previous company). Two business fields, business_module and business_key, are reserved to identify the business module and the business key (typically a unique identifier such as an order number).

In general, if a service declares the binding between queue and exchange in advance through its own configuration, sending a RabbitMQ message depends only on exchangeName and routingKey (headers exchanges are special and rarely used, so they are not considered here for now). Considering that a service may forget the declaration, the first time a message is sent to a given queue the module declares the queue, exchange, and binding, and caches the fact that it has done so. (Re-declaring a queue-exchange binding in RabbitMQ does not throw an exception as long as the parameters are the same on every declaration.)

Scheme Code Design

In the scheme code design below, the API design for the transactional message management back office is ignored for now; it can be added later.

Define the model entity classes TransactionalMessage and TransactionalMessageContent:

@Data
public class TransactionalMessage {

    private Long id;
    private LocalDateTime createTime;
    private LocalDateTime editTime;
    private String creator;
    private String editor;
    private Integer deleted;
    private Integer currentRetryTimes;
    private Integer maxRetryTimes;
    private String queueName;
    private String exchangeName;
    private String exchangeType;
    private String routingKey;
    private String businessModule;
    private String businessKey;
    private LocalDateTime nextScheduleTime;
    private Integer messageStatus;
    private Long initBackoff;
    private Integer backoffFactor;
}

@Data
public class TransactionalMessageContent {

    private Long id;
    private Long messageId;
    private String content;
}

Then define the DAO interfaces. The implementation details are not expanded here; storage uses MySQL, and switching to another type of database only requires a different implementation:

public interface TransactionalMessageDao {

    void insertSelective(TransactionalMessage record);

    void updateStatusSelective(TransactionalMessage record);

    List<TransactionalMessage> queryPendingCompensationRecords(LocalDateTime minScheduleTime,
                                                               LocalDateTime maxScheduleTime,
                                                               int limit);
}

public interface TransactionalMessageContentDao {

    void insert(TransactionalMessageContent record);

    List<TransactionalMessageContent> queryByMessageIds(String messageIds);
}
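
Since the DAO implementation is not shown, the following in-memory sketch illustrates the kind of filter queryPendingCompensationRecords might apply. The exact conditions are an assumption (not yet successful, retries not exhausted, next_schedule_time inside the window, at most limit rows), and MessageRow and PendingFilter are hypothetical names used only for this illustration.

```java
import java.time.LocalDateTime;
import java.util.List;
import java.util.stream.Collectors;

/** Trimmed-down stand-in for TransactionalMessage with only the fields the filter needs. */
final class MessageRow {
    final long id;
    final int messageStatus; // 1 = success, 0 = pending, -1 = failed
    final int currentRetryTimes;
    final int maxRetryTimes;
    final LocalDateTime nextScheduleTime;

    MessageRow(long id, int messageStatus, int currentRetryTimes, int maxRetryTimes,
               LocalDateTime nextScheduleTime) {
        this.id = id;
        this.messageStatus = messageStatus;
        this.currentRetryTimes = currentRetryTimes;
        this.maxRetryTimes = maxRetryTimes;
        this.nextScheduleTime = nextScheduleTime;
    }
}

final class PendingFilter {
    private static final int SUCCESS = 1;

    /** Returns up to 'limit' rows that still need a compensation push within [min, max]. */
    static List<MessageRow> queryPending(List<MessageRow> table,
                                         LocalDateTime min,
                                         LocalDateTime max,
                                         int limit) {
        return table.stream()
                // assumed condition: anything that is not SUCCESS is a candidate
                .filter(row -> row.messageStatus != SUCCESS)
                // assumed condition: stop once the retry limit has been reached
                .filter(row -> row.currentRetryTimes < row.maxRetryTimes)
                // the next schedule time must fall inside the inclusive window
                .filter(row -> !row.nextScheduleTime.isBefore(min)
                        && !row.nextScheduleTime.isAfter(max))
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```

In a real implementation these conditions would live in the WHERE clause of a SQL statement, where the index on next_schedule_time from the DDL above can serve the time-window condition.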

Next, define the Transactional Message Service interface:

// Externally Provided Service Class Interfaces
public interface TransactionalMessageService {

    void sendTransactionalMessage(Destination destination, TxMessage message);
}


@Getter
@RequiredArgsConstructor
public enum ExchangeType {

    FANOUT("fanout"),

    DIRECT("direct"),

    TOPIC("topic"),

    DEFAULT(""),

    ;

    private final String type;
}

// Destination to send message
public interface Destination {

    ExchangeType exchangeType();

    String queueName();

    String exchangeName();

    String routingKey();
}

@Builder
public class DefaultDestination implements Destination {

    private ExchangeType exchangeType;
    private String queueName;
    private String exchangeName;
    private String routingKey;

    @Override
    public ExchangeType exchangeType() {
        return exchangeType;
    }

    @Override
    public String queueName() {
        return queueName;
    }

    @Override
    public String exchangeName() {
        return exchangeName;
    }

    @Override
    public String routingKey() {
        return routingKey;
    }
}

// Transaction message
public interface TxMessage {

    String businessModule();

    String businessKey();

    String content();
}

@Builder
public class DefaultTxMessage implements TxMessage {

    private String businessModule;
    private String businessKey;
    private String content;

    @Override
    public String businessModule() {
        return businessModule;
    }

    @Override
    public String businessKey() {
        return businessKey;
    }

    @Override
    public String content() {
        return content;
    }
}

// Message Status
@Getter
@RequiredArgsConstructor
public enum TxMessageStatus {

    /**
     * Success
     */
    SUCCESS(1),

    /**
     * Pending Processing
     */
    PENDING(0),

    /**
     * Processing Failure
     */
    FAIL(-1),

    ;

    private final Integer status;
}

The implementation class of TransactionalMessageService is the core of the transactional message function. The code is as follows:

@Slf4j
@Service
@RequiredArgsConstructor
public class RabbitTransactionalMessageService implements TransactionalMessageService {

    private final AmqpAdmin amqpAdmin;
    private final TransactionalMessageManagementService managementService;

    private static final ConcurrentMap<String, Boolean> QUEUE_ALREADY_DECLARE = new ConcurrentHashMap<>();

    @Override
    public void sendTransactionalMessage(Destination destination, TxMessage message) {
        String queueName = destination.queueName();
        String exchangeName = destination.exchangeName();
        String routingKey = destination.routingKey();
        ExchangeType exchangeType = destination.exchangeType();
        // Atomically declare the queue, exchange, and binding the first time this queue is used
        QUEUE_ALREADY_DECLARE.computeIfAbsent(queueName, k -> {
            Queue queue = new Queue(queueName);
            amqpAdmin.declareQueue(queue);
            Exchange exchange = new CustomExchange(exchangeName, exchangeType.getType());
            amqpAdmin.declareExchange(exchange);
            Binding binding = BindingBuilder.bind(queue).to(exchange).with(routingKey).noargs();
            amqpAdmin.declareBinding(binding);
            return true;
        });
        TransactionalMessage record = new TransactionalMessage();
        record.setQueueName(queueName);
        record.setExchangeName(exchangeName);
        record.setExchangeType(exchangeType.getType());
        record.setRoutingKey(routingKey);
        record.setBusinessModule(message.businessModule());
        record.setBusinessKey(message.businessKey());
        String content = message.content();
        // Save Transaction Message Record
        managementService.saveTransactionalMessageRecord(record, content);
        // Register Transaction Synchronizer
        TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronizationAdapter() {
            @Override
            public void afterCommit() {
                managementService.sendMessageSync(record, content);
            }
        });
    }
}

Management of message record status and content persistence is centralized in TransactionalMessageManagementService:

@Slf4j
@RequiredArgsConstructor
@Service
public class TransactionalMessageManagementService {

    private final TransactionalMessageDao messageDao;
    private final TransactionalMessageContentDao contentDao;
    private final RabbitTemplate rabbitTemplate;

    private static final LocalDateTime END = LocalDateTime.of(2999, 1, 1, 0, 0, 0);
    private static final long DEFAULT_INIT_BACKOFF = 10L;
    private static final int DEFAULT_BACKOFF_FACTOR = 2;
    private static final int DEFAULT_MAX_RETRY_TIMES = 5;
    private static final int LIMIT = 100;

    public void saveTransactionalMessageRecord(TransactionalMessage record, String content) {
        record.setMessageStatus(TxMessageStatus.PENDING.getStatus());
        record.setNextScheduleTime(calculateNextScheduleTime(LocalDateTime.now(), DEFAULT_INIT_BACKOFF,
                DEFAULT_BACKOFF_FACTOR, 0));
        record.setCurrentRetryTimes(0);
        record.setInitBackoff(DEFAULT_INIT_BACKOFF);
        record.setBackoffFactor(DEFAULT_BACKOFF_FACTOR);
        record.setMaxRetryTimes(DEFAULT_MAX_RETRY_TIMES);
        messageDao.insertSelective(record);
        TransactionalMessageContent messageContent = new TransactionalMessageContent();
        messageContent.setContent(content);
        messageContent.setMessageId(record.getId());
        contentDao.insert(messageContent);
    }

    public void sendMessageSync(TransactionalMessage record, String content) {
        try {
            rabbitTemplate.convertAndSend(record.getExchangeName(), record.getRoutingKey(), content);
            if (log.isDebugEnabled()) {
                log.debug("Send message successfully,Target Queue:{},Message Content:{}", record.getQueueName(), content);
            }
            // Mark Success
            markSuccess(record);
        } catch (Exception e) {
            // Marking Failure
            markFail(record, e);
        }
    }

    private void markSuccess(TransactionalMessage record) {
        // Mark next execution time as maximum
        record.setNextScheduleTime(END);
        record.setCurrentRetryTimes(record.getCurrentRetryTimes().compareTo(record.getMaxRetryTimes()) >= 0 ?
                record.getMaxRetryTimes() : record.getCurrentRetryTimes() + 1);
        record.setMessageStatus(TxMessageStatus.SUCCESS.getStatus());
        record.setEditTime(LocalDateTime.now());
        messageDao.updateStatusSelective(record);
    }

    private void markFail(TransactionalMessage record, Exception e) {
        log.error("Failed to send message,Target Queue:{}", record.getQueueName(), e);
        record.setCurrentRetryTimes(record.getCurrentRetryTimes().compareTo(record.getMaxRetryTimes()) >= 0 ?
                record.getMaxRetryTimes() : record.getCurrentRetryTimes() + 1);
        // Calculate next execution time
        LocalDateTime nextScheduleTime = calculateNextScheduleTime(
                record.getNextScheduleTime(),
                record.getInitBackoff(),
                record.getBackoffFactor(),
                record.getCurrentRetryTimes()
        );
        record.setNextScheduleTime(nextScheduleTime);
        record.setMessageStatus(TxMessageStatus.FAIL.getStatus());
        record.setEditTime(LocalDateTime.now());
        messageDao.updateStatusSelective(record);
    }

    /**
     * Calculate Next Execution Time
     *
     * @param base          Base Time
     * @param initBackoff   Backoff base value
     * @param backoffFactor Backoff Index
     * @param round         Number of rounds
     * @return LocalDateTime
     */
    private LocalDateTime calculateNextScheduleTime(LocalDateTime base,
                                                    long initBackoff,
                                                    long backoffFactor,
                                                    long round) {
        double delta = initBackoff * Math.pow(backoffFactor, round);
        return base.plusSeconds((long) delta);
    }

    /**
     * Push Compensation - The parameters inside should be customized to the actual scene
     */
    public void processPendingCompensationRecords() {
        // Upper bound: the current time minus the initial backoff, so that messages that were just saved are not pushed again
        LocalDateTime max = LocalDateTime.now().plusSeconds(-DEFAULT_INIT_BACKOFF);
        // Lower bound: the upper bound minus one hour
        LocalDateTime min = max.plusHours(-1);
        Map<Long, TransactionalMessage> collect = messageDao.queryPendingCompensationRecords(min, max, LIMIT)
                .stream()
                .collect(Collectors.toMap(TransactionalMessage::getId, x -> x));
        if (!collect.isEmpty()) {
            StringJoiner joiner = new StringJoiner(",", "(", ")");
            collect.keySet().forEach(x -> joiner.add(x.toString()));
            contentDao.queryByMessageIds(joiner.toString())
                    .forEach(item -> {
                        TransactionalMessage message = collect.get(item.getMessageId());
                        sendMessageSync(message, item.getContent());
                    });
        }
    }
}

One thing could be optimized here: the status updates of transaction message records could be done in bulk, which is more efficient when the limit is large. Finally, the configuration class for the scheduled task:

@Slf4j
@RequiredArgsConstructor
@Configuration
@EnableScheduling
public class ScheduleJobAutoConfiguration {

    private final TransactionalMessageManagementService managementService;

    /**
     * Connects to a local Redis instance with default settings; configure this for the actual environment
     */
    private final RedissonClient redisson = Redisson.create();

    @Scheduled(fixedDelay = 10000)
    public void transactionalMessageCompensationTask() throws Exception {
        RLock lock = redisson.getLock("transactionalMessageCompensationTask");
        // Wait up to 5 seconds for the lock and hold it for at most 300 seconds; tune both values for the actual scenario
        boolean tryLock = lock.tryLock(5, 300, TimeUnit.SECONDS);
        if (tryLock) {
            try {
                long start = System.currentTimeMillis();
                log.info("Start executing transaction message push compensation timer task...");
                managementService.processPendingCompensationRecords();
                long end = System.currentTimeMillis();
                long delta = end - start;
                // In case lock is released prematurely
                if (delta < 5000) {
                    Thread.sleep(5000 - delta);
                }
                log.info("Execute Transaction Message Push Compensation Timer Task Completed,time consuming:{} ms...", end - start);
            } finally {
                lock.unlock();
            }
        }
    }
}
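
Returning to the bulk-update optimization mentioned before the scheduled task: one possible shape, sketched here with plain JDBC rather than the DAO layer used in this article, is a single UPDATE with one bound parameter per record ID. The class BulkStatusUpdate and its methods are hypothetical, and the statement is simplified to touch only message_status (the markSuccess() method above also moves next_schedule_time).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

final class BulkStatusUpdate {

    /** Builds "UPDATE ... WHERE id IN (?, ?, ...)" with one placeholder per ID. */
    static String buildMarkSuccessSql(int idCount) {
        if (idCount <= 0) {
            throw new IllegalArgumentException("idCount must be positive");
        }
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "UPDATE t_transactional_message SET message_status = 1 WHERE id IN ("
                + placeholders + ")";
    }

    /** Marks all given records as pushed successfully in a single statement. */
    static int markSuccess(Connection connection, List<Long> ids) throws SQLException {
        if (ids.isEmpty()) {
            return 0; // nothing to update
        }
        try (PreparedStatement statement = connection.prepareStatement(buildMarkSuccessSql(ids.size()))) {
            for (int i = 0; i < ids.size(); i++) {
                statement.setLong(i + 1, ids.get(i)); // JDBC parameters are 1-based
            }
            return statement.executeUpdate();
        }
    }
}
```

Binding the IDs as parameters instead of concatenating them into the SQL text keeps the statement safe from injection. The compensation loop would collect the IDs of successfully pushed records and call markSuccess() once per batch.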

The basic code is now complete, and the overall project structure is as follows:

Finally, add two test classes:

@RequiredArgsConstructor
@Component
public class MockBusinessRunner implements CommandLineRunner {

    private final MockBusinessService mockBusinessService;

    @Override
    public void run(String... args) throws Exception {
        mockBusinessService.saveOrder();
    }
}

@Slf4j
@RequiredArgsConstructor
@Service
public class MockBusinessService {

    private final JdbcTemplate jdbcTemplate;
    private final TransactionalMessageService transactionalMessageService;
    private final ObjectMapper objectMapper;

    @Transactional(rollbackFor = Exception.class)
    public void saveOrder() throws Exception {
        String orderId = UUID.randomUUID().toString();
        BigDecimal amount = BigDecimal.valueOf(100L);
        Map<String, Object> message = new HashMap<>();
        message.put("orderId", orderId);
        message.put("amount", amount);
        jdbcTemplate.update("INSERT INTO t_order(order_id,amount) VALUES (?,?)", p -> {
            p.setString(1, orderId);
            p.setBigDecimal(2, amount);
        });
        String content = objectMapper.writeValueAsString(message);
        transactionalMessageService.sendTransactionalMessage(
                DefaultDestination.builder()
                        .exchangeName("tm.test.exchange")
                        .queueName("tm.test.queue")
                        .routingKey("tm.test.key")
                        .exchangeType(ExchangeType.DIRECT)
                        .build(),
                DefaultTxMessage.builder()
                        .businessKey(orderId)
                        .businessModule("SAVE_ORDER")
                        .content(content)
                        .build()
        );
        log.info("Save Order:{}Success...", orderId);
    }
}

The results of one test are as follows:

The simulated order data was saved successfully, and after the transaction committed successfully the message was delivered to the RabbitMQ server as expected, as shown by the data in the RabbitMQ console.

Summary

The transactional message module is designed only to make asynchronous message push more complete. In fact, a well-designed asynchronous messaging system will certainly also provide a synchronous query interface, because asynchronous messages have no callback or response. In general, a system's throughput is positively correlated with the proportion of asynchronous processing in it (this can be seen from Amdahl's Law), so asynchronous interaction should be used as much as possible in system architecture design to improve throughput while reducing unnecessary waiting caused by synchronous blocking. The transactional message module can be extended with a back-office management system, and real-time monitoring can even be built with Micrometer, Prometheus, and Grafana.

The repository for this demo project: rabbit-transactional-message

The demo requires MySQL, Redis, and RabbitMQ to be installed locally in order to start properly, and a new database named local must be created.

Tags: Programming RabbitMQ Spring MySQL JDBC

Posted on Thu, 06 Feb 2020 22:48:04 -0500 by bigfatpig