JPA multi data source distributed transaction processing: two transaction schemes

Preface

Transaction processing across multiple data sources is a common topic. Managing a transaction that spans two data sources is a form of distributed transaction, even when both are accessed from the same JVM. The classic solution is JTA (the Java Transaction API, Java's standard transaction abstraction modeled on the XA protocol) combined with XA (the XA transaction protocol). Common JTA implementations include Atomikos, Bitronix, and Narayana, and Spring Boot wraps some of them as components that can be used out of the box. In addition to the XA scheme, this article presents another multi data source transaction solution and a different perspective on the problem.

Problem background

While working on field desensitization (encryption of sensitive columns) for MySQL with the desensitization component of Sharding-JDBC, I wanted to keep the SQL compatible and minimize changes to the application. My solution was to integrate multiple data sources: only operations on tables that contain desensitized fields go through the Sharding-JDBC desensitization data source. This solved the original problem but introduced a new one: each data source has its own independent transaction. As I mentioned in the article "JPA project multi data source mode integration sharding JDBC data desensitization", in a Spring context each data source corresponds to an independent transaction manager, and the default transaction manager is bound to the business data source. Therefore, code that touches encrypted data must set the transaction manager name in the @Transactional annotation to the one that belongs to the desensitization data source. This works for simple scenarios, but real business code often has a single transaction that covers operations on both data sources, and then neither transaction manager is the right choice. What is needed is a transaction manager that spans multiple data sources.

XA transaction scheme

The XA protocol uses 2PC (two-phase commit) to manage distributed transactions, and the XA interface defines a standard way for resource managers and the transaction manager to communicate. In JDBC's XA-related API, the relevant interfaces are defined as follows.

XADataSource, XA protocol data source

public interface XADataSource extends CommonDataSource {
    /**
     * Attempts to establish a physical database connection that can be used in a distributed transaction.
     */
    XAConnection getXAConnection() throws SQLException;

    // Non-critical methods such as getLogWriter are omitted
}

XAConnection

public interface XAConnection extends PooledConnection {

    /**
     * Retrieves an {@code XAResource} object that the transaction manager will use to manage this {@code XAConnection} object's participation in a distributed transaction.
     */
    javax.transaction.xa.XAResource getXAResource() throws SQLException;
}

XAResource

public interface XAResource {
    /**
     * Commit the global transaction specified by xid
     */
    void commit(Xid xid, boolean onePhase) throws XAException;

    /**
     * Ends work performed on behalf of a transaction branch. The resource manager detaches the XA resource from the specified transaction branch and lets the transaction complete.
     */
    void end(Xid xid, int flags) throws XAException;

    /**
     * Tells the resource manager to forget about a heuristically completed transaction branch.
     */
    void forget(Xid xid) throws XAException;

    /**
     * Determines whether the resource manager instance represented by this object is the same as the one represented by xares.
     */
    boolean isSameRM(XAResource xares) throws XAException;

    /**
     * Asks the resource manager to prepare for a commit of the transaction branch specified in xid.
     */
    int prepare(Xid xid) throws XAException;

    /**
     * Obtains a list of prepared transaction branches from the resource manager. The transaction manager calls this method during recovery
     * to get the transaction branches that are currently in a prepared or heuristically completed state.
     */
    Xid[] recover(int flag) throws XAException;

    /**
     * Informs the resource manager to roll back work done on behalf of the transaction branch.
     */
    void rollback(Xid xid) throws XAException;

    /**
     * Starts work on behalf of the transaction branch specified in xid.
     */
    void start(Xid xid, int flags) throws XAException;

    //Omit non critical methods
}
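To make the two-phase flow more concrete, here is a minimal sketch of how a transaction manager might drive several XAResource instances. It uses only the javax.transaction.xa types from the JDK. RecordingResource, SimpleXid and runTwoPhase are hypothetical names invented for this illustration, the resources are in-memory fakes rather than real database connections, and a real JTA implementation such as Atomikos also handles per-branch xids, logging and recovery, which are left out here.

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TwoPhaseCommitSketch {

    /** Hypothetical in-memory resource manager that only records the calls it receives. */
    static final class RecordingResource implements XAResource {
        private final String name;
        private final boolean failPrepare;
        private final List<String> calls;

        RecordingResource(String name, boolean failPrepare, List<String> calls) {
            this.name = name;
            this.failPrepare = failPrepare;
            this.calls = calls;
        }

        @Override public void start(Xid xid, int flags) { calls.add(name + ".start"); }
        @Override public void end(Xid xid, int flags) { calls.add(name + ".end"); }
        @Override public int prepare(Xid xid) throws XAException {
            calls.add(name + ".prepare");
            if (failPrepare) {
                throw new XAException(XAException.XA_RBROLLBACK); // vote "no"
            }
            return XAResource.XA_OK; // vote "yes"
        }
        @Override public void commit(Xid xid, boolean onePhase) { calls.add(name + ".commit"); }
        @Override public void rollback(Xid xid) { calls.add(name + ".rollback"); }
        @Override public void forget(Xid xid) { }
        @Override public boolean isSameRM(XAResource other) { return other == this; }
        @Override public Xid[] recover(int flag) { return new Xid[0]; }
        @Override public int getTransactionTimeout() { return 0; }
        @Override public boolean setTransactionTimeout(int seconds) { return false; }
    }

    /** Minimal Xid; a real transaction manager generates a distinct branch qualifier per resource. */
    static final class SimpleXid implements Xid {
        @Override public int getFormatId() { return 1; }
        @Override public byte[] getGlobalTransactionId() { return new byte[] {1}; }
        @Override public byte[] getBranchQualifier() { return new byte[] {1}; }
    }

    /** Drives both phases; returns true if the transaction committed, false if it was rolled back. */
    static boolean runTwoPhase(List<? extends XAResource> resources, Xid xid) throws XAException {
        // Each resource does its work between start and end.
        for (XAResource r : resources) {
            r.start(xid, XAResource.TMNOFLAGS);
            r.end(xid, XAResource.TMSUCCESS);
        }
        try {
            // Phase 1: every resource must vote to commit.
            for (XAResource r : resources) {
                r.prepare(xid);
            }
        } catch (XAException voteNo) {
            // Any "no" vote rolls back every branch (a real manager would skip branches that already rolled back).
            for (XAResource r : resources) {
                r.rollback(xid);
            }
            return false;
        }
        // Phase 2: all resources voted yes, so commit every branch.
        for (XAResource r : resources) {
            r.commit(xid, false);
        }
        return true;
    }

    /** Runs the protocol against two fake resources and returns the recorded call sequence. */
    static List<String> demo(boolean secondFailsPrepare) {
        List<String> calls = new ArrayList<>();
        List<XAResource> resources = Arrays.asList(
                new RecordingResource("A", false, calls),
                new RecordingResource("B", secondFailsPrepare, calls));
        try {
            boolean committed = runTwoPhase(resources, new SimpleXid());
            calls.add(committed ? "committed" : "rolled back");
        } catch (XAException e) {
            calls.add("error");
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(demo(false));
        System.out.println(demo(true));
    }
}
```

The point of the sketch is the ordering: no resource is asked to commit until every resource has voted yes in prepare, and a single failed prepare causes rollback on all branches.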

Compared with ordinary transaction management, the JDBC XA API adds an XAResource, which represents the resource manager and controls the XA transaction operations (start, end, prepare, commit, and rollback). These calls are made internally by the framework; at the development level, the main visible change is that the data source becomes an XADataSource. The JTA abstraction defines the UserTransaction and TransactionManager interfaces, and an implementation of both is required before JTA transactions can be used. The following shows how to use JTA + XA to control transactions across multiple data sources, taking Atomikos in Spring Boot as an example.

Add the Atomikos dependency:

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-jta-atomikos</artifactId>
        </dependency>

Spring Boot already provides an auto-configuration class that sets up the XA transaction manager:

Create JTA transaction manager

@Configuration(proxyBeanMethods = false)
@EnableConfigurationProperties({ AtomikosProperties.class, JtaProperties.class })
@ConditionalOnClass({ JtaTransactionManager.class, UserTransactionManager.class })
@ConditionalOnMissingBean(PlatformTransactionManager.class)
class AtomikosJtaConfiguration {

	@Bean(initMethod = "init", destroyMethod = "shutdownWait")
	@ConditionalOnMissingBean(UserTransactionService.class)
	UserTransactionServiceImp userTransactionService(AtomikosProperties atomikosProperties,
			JtaProperties jtaProperties) {
		Properties properties = new Properties();
		if (StringUtils.hasText(jtaProperties.getTransactionManagerId())) {
			properties.setProperty("com.atomikos.icatch.tm_unique_name", jtaProperties.getTransactionManagerId());
		}
		properties.setProperty("com.atomikos.icatch.log_base_dir", getLogBaseDir(jtaProperties));
		properties.putAll(atomikosProperties.asProperties());
		return new UserTransactionServiceImp(properties);
	}
	@Bean(initMethod = "init", destroyMethod = "close")
	@ConditionalOnMissingBean(TransactionManager.class)
	UserTransactionManager atomikosTransactionManager(UserTransactionService userTransactionService) throws Exception {
		UserTransactionManager manager = new UserTransactionManager();
		manager.setStartupTransactionService(false);
		manager.setForceShutdown(true);
		return manager;
	}
	@Bean
	@ConditionalOnMissingBean(XADataSourceWrapper.class)
	AtomikosXADataSourceWrapper xaDataSourceWrapper() {
		return new AtomikosXADataSourceWrapper();
	}
	@Bean
	JtaTransactionManager transactionManager(UserTransaction userTransaction, TransactionManager transactionManager,
			ObjectProvider<TransactionManagerCustomizers> transactionManagerCustomizers) {
		JtaTransactionManager jtaTransactionManager = new JtaTransactionManager(userTransaction, transactionManager);
		transactionManagerCustomizers.ifAvailable((customizers) -> customizers.customize(jtaTransactionManager));
		return jtaTransactionManager;
	}
	// Remaining bean definitions omitted
}

As the configuration shows, XA transactions require implementations of UserTransaction and TransactionManager, as well as an XADataSource. However, Sharding-JDBC proxies a plain DataSource, so the XADataSource has to be wrapped as an ordinary DataSource. Spring Boot provides AtomikosXADataSourceWrapper for this purpose, and AtomikosJtaConfiguration already registers it in the Spring context, so the wrapper instance can be injected directly when defining a custom data source. Finally, because this is a JPA environment, the JPA transaction type must be set to JTA when the EntityManagerFactory is created. Putting this together, the default business data source configuration is as follows:

/**
 * @author: kl @kailing.pub
 * @date: 2020/5/18
 */
@Configuration
@EnableConfigurationProperties({JpaProperties.class, DataSourceProperties.class})
public class DataSourceConfiguration{

    @Primary
    @Bean
    public DataSource dataSource(AtomikosXADataSourceWrapper wrapper, DataSourceProperties dataSourceProperties) throws Exception {
        MysqlXADataSource dataSource = dataSourceProperties.initializeDataSourceBuilder().type(MysqlXADataSource.class).build();
        return wrapper.wrapDataSource(dataSource);
    }

    @Primary
    @Bean(initMethod = "afterPropertiesSet")
    public LocalContainerEntityManagerFactoryBean entityManagerFactory(JpaProperties jpaProperties, DataSource dataSource, EntityManagerFactoryBuilder factoryBuilder) {
        return factoryBuilder.dataSource(dataSource)
                .packages(Constants.BASE_PACKAGES)
                .properties(jpaProperties.getProperties())
                .persistenceUnit("default")
                .jta(true)
                .build();
    }

    @Bean
    @Primary
    public EntityManager entityManager(EntityManagerFactory entityManagerFactory){
        // The shared EntityManager must be created with SharedEntityManagerCreator; otherwise transactions in SimpleJpaRepository will not take effect
        return SharedEntityManagerCreator.createSharedEntityManager(entityManagerFactory);
    }
}


The Sharding-JDBC encrypted data source and the ordinary business data source are actually backed by the same underlying data source. The only difference is that the encrypted one is proxied by the Sharding-JDBC encryption component, which adds the encryption and decryption logic. The configuration is as follows:

/**
 * @author: kl @kailing.pub
 * @date: 2020/5/18
 */
@Configuration
@EnableConfigurationProperties({JpaProperties.class,SpringBootEncryptRuleConfigurationProperties.class, SpringBootPropertiesConfigurationProperties.class})
public class EncryptDataSourceConfiguration {

    @Bean
    public DataSource encryptDataSource(DataSource dataSource,SpringBootPropertiesConfigurationProperties props,SpringBootEncryptRuleConfigurationProperties encryptRule) throws SQLException {
        return EncryptDataSourceFactory.createDataSource(dataSource, new EncryptRuleConfigurationYamlSwapper().swap(encryptRule), props.getProps());
    }

    @Bean(initMethod = "afterPropertiesSet")
    public LocalContainerEntityManagerFactoryBean encryptEntityManagerFactory(@Qualifier("encryptDataSource") DataSource dataSource,JpaProperties jpaProperties, EntityManagerFactoryBuilder factoryBuilder) throws SQLException {
        return factoryBuilder.dataSource(dataSource)
                .packages(Constants.BASE_PACKAGES)
                .properties(jpaProperties.getProperties())
                .persistenceUnit("encryptPersistenceUnit")
                .jta(true)
                .build();
    }

    @Bean
    public EntityManager encryptEntityManager(@Qualifier("encryptEntityManagerFactory") EntityManagerFactory entityManagerFactory){
        // The shared EntityManager must be created with SharedEntityManagerCreator; otherwise transactions in SimpleJpaRepository will not take effect
        return SharedEntityManagerCreator.createSharedEntityManager(entityManagerFactory);
    }
}

  • Problem 1: Connection pool exhausted - try increasing 'maxPoolSize' and/or 'borrowConnectionTimeout' on the DataSourceBean
  • Solution: the connection pool created by AtomikosXADataSourceWrapper has a default maximum size of 1, so raise it with a configuration parameter such as:
spring.jta.atomikos.datasource.max-pool-size=20

  • Problem 2: XAER_INVAL: Invalid arguments (or unsupported command)
  • Solution: this is caused by a defect in MySQL's XA implementation and only occurs when the same MySQL database is accessed more than once within a single transaction. Add the following parameter to the MySQL connection URL:
spring.datasource.url = jdbc:mysql://127.0.0.1:3306/xxx?pinGlobalTxToPhysicalConnection=true

MySQL XA transaction behavior

In this scenario there are multiple data sources, but they all connect to the same MySQL database. The XA transaction therefore begins with the first SQL statement that is executed (not at the JTA transaction begin stage): an xid is generated, XA START is issued, and XA END follows. When the second data source executes SQL, it checks whether it is using the same MySQL resource. If it is, the xid that was just generated is reused to issue XA START ... RESUME, followed again by XA END. So although the application layer has two DataSources, XA COMMIT is called only once. The start method of the XAResource implementation in the MySQL driver is as follows:

    public void start(Xid xid, int flags) throws XAException {
        StringBuilder commandBuf = new StringBuilder(MAX_COMMAND_LENGTH);
        commandBuf.append("XA START ");
        appendXid(commandBuf, xid);

        switch (flags) {
            case TMJOIN:
                commandBuf.append(" JOIN");
                break;
            case TMRESUME:
                commandBuf.append(" RESUME");
                break;
            case TMNOFLAGS:
                // no-op
                break;
            default:
                throw new XAException(XAException.XAER_INVAL);
        }
        dispatchCommand(commandBuf.toString());
        this.underlyingConnection.setInGlobalTx(true);
    }

For the first SQL statement, flags=0, so the TMNOFLAGS branch is taken. For the second, flags=134217728 (TMRESUME), so the transaction is resumed with XA START ... RESUME. This is the actual transaction logic of MySQL XA. However, I found that MySQL XA does not really support XA START RESUME and has many other limitations (see "MySQL XA transaction restrictions"), so it is best to understand the defects of MySQL XA before using XA transactions on a MySQL database.
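The flag values quoted above are the constants defined on javax.transaction.xa.XAResource. The following small sketch prints them and rebuilds the XA START statement with the same switch logic as the driver's start method. XaStartFlags and startCommand are hypothetical names used only for this illustration, and the xid text is a made-up placeholder rather than what the driver actually sends.

```java
import javax.transaction.xa.XAResource;

final class XaStartFlags {

    /** Maps a JTA flag to the XA START statement, following the same switch as the driver code above. */
    static String startCommand(String xidText, int flags) {
        StringBuilder command = new StringBuilder("XA START ").append(xidText);
        switch (flags) {
            case XAResource.TMJOIN:
                command.append(" JOIN");
                break;
            case XAResource.TMRESUME:
                command.append(" RESUME");
                break;
            case XAResource.TMNOFLAGS:
                break; // first statement of the branch: plain XA START
            default:
                throw new IllegalArgumentException("Unsupported flags: " + flags);
        }
        return command.toString();
    }

    public static void main(String[] args) {
        String xid = "'demo-gtrid','demo-bqual',1"; // hypothetical xid text, for illustration only
        System.out.println(XAResource.TMNOFLAGS + " -> " + startCommand(xid, XAResource.TMNOFLAGS));
        System.out.println(XAResource.TMRESUME + " -> " + startCommand(xid, XAResource.TMRESUME));
    }
}
```

TMRESUME is 0x08000000, which is the decimal 134217728 seen when the second statement runs.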

Chained transaction scheme

Chained transaction is not a term I coined. The transaction package of the spring-data-commons project contains a default implementation, ChainedTransactionManager. A previous article, "In depth understanding the working principle of @Transactional in spring", analyzed Spring's transaction abstraction, which consists of PlatformTransactionManager, TransactionStatus, and TransactionDefinition. The chained implementation provides its own PlatformTransactionManager and TransactionStatus implementations. The principle is simple: it maintains a collection of real transaction managers and delegates to them, so that when a transaction is started, committed, or rolled back, the same operation is applied to each transaction manager in the collection. This achieves unified management of multiple transactions. The scheme is simple but flawed: during the commit phase, if an exception occurs on any data source other than the first one to commit, the commits that have already succeeded cannot be rolled back. Therefore, when using ChainedTransactionManager, put the transaction manager most likely to fail at the end of the chain, because transactions are started in chain order and committed in reverse order, so the last manager commits first. This is offered only as an alternative approach to multi data source transaction management; prefer XA whenever possible.
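The ordering and the flaw can be illustrated with a small self-contained sketch. This is not Spring's ChainedTransactionManager: FakeTxManager and ChainedTx are hypothetical classes written for this example, and they only record calls in a log. The sketch assumes the behavior described above, namely begin in chain order, commit in reverse order, and roll back whatever has not yet committed once a commit fails.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChainedTxSketch {

    /** Hypothetical stand-in for a single-data-source transaction manager. */
    static final class FakeTxManager {
        private final String name;
        private final boolean failOnCommit;
        private final List<String> log;

        FakeTxManager(String name, boolean failOnCommit, List<String> log) {
            this.name = name;
            this.failOnCommit = failOnCommit;
            this.log = log;
        }

        void begin() { log.add(name + ".begin"); }

        void commit() {
            if (failOnCommit) {
                log.add(name + ".commit FAILED");
                throw new IllegalStateException(name + " could not commit");
            }
            log.add(name + ".commit");
        }

        void rollback() { log.add(name + ".rollback"); }
    }

    /** Begins in chain order and commits in reverse order. */
    static final class ChainedTx {
        private final List<FakeTxManager> chain;

        ChainedTx(FakeTxManager... managers) { this.chain = Arrays.asList(managers); }

        void begin() {
            for (FakeTxManager m : chain) {
                m.begin();
            }
        }

        void commit() {
            RuntimeException failure = null;
            for (int i = chain.size() - 1; i >= 0; i--) {
                FakeTxManager m = chain.get(i);
                if (failure == null) {
                    try {
                        m.commit();
                    } catch (RuntimeException e) {
                        failure = e; // managers that already committed cannot be undone
                    }
                } else {
                    m.rollback(); // not yet committed, so it can still be rolled back
                }
            }
            if (failure != null) {
                throw failure;
            }
        }
    }

    /** Runs a two-manager chain and returns the recorded call sequence. */
    static List<String> run(boolean firstFails, boolean lastFails) {
        List<String> log = new ArrayList<>();
        ChainedTx tx = new ChainedTx(
                new FakeTxManager("first", firstFails, log),
                new FakeTxManager("last", lastFails, log));
        tx.begin();
        try {
            tx.commit();
        } catch (RuntimeException e) {
            log.add("chain failed");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false, true));  // last in chain fails: nothing is committed
        System.out.println(run(true, false));  // first in chain fails: "last" is already committed
    }
}
```

When the manager at the end of the chain fails, it fails before anything else has committed, so the other manager can still roll back. When the manager at the head of the chain fails, the one behind it has already committed, which is the inconsistent outcome the scheme cannot prevent.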

The default business data source configuration is as follows:

/**
 * @author: kl @kailing.pub
 * @date: 2020/5/18
 */
@Configuration
@EnableConfigurationProperties({JpaProperties.class, DataSourceProperties.class})
public class DataSourceConfiguration{

    @Primary
    @Bean
    public DataSource dataSource(DataSourceProperties dataSourceProperties){
       return dataSourceProperties.initializeDataSourceBuilder().type(HikariDataSource.class).build();
    }

    @Primary
    @Bean(initMethod = "afterPropertiesSet")
    public LocalContainerEntityManagerFactoryBean entityManagerFactory(JpaProperties jpaProperties, DataSource dataSource, EntityManagerFactoryBuilder factoryBuilder) {
        return factoryBuilder.dataSource(dataSource)
                .packages(Constants.BASE_PACKAGES)
                .properties(jpaProperties.getProperties())
                .persistenceUnit("default")
                .build();
    }

    @Bean
    @Primary
    public EntityManager entityManager(EntityManagerFactory entityManagerFactory){
        // The shared EntityManager must be created with SharedEntityManagerCreator; otherwise transactions in SimpleJpaRepository will not take effect
        return SharedEntityManagerCreator.createSharedEntityManager(entityManagerFactory);
    }

    @Primary
    @Bean
    public PlatformTransactionManager transactionManager(EntityManagerFactory entityManagerFactory){
        JpaTransactionManager txManager = new JpaTransactionManager();
        txManager.setEntityManagerFactory(entityManagerFactory);
        return txManager;
    }
}

The Sharding-JDBC encrypted data source configuration is as follows:

/**
 * @author: kl @kailing.pub
 * @date: 2020/5/18
 */
@Configuration
@EnableConfigurationProperties({JpaProperties.class,SpringBootEncryptRuleConfigurationProperties.class, SpringBootPropertiesConfigurationProperties.class})
public class EncryptDataSourceConfiguration {

    @Bean
    public DataSource encryptDataSource(DataSource dataSource,SpringBootPropertiesConfigurationProperties props,SpringBootEncryptRuleConfigurationProperties encryptRule) throws SQLException {
        return EncryptDataSourceFactory.createDataSource(dataSource, new EncryptRuleConfigurationYamlSwapper().swap(encryptRule), props.getProps());
    }

    @Bean(initMethod = "afterPropertiesSet")
    public LocalContainerEntityManagerFactoryBean encryptEntityManagerFactory(@Qualifier("encryptDataSource") DataSource dataSource,JpaProperties jpaProperties, EntityManagerFactoryBuilder factoryBuilder) throws SQLException {
        return factoryBuilder.dataSource(dataSource)
                .packages(Constants.BASE_PACKAGES)
                .properties(jpaProperties.getProperties())
                .persistenceUnit("encryptPersistenceUnit")
                .build();
    }

    @Bean
    public EntityManager encryptEntityManager(@Qualifier("encryptEntityManagerFactory") EntityManagerFactory entityManagerFactory){
        // The shared EntityManager must be created with SharedEntityManagerCreator; otherwise transactions in SimpleJpaRepository will not take effect
        return SharedEntityManagerCreator.createSharedEntityManager(entityManagerFactory);
    }

    @Bean
    public PlatformTransactionManager chainedTransactionManager(PlatformTransactionManager transactionManager,
            @Qualifier("encryptEntityManagerFactory") EntityManagerFactory encryptEntityManagerFactory) {
        // The encrypted EntityManagerFactory is injected as a parameter rather than obtained by calling the bean method
        JpaTransactionManager encryptTransactionManager = new JpaTransactionManager();
        encryptTransactionManager.setEntityManagerFactory(encryptEntityManagerFactory);
        // Wrap both real transaction managers in a chained transaction manager
        return new ChainedTransactionManager(encryptTransactionManager, transactionManager);
    }
}


With this scheme, business code that spans multiple data sources needs to specify the chained transaction manager, for example:

    @PersistenceContext(unitName = "encryptPersistenceUnit")
    private EntityManager entityManager;

    @PersistenceContext
    private EntityManager manager;

    @Transactional(transactionManager = "chainedTransactionManager")
    public AccountModel save(AccountDTO dto) {
        // Persist through the encrypted persistence unit
        AccountModel accountModel = AccountMapper.INSTANCE.dtoTo(dto);
        entityManager.persist(accountModel);
        entityManager.flush();

        // Persist through the default persistence unit
        AccountModel accountModel2 = AccountMapper.INSTANCE.dtoTo(dto);
        manager.persist(accountModel2);
        manager.flush();

        return accountModel;
    }

Epilogue

To sum up, for multi data source distributed transactions in JPA, the JTA transaction manager can be used out of the box thanks to Spring Boot's encapsulation; the key point in a JPA environment is to set the transaction type of each EntityManagerFactory to JTA. This article also shared a chained transaction approach that works in the same scenario, although it cannot guarantee transactional integrity in certain failure cases. I therefore recommend the JTA transaction manager, and suggest trying the chained transaction manager only when the scenario suits it.

About the author:

Chen Kailing joined Kaijing technology in May 2016. He is currently the architecture group manager and fire fighting team leader at the Kaijing R&D center, and the author of the independent KL blog (http://www.kailing.pub).

Tags: Programming MySQL Spring JDBC SQL

Posted on Thu, 11 Jun 2020 00:37:28 -0400 by ceruleansin