RLS based microservice Tenant Isolation solution

In the current software design, the solution of microservices is becoming more and more popular. Each microservice is responsible for a specific business. A large number of microservices cooperate and communicate with each other to form a complete system. Each microservice usually has its own business data and corresponding database. Generally, there are two design schemes for the data storage of each company (or tenant) in the database:

  1. Each tenant has its own Schema. Each Schema has a set of the same tables related to this business.

2. All tenants share a Schema. The data of each tenant is stored in the same table, and then each business table has a column (usually called TenantID) to indicate that the current row data belongs to that tenant.

The advantage of the first scheme is that the data between tenants is completely isolated. Users of a company can only access the data in their own database. The implementation principle is usually to obtain the Connection of the corresponding database according to the current user's company, which can only operate on their own company's Schema. The disadvantage is that the database will consume a lot of resources, and it is not easy to unify the upgrade management. The database needs to be upgraded one by one, and the maintenance manpower will be relatively large. Usually, this solution is suitable for large companies with large amount of data and high requirements for data security.

The advantage of the second scheme is relative to the disadvantage of the first scheme, that is, the database is upgraded and maintained uniformly. If a large number of tenants are small companies with small amount of data, it is undoubtedly very appropriate to put their data in the same table. The disadvantage is also obvious, that is, the same database connection is used. If no additional processing is done in the application layer, theoretically, the login user can access the data of different tenants.

The first scheme is the traditional tenant data storage scheme. Before the popularity of microservices, it widely existed in various enterprise oriented application software. The second scheme becomes more and more popular with the popularity of microservices, because each microservice is usually in the charge of a small team, and they usually don't have enough manpower to upgrade and maintain a set of Schema for each customer company.

The question is, if a multi tenant data storage scheme such as Share Schema is used, how to ensure the security of data access?

One way is to control data access in the application layer. In the design of popular spring boot based microservices in the industry, two database access frameworks are usually used. One is Mybatis, which is popular in domestic Internet companies. The other is JPA, which is used more abroad. As a standard, JPA has two popular implementations, one is hibernate and the other is eclipse link.

Unfortunately, mybatis itself does not provide a framework level solution for the multi tenant database of Share Schema. In other words, each developer must ensure that the sql statement he writes contains the filtering of the current company's Tenant ID, which has high requirements for developers. Of course, there are also some open source projects that add support for Share Schema on the basis of mybatis, such as mybatis plus( https://mp.baomidou.com/guide/tenant.html )However, it seems that he has more limitations.

JPA's standard includes support for database tables of Share Schema. It defines the column storing TenantID as Tenant Discriminator. The implementation of JPA needs to put the Tenant Discriminator in the final generated SQL statement, so as to ensure that different users use the same JQL and can only access the data of their own company. Unfortunately, only eclipse link currently has good support for this specification, and Hibernate, the most popular version, can not support it until version 7.0( https://hibernate.atlassian.net/browse/HHH-6054).

In this way, it seems that tenant data isolation in the database access layer of the application has become a difficult task.

Fortunately, many databases (except mysql unfortunately) provide a new data access control, RLS (Row Level Security) to be introduced in this article. Based on RLS, microservices can easily realize tenant data isolation.

Taking PostgreSQL as an example, let's assume that there is a product table under the public schema. For simplicity, this table only displays ID, NAME and TENANT_ID, creation time and update time.

-- create table
CREATE TABLE IF NOT EXISTS PRODUCT(
    ID UUID  PRIMARY KEY,
    NAME VARCHAR(256) UNIQUE NOT NULL,
    TENANT_ID VARCHAR(256) NOT NULL,
    CREATED_DATETIME TIMESTAMP NOT NULL,
    UPDATED_DATETIME TIMESTAMP NOT NULL
);

Then we open the RLS function of this table:

-- ALTER TABLE to enable row security
ALTER TABLE PRODUCT ENABLE ROW LEVEL SECURITY;

Create a product that controls access to the product table_ Policy, Product_ Policy limits only tenant_ The data of the ID column and the current session variable myapp.current_ Only those rows with the same tenant value can be accessed.

-- create policy
CREATE POLICY product_policy ON product
	FOR ALL
  USING (current_setting('myapp.current_tenant') = tenant_id)

Of course, we also need to create an ordinary user dev and grant him the right to access this table:

-- create normal user;
CREATE USER dev NOSUPERUSER PASSWORD 'dev';

-- grant privileges
GRANT ALL ON SCHEMA PUBLIC TO dev;
GRANT ALL ON TABLE PRODUCT TO dev;

In this way, the RLS function of the database has been opened and ready. The next step is to do in the application.

As we all know, in SpringBoot, if you need to access a database, you need to configure the database connection information in the application configuration file, and then spring boot will automatically configure us with an instance of DataSource and obtain the connection to the database from this instance.

#application.properties
spring.datasource.url=jdbc:postgresql://localhost:5432/testdb
spring.datasource.username=dev
spring.datasource.password=dev

We were defining product before_ Policy, we see that we refer to a myapp.current_ session variable of tenant. So when was this variable defined? The answer is every time you start a transaction. No matter the @ Transactional annotation or the Transactional Template of the Spring transaction management, they will call the getConnection method of the datasource instance, so we can customize a datasource. This datasource is a proxy of the datasource actually used, but it overrides the getConnection method, Before this method returns connection, set myapp.current_tenant gives the value of tenant to which the currently logged in user belongs.

// MultiTenantDataSource.java

import javax.sql.DataSource;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.logging.Logger;

public class MultiTenantDataSource implements DataSource {
    private static final LoggingUtil logger = LoggingUtilFactory.getLogger(MultiTenantDataSource.class);

    private final DataSource delegate;

    private final TenantIDProvider tenantIDProvider;

    public MultiTenantDataSource(DataSource delegate, TenantIDProvider tenantIDProvider) {
        this.delegate = delegate;
        this.tenantIDProvider = tenantIDProvider;
    }

    @Override
    public Connection getConnection() throws SQLException {
        Connection connection = delegate.getConnection();
        enableTenantIsolation(connection);
        return connection;
    }

    @Override
    public Connection getConnection(String username, String password) throws SQLException {
        Connection connection = delegate.getConnection(username, password);
        enableTenantIsolation(connection);
        return connection;
    }

    @Override
    public <T> T unwrap(Class<T> iface) throws SQLException {
        return delegate.unwrap(iface);
    }

    @Override
    public boolean isWrapperFor(Class<?> iface) throws SQLException {
        return delegate.isWrapperFor(iface);
    }

    //The implementation of other Datasource methods is omitted here. They are basically the same as unwrap and iswrapperfor. They are all handled by delegate

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        return delegate.getParentLogger();
    }

    private void enableTenantIsolation(Connection connection) throws SQLException {
        String tenantID = tenantIDProvider.getTenantID();
        boolean originalAutoCommit = connection.getAutoCommit();
        try (
                PreparedStatement setTenantIDStatement = connection.prepareStatement("select set_config('myapp.current_tenant', ?, FALSE)")
        ) {
            logger.debug("MultiTenantDataSource::enableTenantIsolation", String.format("set current tenant to %s", tenantID));
            connection.setAutoCommit(false);
            setTenantIDStatement.setString(1, tenantID);
            setTenantIDStatement.execute();
            connection.commit();
        } catch (SQLException e) {
            connection.rollback();
            throw e;
        } finally {
            connection.setAutoCommit(originalAutoCommit);
        }
    }

}

We can see that our MultiTenantDataSource implements the DataSource interface. Before the two getConnection methods return, we call one Second hand mobile game account purchase EnableTenantIsolation method. This method only does one thing, that is, it obtains tenantID from the context of the current application through set_ The config statement assigns this tenantID to myapp.current_tenantSession variable.

Finally, we define an instance of this DataSource in SpringBoot:

//MultiTenantDataSourceConfiguration.java


import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.annotation.EnableTransactionManagement;

import javax.sql.DataSource;

@Configuration
@EnableTransactionManagement
public class MultiTenantDataSourceConfiguration {
    private static final LoggingUtil logger = LoggingUtilFactory.getLogger(MultiTenantDataSourceConfiguration.class);

    @Bean(name = "multiTenantDataSource")
    public DataSource multiTenantDataSource(
            @Value("${spring.datasource.url}") String jdbcUrl,
            @Value("${spring.datasource.driver-class-name}") String driver,
            @Value("${spring.datasource.username}") String userName,
            @Value("${spring.datasource.password}") String password,
            TenantIDProvider tenantIDProvider
    ) {
        try {
            org.apache.tomcat.jdbc.pool.DataSource pooledDataSource = new org.apache.tomcat.jdbc.pool.DataSource();
            pooledDataSource.setDriverClassName(driver);
            pooledDataSource.setUsername(userName);
            pooledDataSource.setPassword(password);
            pooledDataSource.setUrl(jdbcUrl);
            pooledDataSource.setInitialSize(5);
            pooledDataSource.setMaxActive(100);
            pooledDataSource.setMaxIdle(5);
            pooledDataSource.setMinIdle(2);

            MultiTenantDataSource multiTenantDataSource = new MultiTenantDataSource(pooledDataSource, tenantIDProvider);
            return multiTenantDataSource;
        } catch (Exception e) {
            logger.error("JdbcConfiguration::multiTenantDataSource",
                    "Cannot create the multiTenantDataSource due to " + e.getMessage());
            throw new MyAppException(e);
        }
    }

}

OK, now everything is ready. When the transaction is enabled, SpringBoot will call the getConnection method of our MultiTenantDataSource, and then set myapp.current in the current session_ The value of tenant. Applications, whether using Mybatis or JPA, do not need to explicitly specify tenant in sql statements or JPQL_ Filter criteria for ID column. All the data that each login user can operate must be the data of the company to which the current user belongs.

to update:

For the first time, I published an article on Zhihu, which was written in a hurry, and I had nothing to do at the time of the epidemic. I summarized my previous study and found that the number of people who collected was twice as much as that who liked: -). Thank you for your recognition here. I also urge you to praise and support originality when collecting. Thank you.

Tags: Java Spring Boot Microservices

Posted on Sun, 31 Oct 2021 03:48:44 -0400 by Dima